Reading an HTML5 book

I haven't used HTML for many years. I spent several hours going through an HTML5 book, and many technical details came back to me.

Many things in HTML5 were new to me, such as the paragraph, cite, and term tags.
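To make those tags concrete, here is a minimal sketch in Python that lists the elements found in a small fragment, using the standard library's html.parser module. The fragment and the names TagCollector and collect_tags are made up for this note, and I am assuming that "term tag" means the dfn element.

```python
from html.parser import HTMLParser

# Hypothetical fragment written for this note.
# Assumption: the "term tag" is <dfn>.
FRAGMENT = (
    "<p>The <dfn>DOM</dfn> is described in "
    "<cite>HTML: The Living Standard</cite>.</p>"
)


class TagCollector(HTMLParser):
    """Record the name of every start tag in the order it appears."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lowercase.
        self.tags.append(tag)


def collect_tags(html):
    """Return the start-tag names found in an HTML string."""
    parser = TagCollector()
    parser.feed(html)
    parser.close()
    return parser.tags


print(collect_tags(FRAGMENT))
```

Even this three-element fragment uses three different tags, and the sketch only names them; it does nothing about what each one means or how it should be rendered.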

It suddenly struck me that this is why we cannot develop a web browser: there are too many things to interpret, every tag, every new standard...
