Scraping Web Page Data with Python (1)

Original post, published 2024-12-09 22:26:27 · 627 views

License: CC 4.0 BY-SA

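import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin

url = 'https://so.gushiwen.cn/gushi/tangshi.aspx'
# A timeout keeps the script from hanging if the server does not respond.
r = requests.get(url, timeout=10)
r.encoding = 'utf-8'
print(r.status_code)
# Stop on an HTTP error instead of parsing an error page.
r.raise_for_status()
html = r.text

soup = BeautifulSoup(html, 'html.parser')

# Select only <a> tags that carry an href attribute;
# indexing a tag without one would raise KeyError.
poems = soup.find_all('a', href=True)
for poem in poems:
    # urljoin resolves relative and absolute hrefs against the page URL.
    print(urljoin(url, poem['href']), poem.text.strip())

# Number of links printed.
print(len(poems))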

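This script uses the BeautifulSoup library to scrape data from a web page: it prints the URL and text of every link on the page, followed by the total number of links.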
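For trying out the link-extraction step without making a network request or installing anything, the following is a minimal sketch that swaps BeautifulSoup for the standard library's `html.parser.HTMLParser`. It is not the approach used in the script above, only an offline stand-in for the same idea. `SAMPLE_HTML`, `LinkCollector`, and `extract_links` are hypothetical names introduced here, and the sample markup is made up rather than copied from the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = 'https://so.gushiwen.cn/'

# Hypothetical sample markup (an assumption), not the real page content.
SAMPLE_HTML = '''
<div>
  <strong>Section</strong>
  <a href="/shiwenv_1.aspx">Poem A</a>
  <a name="top">No link</a>
  <a href="https://example.com/x">Poem B</a>
</div>
'''


class LinkCollector(HTMLParser):
    """Collects (absolute_url, link_text) pairs for <a> tags with an href."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._href = None   # href of the <a> currently being read, if any
        self._text = []     # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == 'a':
            href = dict(attrs).get('href')
            if href:  # skip anchors that have no href attribute
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == 'a' and self._href is not None:
            self.links.append((self._href, ''.join(self._text).strip()))
            self._href = None


def extract_links(html, base_url=BASE_URL):
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


links = extract_links(SAMPLE_HTML)
for href, text in links:
    print(href, text)
print(len(links))
```

The anchor without an `href` is skipped, which mirrors the `href=True` filter in BeautifulSoup's `find_all`, and `urljoin` leaves already-absolute URLs untouched while resolving root-relative ones against the base.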