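BeautifulSoup is a very popular Python module. It makes it easy to parse web pages fetched with the Requests library, turning the page source into a Soup document from which data can be filtered and extracted.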
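import requests
from bs4 import BeautifulSoup

# Send a browser-like User-Agent, since some sites reject requests without one
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/71.0.3573.0 Safari/537.36'
}
# Fetch the page, then parse the returned HTML with the lxml parser (requires the lxml package)
res = requests.get('https://www.baidu.com/', headers=headers)
soup = BeautifulSoup(res.text, 'lxml')
# Print the parsed Soup document with standard indentation
print(soup.prettify())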
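
Once a Soup document exists, filtering and extracting data might look roughly like the sketch below. The HTML string, the `nav` class, and the `intro` id are made up for illustration and stand in for a real page. The built-in `html.parser` is used here instead of `lxml` so that the example runs without the extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical sample HTML used in place of a live page
html = """
<html><head><title>Sample page</title></head>
<body>
<a class="nav" href="/news">News</a>
<a class="nav" href="/maps">Maps</a>
<p id="intro">Hello</p>
</body></html>
"""

# Parse the string into a Soup document with Python's built-in parser
soup = BeautifulSoup(html, 'html.parser')

# Text of the <title> tag
title = soup.title.string

# find_all filters tags by name and class; collect each link's text and href
links = [(a.get_text(), a['href']) for a in soup.find_all('a', class_='nav')]

# select_one accepts a CSS selector; here it matches the element with id="intro"
intro = soup.select_one('#intro').get_text()

print(title)
print(links)
print(intro)
```

`find_all` is convenient for filtering by tag name and attributes, while `select_one` and `select` take CSS selectors. Which one reads better depends on the structure of the page being scraped.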