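A disclaimer first: I'm a beginner too. I only got interested in web scraping last night and started reading about it. I found the tutorial on Zhihu and changed very little (one line was broken, so I fixed it).
Mostly I want to write down what I've understood. Show the Code: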
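import requests
from bs4 import BeautifulSoup

comments = []  # one dict per thread on the listing page

# Fetch the first page of the Tieba forum (kw is the URL-encoded forum name).
r = requests.get('http://tieba.baidu.com/f?kw=%E5%8D%8E%E5%8D%97%E5%86%9C%E4%B8%9A%E5%A4%A7%E5%AD%A6&fr=index&red_tag=y3160164477')
soup = BeautifulSoup(r.content, 'lxml')  # requires the lxml package

# Each thread is an <li>. The class strings here and below are kept exactly as
# in the original, stray spaces included; exact-string class matching is
# sensitive to that whitespace.
Tags = soup.find_all('li', attrs={
    "class": ' j_thread_list clearfix'})
for li in Tags:
    comment = {}
    # Author name
    b = li.find('span', attrs={
        "class": "frs-author-name-wrap"})
    comment["author"] = b.text.strip()
    # Thread title
    a = li.find('a', attrs={
        "class": "j_th_tit "})
    comment["title"] = a.text.strip()
    # Thread abstract (going by the class name), stored under the key "read"
    c = li.find('div', attrs={
        "class": "threadlist_abs threadlist_abs_onlyline "})
    comment["read"] = c.text.strip()
    d = li.find('span', attrs={
        "class":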