1: Finding "a" tags by class
<a class="pr10 fz14" href="/Html/site_dict.cn.html" title="海词词典" target="_blank">海词词典</a>
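# -*- coding: UTF-8 -*-
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://top.chinaz.com/all/index_2.html")
# Name the parser explicitly so bs4 does not have to guess one
bsObj = BeautifulSoup(html, "html.parser")

# Look up elements by CSS class, e.g.
# <a class="pr10 fz14" href="/Html/site_dict.cn.html" title="海词词典" target="_blank">海词词典</a>
nameList = bsObj.find_all("a", {"class": "pr10 fz14"})
for name in nameList:
    print(name.get_text())  # get_text() strips the tags and returns only the text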
Running the program prints:
海词词典
太平洋亲子网
携程旅行网
太平洋汽车网
蚂蜂窝
央视网
站长之家
39健康网
2345网址导航
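The same lookup can be tried offline, without fetching the live page. The sketch below is only an illustration: `SAMPLE_HTML` is a hypothetical snippet that imitates the markup shown above (the second and third links are invented), and it compares the dictionary form of the class filter with the `class_` keyword form, then reads the `href` attribute of each match.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating the <a> element shown above;
# it is not fetched from the real site.
SAMPLE_HTML = """
<div>
  <a class="pr10 fz14" href="/Html/site_dict.cn.html" title="海词词典" target="_blank">海词词典</a>
  <a class="pr10 fz14" href="/Html/site_example.html" title="Example Site" target="_blank">Example Site</a>
  <a class="other" href="/about.html">About</a>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Dictionary form, as used in the script above.
by_dict = soup.find_all("a", {"class": "pr10 fz14"})
# Keyword form; the trailing underscore is needed because "class" is a Python keyword.
by_keyword = soup.find_all("a", class_="pr10 fz14")

names = [a.get_text() for a in by_dict]   # visible link text
links = [a["href"] for a in by_keyword]   # value of the href attribute

print(names)
print(links)
```

Both forms match on the class attribute string "pr10 fz14", so the link with class "other" is left out of both results.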
2: Finding "strong" tags by class
<strong class="col-red02">31</strong>
# -*- coding: UTF-8 -*-
from urllib.request import urlopen
from bs4 import BeautifulSoup

html = urlopen("http://top.chinaz.com/all/index_2.html")
# Name the parser explicitly so bs4 does not have to guess one
bsObj = BeautifulSoup(html, "html.parser")

# Look up elements by CSS class, e.g.
# <strong class="col-red02">32</strong>
nameList = bsObj.find_all("strong", {"class": "col-red02"})
for name in nameList:
    print(name.get_text())  # get_text() strips the tags and returns only the text
Running the program prints:
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
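The two lookups can also be combined to pair each rank with its site name. The following is a minimal sketch under stated assumptions: `SAMPLE_HTML`, the `<ul>`/`<li>` layout, and the site names are all hypothetical, and the pairing relies on both tags appearing once per entry and in the same document order, which may not hold on the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical list markup; the layout of the real page is an assumption.
SAMPLE_HTML = """
<ul>
  <li><a class="pr10 fz14" href="/a.html">Site A</a><strong class="col-red02">31</strong></li>
  <li><a class="pr10 fz14" href="/b.html">Site B</a><strong class="col-red02">32</strong></li>
  <li><a class="pr10 fz14" href="/c.html">Site C</a><strong class="col-red02">33</strong></li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# Same class-based lookups as in the two scripts above.
names = [a.get_text(strip=True) for a in soup.find_all("a", {"class": "pr10 fz14"})]
ranks = [int(s.get_text(strip=True)) for s in soup.find_all("strong", {"class": "col-red02"})]

# zip pairs items by position, so this assumes one rank per name, in the same order.
ranking = list(zip(ranks, names))
for rank, name in ranking:
    print(rank, name)
```

If the two lists can differ in length on a real page, iterating over each entry's container element and looking up both tags inside it would be a safer design than zipping two page-wide lists.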