A Simple Scraper for Sina News


License: CC 4.0 BY-SA

Original article: http://www.cnblogs.com/zhangmingzhao/p/7256638.html

Tags: #python

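You need a Chrome extension, InfoLite: https://chrome.google.com/webstore/detail/infolite/ipjbadabbpedegielkhgpiekdlmfpgal. Once it is installed and enabled, it shows the class name of each block on the page, and those class names are what the scraper uses as CSS selectors.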
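To show how a class name found this way turns into a selector, here is a minimal offline sketch. The HTML snippet, the `extract_items` helper, and the example.com URL are all made up for illustration; they are assumptions and not taken from the real Sina page, whose markup may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that mimics a news list; not copied from the real page.
SAMPLE_HTML = """
<div class="news-item">
  <h2><a href="http://example.com/a.html">First headline</a></h2>
  <div class="time">Jul 29 10:00</div>
</div>
<div class="news-item"></div>
"""

def extract_items(html):
    """Return (time, title, link) tuples for every .news-item that has an <h2>."""
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for block in soup.select('.news-item'):
        heading = block.select_one('h2')
        if heading is None:
            # Empty placeholder blocks carry no headline, so skip them
            continue
        time_tag = block.select_one('.time')
        link_tag = block.select_one('a')
        items.append((
            time_tag.get_text(strip=True) if time_tag else '',
            heading.get_text(strip=True),
            link_tag.get('href', '') if link_tag else '',
        ))
    return items

items = extract_items(SAMPLE_HTML)
print(items)
```

Using `select_one` and checking for `None` avoids an `IndexError` when a block lacks one of the expected elements, which matters once the same selectors are pointed at a live page.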

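import requests
from bs4 import BeautifulSoup

# Fetch the Sina domestic news listing page
res = requests.get('http://news.sina.com.cn/china/', timeout=10)
res.encoding = 'utf-8'  # decode the response body as UTF-8
soup = BeautifulSoup(res.text, 'html.parser')

# Each news entry sits in a block with the class "news-item"
for news in soup.select('.news-item'):
    # Skip blocks that have no headline
    if len(news.select('h2')) > 0:
        pub_time = news.select('.time')[0].text  # renamed from "time" to avoid shadowing the stdlib module name
        title = news.select('h2')[0].text
        link = news.select('a')[0]['href']
        print(pub_time, title, link)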

Result: