Scraping News Headlines with Python

Latest recommended article published on 2024-04-30 20:47:12

OYQ697



License: CC 4.0 BY-SA

Category: Simple Scraping. Tags: python, data scraping

Article link: https://blog.youkuaiyun.com/qq_41777527/article/details/80144728


1. This article uses PyCharm as the editor and scrapes page information from Sohu News.

2. The code is as follows:

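import requests
from bs4 import BeautifulSoup

# Request the Sohu News listing page; the timeout keeps the script from hanging forever
res = requests.get('http://www.sohu.com/c/8/1460', timeout=10)
# Stop early on an HTTP error instead of parsing an error page
res.raise_for_status()
# Set the encoding explicitly so that Chinese text is not garbled
res.encoding = 'utf-8'
soup = BeautifulSoup(res.text, 'html.parser')
# Scrape the page content.
# Inspect the page source to see where the titles are located; this article uses
# Sohu News as the example, where each title sits inside an element with class="news-box"
for news in soup.select('.news-box'):
    # Get the title element and the first link that has an href
    h4 = news.select_one('h4')
    a = news.select_one('a[href]')
    # Skip blocks that are missing a title or a link rather than crashing
    if h4 is None or a is None:
        continue
    # Print the title text and the link
    print(h4.text, a['href'])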
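
The loop above can also be wrapped in a function that works on an HTML string, which makes it easy to try out without hitting the site. The following is only a minimal sketch, not part of the original script: the function name extract_headlines, the BASE_URL constant, and the sample markup are all made up for illustration, and the sample only imitates the class="news-box" structure described above. It also assumes that some links may be relative or protocol-relative, so it resolves them with urljoin from the standard library.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumed base URL, used only to resolve relative or protocol-relative links.
BASE_URL = 'http://www.sohu.com/'


def extract_headlines(html, base_url=BASE_URL):
    """Return a list of (title, link) pairs found in .news-box blocks."""
    soup = BeautifulSoup(html, 'html.parser')
    results = []
    for news in soup.select('.news-box'):
        h4 = news.select_one('h4')
        a = news.select_one('a[href]')
        # Skip blocks that lack either a title or a link.
        if h4 is None or a is None:
            continue
        title = h4.get_text(strip=True)
        # Turn '//host/path' or '/path' into an absolute URL.
        link = urljoin(base_url, a['href'])
        results.append((title, link))
    return results


# Made-up sample markup that imitates the structure described above.
SAMPLE_HTML = """
<div class="news-box"><h4><a href="//www.sohu.com/a/1">First headline</a></h4></div>
<div class="news-box"><h4> Second headline </h4><a href="/a/2">more</a></div>
<div class="news-box"><p>no title here</p></div>
"""

for title, link in extract_headlines(SAMPLE_HTML):
    print(title, link)
```

To use it on the live page, pass res.text from the request above instead of SAMPLE_HTML. Returning a list rather than printing inside the loop keeps the parsing separate from whatever is done with the results afterwards.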