Scraping Web Page Data in Python with BeautifulSoup


Mtkgys


Article link: https://blog.youkuaiyun.com/wh510856826/article/details/115631896


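This post shows how to use the BeautifulSoup library to scrape historical lottery data from a given URL, including the draw number, the red-ball numbers, and the blue-ball number, and how to store that data in a database. The rows are written with a bulk insert rather than one save per row, which is more efficient when there is a lot of data.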


Using BeautifulSoup, with scraping past lottery results as the example

Importing the modules

Install the modules:  pip install bs4 requests

Import:  from bs4 import BeautifulSoup
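As a quick orientation before the full example, the sketch below shows the two BeautifulSoup calls this post relies on: `find` returns the first matching tag, and `find_all` returns every match. The HTML string is made up for illustration; only the class names mirror the ones used in the scraping code.

```python
from bs4 import BeautifulSoup

# Made-up HTML for illustration; not taken from the real page
sample = """
<table>
  <tr class="t_tr1"><td>21001</td><td class="t_cfont2">03</td><td class="t_cfont2">11</td></tr>
  <tr class="t_tr1"><td>21002</td><td class="t_cfont2">07</td><td class="t_cfont2">19</td></tr>
</table>
"""

soup = BeautifulSoup(sample, 'html.parser')

first_row = soup.find('tr', class_='t_tr1')      # first matching tag only
all_rows = soup.find_all('tr', class_='t_tr1')   # list-like result with every match

first_cell = first_row.find('td').get_text()     # text of the first td in that row
red_cells = [td.get_text() for td in first_row.find_all('td', class_='t_cfont2')]
```

Note that `class_` has a trailing underscore because `class` is a reserved word in Python.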


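import requests
from bs4 import BeautifulSoup
# Haoma is a Django model with the fields qihao, number and lan;
# import it from your own app's models module.

# Fetch the page content
data = requests.get('http://datachart.500.com/ssq/history/newinc/history.php?limit=200&sort=0')
# Create the parser object
html = BeautifulSoup(data.text, 'html.parser')
rows = []  # renamed from "list" so the built-in is not shadowed
# find_all returns every tag that matches the conditions
for tag in html.find_all('tr', class_='t_tr1'):
    qihao = tag.find('td').get_text()  # draw number: the first td in the row
    # Collect all red-ball numbers and join them into one comma-separated string
    honqiu = ','.join(haoma.get_text() for haoma in tag.find_all('td', class_='t_cfont2'))
    lanqiu = tag.find('td', class_='t_cfont4').get_text()  # blue ball
    rows.append(Haoma(qihao=qihao, number=honqiu, lan=lanqiu))
    # Alternative: insert one row per iteration instead of in bulk
    # Haoma(qihao=qihao, number=honqiu, lan=lanqiu).save()
# Bulk insert: bulk_create takes the list of model instances
Haoma.objects.bulk_create(rows)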
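
To try the parsing and batch-insert logic without a network connection or a Django project, here is a minimal sketch under a few assumptions: the HTML in `SAMPLE_HTML` is invented and shortened (only the class names match the code above), `parse_rows` is a hypothetical helper, and the standard-library `sqlite3` module with `executemany` stands in for Django's `bulk_create`. It illustrates the idea rather than reproducing the original setup.

```python
import sqlite3
from bs4 import BeautifulSoup

# Invented, shortened sample; the real page layout may differ
SAMPLE_HTML = """
<table>
  <tr class="t_tr1">
    <td>21001</td>
    <td class="t_cfont2">03</td><td class="t_cfont2">11</td><td class="t_cfont2">25</td>
    <td class="t_cfont4">09</td>
  </tr>
  <tr class="t_tr1">
    <td>21002</td>
    <td class="t_cfont2">07</td><td class="t_cfont2">19</td><td class="t_cfont2">30</td>
    <td class="t_cfont4">12</td>
  </tr>
</table>
"""


def parse_rows(html_text):
    """Hypothetical helper: return (draw number, red balls, blue ball) tuples."""
    soup = BeautifulSoup(html_text, 'html.parser')
    rows = []
    for tr in soup.find_all('tr', class_='t_tr1'):
        first_td = tr.find('td')
        blue_td = tr.find('td', class_='t_cfont4')
        if first_td is None or blue_td is None:
            continue  # skip rows that lack the expected cells
        reds = ','.join(td.get_text(strip=True)
                        for td in tr.find_all('td', class_='t_cfont2'))
        rows.append((first_td.get_text(strip=True), reds, blue_td.get_text(strip=True)))
    return rows


rows = parse_rows(SAMPLE_HTML)

# In-memory database standing in for the Django-backed table (assumption)
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE haoma (qihao TEXT, number TEXT, lan TEXT)')
# executemany runs the same INSERT statement for every tuple in rows
conn.executemany('INSERT INTO haoma (qihao, number, lan) VALUES (?, ?, ?)', rows)
conn.commit()

stored = conn.execute('SELECT qihao, number, lan FROM haoma ORDER BY qihao').fetchall()
```

In a Django project the same tuples could be turned into `Haoma` instances and passed to `bulk_create`, as in the code above. Keeping the parsing in its own function also makes it easy to check against a saved copy of the page before touching the database.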