Here are some examples of data scraping with Python:
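Example 1: Scraping weather forecast data
Use Python to scrape data from a weather forecast site, for example pulling the forecast for a given city from China Weather (中国天气网).
Python code: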
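import requests
from bs4 import BeautifulSoup
# Placeholder URL: replace your_city_code with a real city code.
url = 'https://weather.com/city/your_city_code'
# Download the page; the timeout keeps the script from hanging indefinitely.
response = requests.get(url, timeout=10)
# Stop early on an HTTP error status (4xx or 5xx).
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# The class names below are illustrative and must match the site's actual markup.
temperature = soup.find('span', class_='temperature-value').text
description = soup.find('p', class_='description').text
print(f"Temperature: {temperature}, description: {description}")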
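One thing the snippet above glosses over: `soup.find` returns `None` when nothing matches, so calling `.text` on the result raises `AttributeError` if the page layout differs from what you expected. A minimal sketch of a safer lookup follows. The helper name `text_or_default` and the sample HTML are made up for illustration and are not from any real site.

```python
from bs4 import BeautifulSoup

# Hypothetical HTML standing in for a downloaded page; the description
# paragraph is deliberately missing.
SAMPLE_HTML = '<div><span class="temperature-value">25°C</span></div>'


def text_or_default(soup, tag, css_class, default="N/A"):
    """Return the stripped text of the first matching element, or default."""
    element = soup.find(tag, class_=css_class)
    if element is None:
        return default
    return element.get_text(strip=True)


soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
temperature = text_or_default(soup, "span", "temperature-value")
description = text_or_default(soup, "p", "description")
print(f"Temperature: {temperature}, description: {description}")
```

The same helper would work for the other examples, since they all follow the find-then-read-text pattern.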
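Example 2: Scraping e-commerce product information
Scrape a product's name, price, reviews, and similar details from an e-commerce platform.
Python code: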
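import requests
from bs4 import BeautifulSoup
# Placeholder URL: replace it with a real product page.
product_url = 'https://www.ecommerce_site.com/product/12345'
response = requests.get(product_url, timeout=10)
# Stop early on an HTTP error status (4xx or 5xx).
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# The class names below are illustrative and must match the site's actual markup.
product_name = soup.find('h1', class_='product-title').text
price = soup.find('span', class_='price-value').text
# find_all returns every matching element, so reviews is a list of strings.
reviews = [review.text for review in soup.find_all('p', class_='review-text')]
print(f"Product name: {product_name}, price: {price}, reviews: {reviews}")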
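The scraped price is still a string, often with a currency symbol and thousands separators, so it usually needs cleaning before you can compare or sum it. Here is a rough sketch of that step. The function name `parse_price` and the input formats are assumptions; real sites may format prices differently (for example with `,` as the decimal mark).

```python
import re
from decimal import Decimal


def parse_price(price_text):
    """Extract the first number from a price string such as '¥1,299.00'.

    Assumes ',' is a thousands separator and '.' is the decimal point.
    Returns None when the text contains no number.
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", price_text)
    if match is None:
        return None
    return Decimal(match.group().replace(",", ""))


print(parse_price("¥1,299.00"))
print(parse_price("Sold out"))
```

`Decimal` is used instead of `float` here so that monetary amounts are not subject to binary rounding error.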
Example 3: Scraping news articles
Scrape an article's title, body, publication time, and similar fields from a news site.
Python code:
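import requests
from bs4 import BeautifulSoup
# Placeholder URL: replace it with a real article page.
news_url = 'https://news_site.com/article/6789'
response = requests.get(news_url, timeout=10)
# Stop early on an HTTP error status (4xx or 5xx).
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# The class names below are illustrative and must match the site's actual markup.
title = soup.find('h1', class_='article-title').text
content = soup.find('div', class_='article-content').text
publish_time = soup.find('span', class_='publish-time').text
print(f"Title: {title}, content: {content}, published: {publish_time}")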
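If you plan to sort or filter articles by date, it helps to turn the scraped publication time into a `datetime` object. The sketch below assumes the site prints timestamps like `2024-05-01 08:30`. That format, the constant `PUBLISH_TIME_FORMAT`, and the helper name are guesses for illustration only, so check the real page first.

```python
from datetime import datetime

# Assumed timestamp layout, e.g. "2024-05-01 08:30"; adjust to the real site.
PUBLISH_TIME_FORMAT = "%Y-%m-%d %H:%M"


def parse_publish_time(raw_text, fmt=PUBLISH_TIME_FORMAT):
    """Convert scraped timestamp text to a datetime, or None if it does not match."""
    try:
        return datetime.strptime(raw_text.strip(), fmt)
    except ValueError:
        return None


print(parse_publish_time("  2024-05-01 08:30\n"))
print(parse_publish_time("yesterday"))
```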
Example 4: Scraping stock data
Scrape a stock's price, percentage change, and similar figures from a financial site.
Python code:
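import requests
from bs4 import BeautifulSoup
# Placeholder URL: replace your_stock_code with a real ticker or stock code.
stock_url = 'https://finance_site.com/stock/your_stock_code'
response = requests.get(stock_url, timeout=10)
# Stop early on an HTTP error status (4xx or 5xx).
response.raise_for_status()
soup = BeautifulSoup(response.text, 'html.parser')
# The class names below are illustrative and must match the site's actual markup.
stock_price = soup.find('span', class_='stock-price-value').text
change_percentage = soup.find('span', class_='change-percentage').text
print(f"Stock price: {stock_price}, change: {change_percentage}")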
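The percentage change also arrives as text, typically with a sign and a trailing `%`. A small sketch of converting it to a number is shown below. The helper name `parse_percentage` and the sample inputs are assumptions, and some sites use other placeholders or symbols that this does not handle.

```python
def parse_percentage(text):
    """Convert text such as '+1.23%' or '-0.80%' to a float, or None on failure."""
    cleaned = text.strip().rstrip("%").replace(",", "")
    try:
        return float(cleaned)
    except ValueError:
        return None


print(parse_percentage("+1.23%"))
print(parse_percentage("--"))
```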
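These examples show the basics of scraping data with Python. In practice, comply with each site's terms of use and with applicable laws and regulations, and avoid scraping so aggressively that you put a burden on the site.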
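One way to act on that advice is to check the site's robots.txt before fetching and to pause between requests. The sketch below uses the standard library's `urllib.robotparser`. The robots.txt content, the example.com URLs, the one-second default delay, and the helper names are all made up for illustration. Note that robots.txt is only one signal and does not replace reading the site's terms of use.

```python
import time
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content. A real script would instead call
# parser.set_url("https://<site>/robots.txt") followed by parser.read().
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt):
    """Create a RobotFileParser from robots.txt text already in memory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, user_agent="*"):
    """Keep only the URLs that robots.txt permits for this user agent."""
    return [url for url in urls if parser.can_fetch(user_agent, url)]


def polite_iter(urls, delay_seconds=1.0):
    """Yield URLs one at a time, sleeping between consecutive items."""
    for index, url in enumerate(urls):
        if index > 0:
            time.sleep(delay_seconds)
        yield url


parser = build_parser(ROBOTS_TXT)
candidates = [
    "https://example.com/public/page",
    "https://example.com/private/data",
]
for url in polite_iter(allowed_urls(parser, candidates), delay_seconds=0.1):
    print("Would fetch:", url)
```

In a real scraper, the `print` call would be replaced by the `requests.get` call from the examples above.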