Python Hands-On Plan: Homework 1-2

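This post shows how to use the find and find_all functions when parsing HTML with the BeautifulSoup library in Python. A worked example demonstrates how, once a specific block has been found, a for loop can step into its child blocks and search them further.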


The code is as follows:

from bs4 import BeautifulSoup

html_path = "/Users/reed/Documents/dev/Plan-for-combating/week1/1_2/1_2answer_of_homework/index.html"

# Parse the local HTML file, then collect each kind of element into its own list.
with open(html_path, 'r') as wb_data:
    soup = BeautifulSoup(wb_data, "lxml")
    images = soup.select("body > div > div > div.col-md-9 > div > div > div > img")
    names = soup.select("body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a")
    prices = soup.select("body > div > div > div.col-md-9 > div > div > div > div.caption > h4.pull-right")
    reviews = soup.select("body > div > div > div.col-md-9 > div > div > div > div.ratings > p.pull-right")
    # Each div.ratings block holds the star icons for one product.
    stars_block = soup.find_all("div", class_="ratings")

# zip pairs the lists by position, so this assumes every product has all five elements.
for name, image, price, review, star_block in zip(names, images, prices, reviews, stars_block):
    # Search inside this product's ratings block only; the class string is matched exactly.
    stars_num = len(star_block.find_all("span", class_="glyphicon glyphicon-star"))
    print(name.get_text(), "\n    ", image.get('src'), "\n    ", price.get_text(), "\n    ", review.get_text(),
          "\n    ", stars_num, " stars\n")

Output

/Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5 /Users/reed/PycharmProjects/web01/web_parse2.py
EarPod 
     img/pic_0000_073a9256d9624c92a05dc680fc28865f.jpg 
     $24.99 
     65 reviews 
     5  stars

New Pocket 
     img/pic_0005_828148335519990171_c234285520ff.jpg 
     $64.99 
     12 reviews 
     4  stars

New sunglasses 
     img/pic_0006_949802399717918904_339a16e02268.jpg 
     $74.99 
     31 reviews 
     4  stars

Art Cup 
     img/pic_0008_975641865984412951_ade7a767cfc8.jpg 
     $84.99 
     6 reviews 
     3  stars

iphone gamepad 
     img/pic_0001_160243060888837960_1c3bcd26f5fe.jpg 
     $94.99 
     18 reviews 
     4  stars

Best Bed 
     img/pic_0002_556261037783915561_bf22b24b9e4e.jpg 
     $214.5 
     18 reviews 
     4  stars

iWatch 
     img/pic_0011_1032030741401174813_4e43d182fce7.jpg 
     $500 
     35 reviews 
     4  stars

Park tickets 
     img/pic_0010_1027323963916688311_09cc2d7648d9.jpg 
     $15.5 
     8 reviews 
     4  stars


Process finished with exit code 0
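
The star count in the code above depends on how BeautifulSoup matches the class attribute, so it may help to see the options side by side. The sketch below is only an illustration: the SAMPLE markup is a hypothetical stand-in for one ratings block (the real index.html is not reproduced here), and it uses the built-in "html.parser" instead of "lxml" so that it runs without extra dependencies.

```python
from bs4 import BeautifulSoup

# Hypothetical markup imitating one ratings block; not copied from the real page.
SAMPLE = """
<div class="ratings">
  <p class="pull-right">6 reviews</p>
  <p>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star-empty"></span>
    <span class="glyphicon glyphicon-star-empty"></span>
  </p>
</div>
"""

soup = BeautifulSoup(SAMPLE, "html.parser")
block = soup.find("div", class_="ratings")

# 1. Exact match on the full class attribute string (the form used in the homework code).
exact = block.find_all("span", class_="glyphicon glyphicon-star")
# 2. Match on a single class name; any span that carries this class qualifies.
single = block.find_all("span", class_="glyphicon-star")
# 3. The same test written as a CSS selector.
css = block.select("span.glyphicon-star")

counts = (len(exact), len(single), len(css))
print(counts)
```

All three forms skip the glyphicon-star-empty spans here. The exact-string form would stop matching if the classes appeared in a different order or with an extra class, so the single-class or CSS form is arguably the more tolerant choice.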

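Summary: the find and find_all functions in BeautifulSoup are very handy. After using find_all to obtain the HTML of a specific block, you can use a for-in loop to step into that block and run find_all again on its children.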
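
The nested search described in the summary can be sketched as follows. This is a minimal illustration under stated assumptions, not the homework's actual page: the SAMPLE markup, the "thumbnail" wrapper class, and the parse_items helper are all hypothetical, and "html.parser" is used so the snippet runs without lxml. The values are borrowed from the output shown earlier.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: two product blocks with an assumed "thumbnail" wrapper class.
STAR = '<span class="glyphicon glyphicon-star"></span>'
EMPTY = '<span class="glyphicon glyphicon-star-empty"></span>'
SAMPLE = f"""
<div class="thumbnail">
  <img src="img/pic_0000_073a9256d9624c92a05dc680fc28865f.jpg">
  <div class="caption">
    <h4 class="pull-right">$24.99</h4>
    <h4><a href="#">EarPod</a></h4>
  </div>
  <div class="ratings">
    <p class="pull-right">65 reviews</p>
    <p>{STAR * 5}</p>
  </div>
</div>
<div class="thumbnail">
  <img src="img/pic_0008_975641865984412951_ade7a767cfc8.jpg">
  <div class="caption">
    <h4 class="pull-right">$84.99</h4>
    <h4><a href="#">Art Cup</a></h4>
  </div>
  <div class="ratings">
    <p class="pull-right">6 reviews</p>
    <p>{STAR * 3}{EMPTY * 2}</p>
  </div>
</div>
"""


def parse_items(html):
    """Find every product block first, then search inside each block."""
    soup = BeautifulSoup(html, "html.parser")
    items = []
    for block in soup.find_all("div", class_="thumbnail"):
        items.append({
            # find() returns the first match inside this block only.
            "name": block.find("a").get_text(strip=True),
            "image": block.find("img").get("src"),
            "price": block.find("h4", class_="pull-right").get_text(strip=True),
            "reviews": block.find("p", class_="pull-right").get_text(strip=True),
            # find_all() scoped to the block counts only this product's filled stars.
            "stars": len(block.find_all("span", class_="glyphicon-star")),
        })
    return items


items = parse_items(SAMPLE)
for item in items:
    print(item["name"], item["price"], item["reviews"], item["stars"], "stars")
```

Searching inside each block keeps a product's fields together, so a product with a missing element cannot shift the pairing the way separate lists joined by zip could.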
