Python Practical Plan: Study Homework 1-2

License: CC 4.0 BY-SA

Tags: #python #html

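This post shows how to locate elements with the find and find_all functions when parsing HTML with the BeautifulSoup library in Python, alongside CSS selectors passed to select. The example demonstrates how, once a specific block has been found, a for loop can step into each block and run a further search on its children.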
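
The idea of finding a block first and then searching inside it can be sketched on a small, self-contained snippet. The markup below is hypothetical, written only for illustration and not copied from the homework page, and it uses the built-in "html.parser" instead of lxml so that nothing beyond bs4 is needed. It assumes filled stars carry exactly the classes "glyphicon glyphicon-star".

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only; not the homework page itself.
SAMPLE_HTML = """
<div class="ratings">
  <p class="pull-right">2 reviews</p>
  <p>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star-empty"></span>
  </p>
</div>
<div class="ratings">
  <p class="pull-right">7 reviews</p>
  <p>
    <span class="glyphicon glyphicon-star"></span>
    <span class="glyphicon glyphicon-star-empty"></span>
    <span class="glyphicon glyphicon-star-empty"></span>
  </p>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# find returns the first matching tag (or None); find_all returns every match.
first_block = soup.find("div", class_="ratings")
all_blocks = soup.find_all("div", class_="ratings")

star_counts = []
for block in all_blocks:
    # Search again, but only among the descendants of this block.
    # A class_ string containing a space is compared against the whole
    # class attribute, so "glyphicon glyphicon-star-empty" is not counted.
    stars = block.find_all("span", class_="glyphicon glyphicon-star")
    star_counts.append(len(stars))

first_review = first_block.find("p", class_="pull-right").get_text()
print(first_review, star_counts)
```

Calling find or find_all on a tag rather than on the whole soup limits the search to that tag's subtree, which is what makes a per-block star count possible.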

The code is as follows:

from bs4 import BeautifulSoup

# Local copy of the homework page
html_path = "/Users/reed/Documents/dev/Plan-for-combating/week1/1_2/1_2answer_of_homework/index.html"

with open(html_path, 'r') as wb_data:
    soup = BeautifulSoup(wb_data, "lxml")
    # CSS selectors for each product's image, name, price and review count
    images = soup.select("body > div > div > div.col-md-9 > div > div > div > img")
    names = soup.select("body > div > div > div.col-md-9 > div > div > div > div.caption > h4 > a")
    prices = soup.select("body > div > div > div.col-md-9 > div > div > div > div.caption > h4.pull-right")
    reviews = soup.select("body > div > div > div.col-md-9 > div > div > div > div.ratings > p.pull-right")
    # Each div.ratings block holds the star icons for one product
    stars_block = soup.find_all("div", class_="ratings")

# The five lists are parallel (one entry per product), so zip pairs them up
for name, image, price, review, star_block in zip(names, images, prices, reviews, stars_block):
    # Count the filled-star spans inside this product's ratings block
    stars_num = len(star_block.find_all("span", class_="glyphicon glyphicon-star"))
    print(name.get_text(), "\n    ", image.get('src'), "\n    ", price.get_text(), "\n    ", review.get_text(),
          "\n    ", stars_num, " stars\n")

Output:

/Library/Frameworks/Python.framework/Versions/3.5/bin/python3.5 /Users/reed/PycharmProjects/web01/web_parse2.py
EarPod 
     img/pic_0000_073a9256d9624c92a05dc680fc28865f.jpg 
     $24.99 
     65 reviews 
     5  stars

New Pocket 
     img/pic_0005_828148335519990171_c234285520ff.jpg 
     $64.99 
     12 reviews 
     4  stars

New sunglasses 
     img/pic_0006_949802399717918904_339a16e02268.jpg 
     $74.99 
     31 reviews 
     4  stars

Art Cup 
     img/pic_0008_975641865984412951_ade7a767cfc8.jpg 
     $84.99 
     6 reviews 
     3  stars

iphone gamepad 
     img/pic_0001_160243060888837960_1c3bcd26f5fe.jpg 
     $94.99 
     18 reviews 
     4  stars

Best Bed 
     img/pic_0002_556261037783915561_bf22b24b9e4e.jpg 
     $214.5 
     18 reviews 
     4  stars

iWatch 
     img/pic_0011_1032030741401174813_4e43d182fce7.jpg 
     $500 
     35 reviews 
     4  stars

Park tickets 
     img/pic_0010_1027323963916688311_09cc2d7648d9.jpg 
     $15.5 
     8 reviews 
     4  stars


Process finished with exit code 0
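
The script above builds five parallel lists and relies on zip to keep them aligned, which holds only while every product has all five pieces. A possible variation, sketched here under assumptions, is to find each product's container first and read every field from inside it. The markup and the "thumbnail" class name are hypothetical (the selectors in the script only show that each product sits in some div holding an img, a div.caption and a div.ratings), and "html.parser" is used instead of lxml to avoid an extra dependency.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for two product cards; the "thumbnail" class name
# is an assumption and may differ on the real homework page.
SAMPLE_HTML = """
<div class="col-md-9">
  <div class="thumbnail">
    <img src="img/a.jpg"/>
    <div class="caption">
      <h4 class="pull-right">$1.50</h4>
      <h4><a href="#">Item A</a></h4>
    </div>
    <div class="ratings">
      <p class="pull-right">3 reviews</p>
      <p><span class="glyphicon glyphicon-star"></span></p>
    </div>
  </div>
  <div class="thumbnail">
    <img src="img/b.jpg"/>
    <div class="caption">
      <h4 class="pull-right">$2.00</h4>
      <h4><a href="#">Item B</a></h4>
    </div>
    <div class="ratings">
      <p class="pull-right">9 reviews</p>
      <p>
        <span class="glyphicon glyphicon-star"></span>
        <span class="glyphicon glyphicon-star"></span>
      </p>
    </div>
  </div>
</div>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

products = []
for card in soup.find_all("div", class_="thumbnail"):
    # Every lookup below is scoped to this one card.
    ratings = card.find("div", class_="ratings")
    products.append({
        "name": card.find("div", class_="caption").find("a").get_text(),
        "image": card.find("img").get("src"),
        "price": card.find("h4", class_="pull-right").get_text(),
        "reviews": ratings.find("p", class_="pull-right").get_text(),
        "stars": len(ratings.find_all("span", class_="glyphicon glyphicon-star")),
    })

for product in products:
    print(product)
```

Because each field is read from its own card, a product that lacks one element would show up as an error on that card rather than silently shifting the values of the products after it.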