Python Crawler(4)Selenium

This post presents a simple Python script that uses Selenium to automate browser actions and scrape product links from the Walmart website. Driving the PhantomJS headless browser, the script simulates real user browsing and collects every product title link on the search results page.

>pip install selenium
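
A quick sanity check after the install is to print the package version (a minimal sketch; which version pip resolves depends on your environment, and newer releases drop PhantomJS support as noted further below):

import selenium
# Print the installed Selenium version so you know which API generation you have
print(selenium.__version__)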

Simple open_url.py
from selenium import webdriver

# Path to the PhantomJS binary installed on this machine
path2phantom = '/opt/phantomjs/bin/phantomjs'
browser = webdriver.PhantomJS(path2phantom)
browser.get('https://www.walmart.com/search/?grid=false&page=2&query=computer#searchProductResult')

# Each product title on the results page is rendered as <a class="product-title-link">
links = browser.find_elements_by_css_selector('a.product-title-link')

count = 0
for link in links:
    count = count + 1
    print str(count) + ' ' + link.get_attribute('href')

browser.quit()

>python open_url.py
1 https://www.walmart.com/ip/Dell-Inspiron-15-6-Laptop-AMD-A9-8GB-AMD-Radeon-R5-Graphics-1TB-HD-Red/54527141
2 https://www.walmart.com/ip/Dell-Inspiron-15-6-Laptop-AMD-A9-8GB-AMD-Radeon-R5-Graphics-1TB-HD-Red/54527141
3 https://www.walmart.com/ip/Refurbished-HP-15-f387wm-15-6-Laptop-Touchscreen-Windows-10-Home-AMD-Quad-Core-A8-7410-APU-Processor-4GB-RAM-500GB-Hard-Drive/54476821
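
PhantomJS has since been abandoned and its driver was removed from newer Selenium releases, so on Selenium 4.x the same scrape would normally run against headless Chrome instead. The sketch below is only an assumed port of open_url.py: it assumes Chrome and its driver are available on the machine and that Walmart still exposes results as a.product-title-link anchors.

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

options = Options()
options.add_argument('--headless')   # run Chrome without opening a window

browser = webdriver.Chrome(options=options)
browser.get('https://www.walmart.com/search/?grid=false&page=2&query=computer#searchProductResult')

# Wait up to 10 seconds for the JavaScript-rendered product links to appear
wait = WebDriverWait(browser, 10)
links = wait.until(EC.presence_of_all_elements_located((By.CSS_SELECTOR, 'a.product-title-link')))

for count, link in enumerate(links, start=1):
    print('{} {}'.format(count, link.get_attribute('href')))

browser.quit()

The explicit wait replaces the implicit "page is ready after get()" assumption in the PhantomJS version, which matters because the result grid is filled in by JavaScript after the initial page load.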


References:
http://www.marinamele.com/selenium-tutorial-web-scraping-with-selenium-and-python
https://gxnotes.com/article/52544.html
http://selenium-python.readthedocs.io/locating-elements.html
https://selenium-python.readthedocs.io/api.html#selenium.webdriver.remote.webdriver.WebDriver.find_element_by_class_name