Installing selenium + chromedriver and basic usage

License: CC 4.0 BY-SA

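This article explains how to install selenium and chromedriver: installing selenium with pip, downloading chromedriver from the official site, and making it available either by passing its path explicitly or by adding it to the PATH environment variable. It closes with a basic selenium example that scrapes streamer program listings from Huomao TV.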

I. Install selenium

pip install selenium
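
To confirm that the install succeeded, a small check like the following can help. It is only a sketch: the helper name `installed_version` is made up for illustration, and it relies on the standard-library `importlib.metadata` (Python 3.8+) instead of importing selenium itself.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("selenium")
    print("selenium version:", found if found else "not installed")
```

If this prints "not installed", pip most likely installed the package into a different Python environment than the one running the script.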

II. Install chromedriver

https://sites.google.com/a/chromium.org/chromedriver/downloads

The downloaded archive contains a single exe file. After unzipping it, there are two options:

1. Pass the path explicitly every time:

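# Pass the driver path manually (raw string so the backslashes stay literal)
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
path = r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver\chromedriver.exe"
# Selenium 4 takes the path through a Service object; Selenium 3 used executable_path=path
driver = webdriver.Chrome(service=Service(path))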

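2. Add the directory that contains chromedriver to the PATH environment variable, so that webdriver.Chrome() can be called without a path argument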
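
Whether option 2 took effect can be checked from Python before starting a browser. The sketch below uses the standard-library `shutil.which`, which searches the directories listed in PATH. The helper name `find_chromedriver` and the explicit fallback path are illustrative assumptions, not part of selenium.

```python
import shutil


def find_chromedriver(fallback=None):
    """Return the chromedriver path found on PATH, else `fallback` (which may be None)."""
    # shutil.which returns the full path of the executable, or None if PATH has no match.
    # On Windows it also tries the extensions in PATHEXT, so "chromedriver"
    # matches chromedriver.exe.
    return shutil.which("chromedriver") or fallback


if __name__ == "__main__":
    # Raw string so the backslashes in the Windows path are kept literally.
    manual = r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver\chromedriver.exe"
    print(find_chromedriver(fallback=manual))
```

If the printed path is the fallback and not a directory from PATH, the environment variable has not been picked up yet; a newly edited PATH usually only applies to terminals opened after the change.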

III. A simple example: scraping streamer program data from Huomao TV

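from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from bs4 import BeautifulSoup
from pandas import DataFrame
import time

# Pass the driver path manually (raw string so the backslashes stay literal)
path = r"C:\Program Files (x86)\Google\Chrome\Application\chromedriver\chromedriver.exe"
# Selenium 4 takes the path through a Service object; Selenium 3 used executable_path=path
driver = webdriver.Chrome(service=Service(path))

url = "https://www.huomao.com/channel/lol"

# Open the page in the browser
driver.get(url)

# Click the "load more" button at the bottom of the page 6 times;
# the driver's page source is updated with the newly loaded entries
for i in range(6):
    driver.find_element(By.ID, "getmore").click()
    time.sleep(1)  # give the new entries time to load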

# Parse the rendered page
soup = BeautifulSoup(driver.page_source, "html.parser")
page_all = soup.find("div", attrs={"id": "channellist"})
pages = page_all.find_all("div", attrs={"class": "list-smallbox"})

name = []
title = []
watching = []

for page in pages:
    try:
        this_title = page.find("div", attrs={"class": "title-box"}).find("em").string.strip()
        temp = page.find_all("p")
        this_name = temp[1].find("span").string.strip()
        this_watching = temp[1].find_all("span")[1].string.strip()
    except (AttributeError, IndexError):
        # Skip entries that lack any of the expected elements
        continue
    title.append(this_title)
    name.append(this_name)
    watching.append(this_watching)

result = DataFrame({
    "主播名": name,            # streamer name
    "节目名": title,           # program title
    "在线观看人数": watching,   # current viewer count
})

# The file is created if it does not exist (writing .xlsx requires openpyxl)
result.to_excel("E:\\resultLol.xlsx", sheet_name="Sheet1")
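
The fixed loop of six clicks in the script raises an exception if the "load more" button disappears before the sixth click. A more tolerant variant can be sketched as below. The helper `click_up_to` is hypothetical and deliberately independent of selenium: it accepts any zero-argument callable, so the real click is supplied by the caller.

```python
import time


def click_up_to(click, times, pause=1.0, stop_on=(Exception,)):
    """Call `click()` up to `times` times, sleeping `pause` seconds after each success.

    Stops early when `click` raises one of the `stop_on` exception types
    (for example, when the button is no longer on the page).
    Returns the number of successful clicks.
    """
    done = 0
    for _ in range(times):
        try:
            click()
        except stop_on:
            break
        done += 1
        time.sleep(pause)
    return done
```

With the driver from the script, a call might look like `click_up_to(lambda: driver.find_element("id", "getmore").click(), times=6)`. Passing selenium's `NoSuchElementException` as `stop_on` would narrow what counts as "the button is gone", so that unrelated errors still surface.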
