WebDriver timeout waiting for driver...

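This post documents a startup timeout encountered when running PhantomJS inside a Docker container, and shares the fix: installing PhantomJS's hidden dependency libraries, including libfontconfig1, fontconfig, and libfreetype6-dev.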


I recently ran into the following problem while using PhantomJS:

java.lang.ExceptionInInitializerError
    at be.axians.actemium.milter.helper.BrowserTestParent.openBrowser(BrowserTestParent.java:32)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)

    ...

    at org.gradle.internal.concurrent.ThreadFactoryImpl$ManagedThreadRunnable.run(ThreadFactoryImpl.java:55)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.openqa.selenium.WebDriverException: Timed out waiting for driver server to start.
Build info: version: '3.4.0', revision: 'unknown', time: 'unknown'
System info: host: 'master', ip: '172.17.0.2', os.name: 'Linux', os.arch: 'amd64', os.version: '3.10.0-514.26.1.el7.x86_64', java.version: '1.8.0_191'
Driver info: driver.version: PhantomJSDriver
    at org.openqa.selenium.remote.service.DriverService.waitUntilAvailable(DriverService.java:193)
    at org.openqa.selenium.remote.service.DriverService.start(DriverService.java:181)
    at org.openqa.selenium.phantomjs.PhantomJSCommandExecutor.execute(PhantomJSCommandExecutor.java:78)
    at org.openqa.selenium.remote.RemoteWebDriver.execute(RemoteWebDriver.java:637)
    at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:250)
    at org.openqa.selenium.remote.RemoteWebDriver.startSession(RemoteWebDriver.java:236)
    at org.openqa.selenium.remote.RemoteWebDriver.<init>(RemoteWebDriver.java:137)
    at org.openqa.selenium.phantomjs.PhantomJSDriver.<init>(PhantomJSDriver.java:116)
    at be.axians.actemium.milter.helper.WebDriverManager.<clinit>(WebDriverManager.java:53)
    ... 52 more
Caused by: org.openqa.selenium.net.UrlChecker$TimeoutException: Timed out waiting for [http://localhost:7465/status] to be available after 20001 ms
    at org.openqa.selenium.net.UrlChecker.waitUntilAvailable(UrlChecker.java:107)
    at org.openqa.selenium.remote.service.DriverService.waitUntilAvailable(DriverService.java:190)
    ... 60 more
Caused by: com.google.common.util.concurrent.UncheckedTimeoutException: java.util.concurrent.TimeoutException
    at com.google.common.util.concurrent.SimpleTimeLimiter.callWithTimeout(SimpleTimeLimiter.java:140)
    at org.openqa.selenium.net.UrlChecker.waitUntilAvailable(UrlChecker.java:80)
    ... 61 more
Caused by: java.util.concurrent.TimeoutException
    at java.util.concurrent.FutureTask.get(FutureTask.java:205)
    at com.google.common.util.concurrent.SimpleTimeLimiter.callWithTimeout(SimpleTimeLimiter.java:128)
    ... 62 more

The same code runs fine on Windows and on macOS. The problem only appears inside the Docker container.
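A container-only failure like this often means the driver binary cannot start at all, so the status URL in the trace never comes up. The original post does not show how the cause was found; the following is only a minimal sketch of one way to check, assuming `ldd` and `awk` are available in the image. The helper name `missing_libs` and the binary path in the usage comment are hypothetical.

```shell
# Print the names of shared libraries that ldd reports as "not found".
# The function reads ldd output on stdin, so it works on saved output too.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Usage inside the container (the path to phantomjs is an assumption):
#   ldd /usr/local/bin/phantomjs | missing_libs
```

If this prints font-related libraries such as the fontconfig or freetype shared objects, the missing packages described below are the likely cause.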

Some searching turned up the fix, which is as follows:

yum install -y libfontconfig1 fontconfig libfontconfig1-dev libfreetype6-dev

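This installs a hidden dependency of PhantomJS: the font libraries. Note that libfontconfig1, libfontconfig1-dev, and libfreetype6-dev are Debian/Ubuntu package names. On a yum-based image, probably only fontconfig resolves and the other names are reported as unavailable, so fontconfig is likely the package that actually fixes the problem here.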
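Because the package names differ between distributions, the install step can be written to pick names by package manager. This is a sketch under assumptions rather than the original author's method: the function names are hypothetical, and the package lists cover only the runtime fontconfig and freetype libraries, not the -dev packages from the command above.

```shell
# Map a package manager to the runtime font packages PhantomJS needs.
# These lists are an assumption based on common distro package names.
fontconfig_packages() {
    case "$1" in
        yum|dnf) echo "fontconfig freetype" ;;
        apt-get) echo "libfontconfig1 libfreetype6" ;;
        *)       return 1 ;;
    esac
}

# Install the packages with whichever supported manager is present.
install_phantomjs_fonts() {
    for pm in yum apt-get; do
        if command -v "$pm" >/dev/null 2>&1; then
            # Unquoted on purpose: the list must split into separate arguments.
            "$pm" install -y $(fontconfig_packages "$pm")
            return
        fi
    done
    echo "no supported package manager found" >&2
    return 1
}

# Usage (on Debian/Ubuntu images, run "apt-get update" first):
#   install_phantomjs_fonts
```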

Reposted from: https://blog.51cto.com/8745668/2310507
