Fixing the slow Chromium download when calling req.html.render() with the requests_html module

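This article describes a Chromium download problem encountered when rendering pages with the requests_html library, and how to solve it. Downloading Chromium manually and placing it in the path that pyppeteer expects avoids the very slow download triggered by the first call to render().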


1. Step one: the code is as follows:

from requests_html import HTMLSession

url = "https://www.baidu.com/"

# Request headers that imitate a desktop Chrome browser
headers = {
    "Host": "www.baidu.com",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/73.0.3683.103 Safari/537.36"
}

session = HTMLSession()
req = session.get(url, headers=headers)
req.encoding = "utf-8"

# The first call to render() triggers the Chromium download
req.html.render()

# Take the first <a class="mnav"> element and print its text and link
result = req.html.find("a.mnav", first=True)
print(req.status_code)
print(result.text)
print(result.attrs.get('href'))

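2. Because this is the first call to render(), Chromium has to be downloaded and installed first. Unfortunately the download was far too slow: after several minutes of waiting it had reached only 2%.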

3. The fix, step by step:

3.1 Download Chromium manually

https://npm.taobao.org/mirrors/chromium-browser-snapshots/Win_x64/650583/

After the download finishes, unzip the archive.
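The unzip step can also be done from Python. The following is a minimal sketch using only the standard library; the function name extract_chromium is hypothetical, and the archive name and layout are assumptions, since the article does not state them.

```python
import zipfile
from pathlib import Path


def extract_chromium(zip_path, dest_dir):
    """Extract a downloaded Chromium archive into dest_dir.

    Returns the sorted top-level names found in the archive, so the caller
    can see which folder the files were unpacked into.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)
        # Zip member names always use "/" as the separator
        return sorted({name.split("/")[0] for name in archive.namelist()})
```

For example, extract_chromium("chrome-win.zip", "chromium_unpacked") would unpack a file saved under that (assumed) name into a chromium_unpacked folder next to the script.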

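3.2 Which path does requests_html actually use to launch Chromium?

3.2.1 Go to the \Lib\site-packages\pyppeteer directory under the Python installation directory

On the author's machine this is: C:\Users\Ray\AppData\Local\Programs\Python\Python37\Lib\site-packages\pyppeteer

3.2.2 Open the file chromium_downloader.py

Find this code: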

chromiumExecutable = {
    'linux': DOWNLOADS_FOLDER / REVISION / 'chrome-linux' / 'chrome',
    'mac': (DOWNLOADS_FOLDER / REVISION / 'chrome-mac' / 'Chromium.app' /
            'Contents' / 'MacOS' / 'Chromium'),
    'win32': DOWNLOADS_FOLDER / REVISION / 'chrome-win32' / 'chrome.exe',
    'win64': DOWNLOADS_FOLDER / REVISION / 'chrome-win32' / 'chrome.exe',
}

From the above, the Chromium path for win64 (the author's Windows 10 system is 64-bit) is:

DOWNLOADS_FOLDER / REVISION / 'chrome-win32' / 'chrome.exe',

So what exactly are DOWNLOADS_FOLDER and REVISION?

Looking further up in the same file, you will find the following code:

DOWNLOADS_FOLDER = Path(__pyppeteer_home__) / 'local-chromium'

REVISION = os.environ.get('PYPPETEER_CHROMIUM_REVISION', __chromium_revision__)

You can print both values with print(); the code is as follows:

import os
from pathlib import Path

from pyppeteer import __chromium_revision__, __pyppeteer_home__

# Same definitions as in chromium_downloader.py
DOWNLOADS_FOLDER = Path(__pyppeteer_home__) / 'local-chromium'

REVISION = os.environ.get('PYPPETEER_CHROMIUM_REVISION', __chromium_revision__)

print(DOWNLOADS_FOLDER)

print(REVISION)

Run the .py file to see the values of the two variables.

From this, the Chromium path is: C:\Users\Ray\AppData\Local\pyppeteer\pyppeteer\local-chromium\575458\chrome-win32\chrome.exe
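As a sketch of how the pieces combine into this path, the snippet below rebuilds it from the home directory and revision without importing pyppeteer. The function name expected_chromium_path and the WINDOWS_SUBPATH table are hypothetical helpers; only the Windows entries of the chromiumExecutable mapping are mirrored, and PureWindowsPath is used so the result looks the same on any operating system.

```python
from pathlib import PureWindowsPath

# Windows entries copied from the chromiumExecutable mapping quoted in the article
WINDOWS_SUBPATH = {
    "win32": ("chrome-win32", "chrome.exe"),
    "win64": ("chrome-win32", "chrome.exe"),
}


def expected_chromium_path(pyppeteer_home, revision, platform="win64"):
    """Build DOWNLOADS_FOLDER / REVISION / 'chrome-win32' / 'chrome.exe'."""
    downloads_folder = PureWindowsPath(pyppeteer_home) / "local-chromium"
    return downloads_folder.joinpath(revision, *WINDOWS_SUBPATH[platform])


# Values taken from the author's machine as shown in the article
path = expected_chromium_path(
    r"C:\Users\Ray\AppData\Local\pyppeteer\pyppeteer", "575458"
)
print(path)
```

With the author's home directory and revision, this prints the same path as the one given above.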

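So create the folders yourself, all the way down to the chrome-win32 folder, and copy the files from the Chromium download above into that directory. The revision folder has to be named after the REVISION value printed earlier (575458 in the path shown), even though the snapshot in the download link is numbered 650583.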
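The copy step can be scripted as well. This is a minimal sketch under a few assumptions: install_chromium is a hypothetical helper, the extracted folder is assumed to contain chrome.exe at its top level, and dirs_exist_ok requires Python 3.8 or later (the author's Python 3.7 would need a manual copy loop instead).

```python
import shutil
from pathlib import Path


def install_chromium(extracted_dir, target_dir):
    """Copy the unpacked Chromium files into the folder pyppeteer checks.

    extracted_dir: folder that directly contains chrome.exe (assumption)
    target_dir:    the .../local-chromium/<REVISION>/chrome-win32 folder
    Returns the path of the copied chrome.exe.
    """
    src = Path(extracted_dir)
    dst = Path(target_dir)
    if not (src / "chrome.exe").is_file():
        raise FileNotFoundError(f"chrome.exe not found in {src}")
    # copytree creates any missing parent folders of dst;
    # dirs_exist_ok=True (Python 3.8+) tolerates an already existing dst
    shutil.copytree(src, dst, dirs_exist_ok=True)
    return dst / "chrome.exe"
```

The target_dir argument would be the chrome-win32 folder from the path printed in the previous step.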

4. Run the code from step one again; it now prints the expected output.

Source of the idea: https://github.com/GoogleChrome/puppeteer/issues/1597

 

Reposted from: https://www.cnblogs.com/xiaoaiyiwan/p/10776493.html
