最简单的方法解决安装bs4的lxml解析器pip install lxml安装报错失败问题

最新推荐文章于 2023-10-13 17:18:25 发布

原创最新推荐文章于 2023-10-13 17:18:25 发布 · 7k 阅读

14 ·

CC 4.0 BY-SA版权

python分享专栏收录该内容

71 篇文章

订阅专栏

本文介绍如何解决bs4及lxml在Python环境中安装时出现的问题，包括使用离线包进行安装的具体步骤，并展示了安装成功后的简单示例。

最近使用爬虫需要用到bs4，无论是框架scrapy，还是requests请求后解析。都需要使用html解析库。

当然正则是可以代替一部分搜索。html解析是必不可少的。

网上推荐 lxml的比较多，优点：稳定，高效

bs4文档也是非常推荐使用

我电脑上面共存python2/3环境 pip3是python3的安装第三库

我在安装的时候遇到安装卡死或者报错的问题，看了网上教程五花八门。

我是尝试了安装离线包的方法，也可以成功安装的。

通过以下网址下载自己对于的whl安装包。

https://www.lfd.uci.edu/~gohlke/pythonlibs/#lxml

找到需要下载的包名

点击跳到

下载后使用管理员身份打开cmd.exe。注意必须需要管理员，否则容易会权限问题导致安装失败

管理员打开会有administrator的标识

命令：cd 文件保存的路径，

cd到文件目录使用pip3 install 文件全名回车即可

使用pip3 list命令看是否安装成功

from bs4 import BeautifulSoup

html_doc = """
<html><head><title>The Dormouse's story</title></head>
<body>
<p class="title"><b>The Dormouse's story</b></p>

<p class="story">Once upon a time there were three little sisters; and their names were
<a href="http://example.com/elsie" class="sister" id="link1">Elsie</a>,
<a href="http://example.com/lacie" class="sister" id="link2">Lacie</a> and
<a href="http://example.com/tillie" class="sister" id="link3">Tillie</a>;
and they lived at the bottom of a well.</p>

<p class="story">...</p>
"""

soup = BeautifulSoup(html_doc, 'lxml')

print(soup)

执行结果