Scrapy

felcon

License: CC 4.0 BY-SA

Category: Scrapy

Article link: https://blog.youkuaiyun.com/felcon/article/details/46124877

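This article explains how to resolve errors encountered when installing Scrapy and running a spider, including installing the dependencies service_identity and pywin32 and fixing a DLL load failure.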

1. Installation

Follow the installation guide in the official documentation step by step. I installed on Windows:
http://scrapy-chs.readthedocs.org/zh_CN/latest/intro/install.html#windows

2. First steps

Continue with the official documentation, this time the tutorial:
http://scrapy-chs.readthedocs.org/zh_CN/latest/intro/tutorial.html

However, running the spider produced the following warning:

E:\Python Workspace\tutorial>scrapy crawl dmoz
:0: UserWarning: You do not have a working installation of the service_identity module: 'No module named service_identity'.  Please install it from <https://pypi.python.org/pypi/service_identity> and make sure all of its dependencies are satisfied.  Without the service_identity module and a recent enough pyOpenSSL to support it, Twisted can perform only rudimentary TLS client hostname verification.  Many valid certificate/hostname mappings may be rejected.
2015-05-28 11:23:20+0800 [scrapy] INFO: Scrapy 0.24.6 started (bot: tutorial)

As the message suggests, download service_identity from https://pypi.python.org/pypi/service_identity#downloads (get the .whl file).
Then install it with pip: pip install service_identity-14.0.0-py2.py3-none-any.whl
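
Before re-running the spider, a quick check like the sketch below can confirm that the modules named in the warning are now visible to the interpreter. This is not from the original post: the helper name `missing_modules` is hypothetical, and the sketch assumes Python 3 and the standard-library `importlib.util.find_spec` (the post itself ran Scrapy 0.24.6). `OpenSSL` is the import name of pyOpenSSL, which the warning also mentions.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be located by this interpreter."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Modules named in the Twisted warning: service_identity and
    # pyOpenSSL (imported as OpenSSL).
    print(missing_modules(["service_identity", "OpenSSL"]))
```

An empty list means both modules were found. Note that this only locates the modules; it does not prove they import cleanly.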

Running it again produced another error:

raise ImportError("Error loading object '%s': %s" % (path, e))
ImportError: Error loading object 'scrapy.core.downloader.handlers.s3.S3DownloadHandler': DLL load failed: 找不到指定的模块。

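The Windows message at the end of the error means "The specified module could not be found." The fix is to install pywin32. One caveat: the installer I downloaded from the official site failed during installation, while another copy I had downloaded earlier, with exactly the same version number and file size, installed without problems.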
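
A DLL load failure is different from a missing package: the module is found, but importing it fails. The sketch below attempts a real import and reports the error text, which can help confirm whether pywin32 is usable after installation. The helper name `try_import` is hypothetical, and `win32api` and `win32con` are assumed here only as representative pywin32 modules; the post does not say which module the failing import chain needed.

```python
import importlib


def try_import(name):
    """Attempt a real import; return (True, None) or (False, error message)."""
    try:
        importlib.import_module(name)
    except ImportError as exc:
        # "DLL load failed" surfaces as an ImportError carrying the OS message.
        return False, str(exc)
    return True, None


if __name__ == "__main__":
    # Assumed representative pywin32 modules; adjust to whatever fails for you.
    for module in ("win32api", "win32con"):
        ok, error = try_import(module)
        print(module, "OK" if ok else "FAILED: %s" % error)
```

If a module reports FAILED with a DLL message even though pywin32 appears installed, reinstalling from a different copy of the installer, as described above, is the only remedy the post offers.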