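start_urls should be a list, not a value wrapped in parentheses! The lone "h" at the end of the error below suggests that start_urls was actually a plain string: ('http://...') without a trailing comma is a str, not a tuple, so Scrapy iterates over it one character at a time. A real tuple such as ('http://...',) would also work, but a list is the safer convention.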
2018-06-11 16:01:13 [scrapy.core.engine] INFO: Spider opened
2018-06-11 16:01:13 [scrapy.extensions.logstats] INFO: Crawled 0 pages (at 0 pages/min), scraped 0 items (at 0 items/min)
2018-06-11 16:01:13 [scrapy.extensions.telnet] DEBUG: Telnet console listening on 127.0.0.1:6023
2018-06-11 16:01:13 [scrapy.core.engine] ERROR: Error while obtaining start requests
Traceback (most recent call last):
File "D:\PyProject\venv\lib\site-packages\scrapy\core\engine.py", line 127, in _next_request
request = next(slot.start_requests)
File "D:\PyProject\venv\lib\site-packages\scrapy\spiders\__init__.py", line 83, in start_requests
yield Request(url, dont_filter=True)
File "D:\PyProject\venv\lib\site-packages\scrapy\http\request\__init__.py", line 25, in __init__
self._set_url(url)
File "D:\PyProject\venv\lib\site-packages\scrapy\http\request\__init__.py", line 62, in _set_url
raise ValueError('Missing scheme in request url: %s' % self._url)
ValueError: Missing scheme in request url: h
2018-06-11 16:01:13 [scrapy.core.engine] INFO: Closing spider (finished)
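Fixing "ValueError: Missing scheme in request url" in Scrapy
This post records a ValueError encountered while running a Scrapy spider. The cause was start_urls being written in parentheses (intended as a tuple, but parsed by Python as a single string) instead of as a list. The fix is to make start_urls a list of complete URLs, each including its scheme (http:// or https://).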
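
The original spider code is not shown in this post, so the following is only a minimal sketch of the likely mistake, written in plain Python without importing Scrapy. The URL http://example.com/ and the helper first_request_url are hypothetical; the helper merely imitates the "for url in start_urls" loop visible in the traceback (scrapy/spiders/__init__.py, start_requests).

```python
# Hypothetical URL, used only for illustration.
URL = "http://example.com/"

# Parentheses alone do not create a tuple: this is still a plain str.
start_urls_wrong = ("http://example.com/")

# A trailing comma makes a real one-element tuple; this form would work too.
start_urls_tuple = ("http://example.com/",)

# The conventional form: a list of complete URLs, scheme included.
start_urls_list = ["http://example.com/"]


def first_request_url(start_urls):
    """Return the first item produced by iterating start_urls.

    Imitates the loop in the default start_requests, which builds
    one Request per item of start_urls.
    """
    for url in start_urls:
        return url
    return None


# Iterating a str yields single characters, so the first "URL" is just "h",
# which has no scheme and would trigger the ValueError shown in the log.
print(repr(first_request_url(start_urls_wrong)))
print(repr(first_request_url(start_urls_tuple)))
print(repr(first_request_url(start_urls_list)))
```

If the first line printed is a single character rather than a full URL, start_urls is a string; wrapping the URL in square brackets fixes it.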



