Problem description:
Running scrapy crawl tutu produces the following error:
[root@Uu tutu]# scrapy crawl tutu
Traceback (most recent call last):
  File "/usr/bin/scrapy", line 11, in <module>
    sys.exit(execute())
  File "/usr/lib64/python2.7/site-packages/scrapy/cmdline.py", line 149, in execute
    cmd.crawler_process = CrawlerProcess(settings)
  File "/usr/lib64/python2.7/site-packages/scrapy/crawler.py", line 249, in __init__
    super(CrawlerProcess, self).__init__(settings)
  File "/usr/lib64/python2.7/site-packages/scrapy/crawler.py", line 137, in __init__
    self.spider_loader = _get_spider_loader(settings)
  File "/usr/lib64/python2.7/site-packages/scrapy/crawler.py", line 336, in _get_spider_loader
    return loader_cls.from_settings(settings.frozencopy())
  File "/usr/lib64/python2.7/site-packages/scrapy/spiderloader.py", line 61, in from_settings
    return cls(settings)
  File "/usr/lib64/python2.7/site-packages/scrapy/spiderloader.py", line 25, in __init__
    self._load_all_spiders()
  File "/usr/lib64/python2.7/site-packages/scrapy/spiderloader.py", line 47, in _load_all_spiders
    for module in walk_modules(name):
  File "/usr/lib64/python2.7/site-packages/scrapy/utils/misc.py", line 71, in walk_modules
    submod = import_module(fullpath)
  File "/usr/lib64/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/home/BS/scrapy/tutu/tutu/spiders/tutu_spider.py", line 12
    imgurls = ['imgurl'] = response.css('.post img::attr(src)').extract()
SyntaxError: can't assign to literal
This is a syntax error: Python cannot assign a value to a literal.
After analysis, the problem was resolved as follows:
Solution:
[root@Uu tutu]# cd tutu/spiders/
[root@Uu spiders]# vi tutu_spider.py
import scrapy
from tutu.items import TutuItem
class TutuSpider(scrapy.Spider):
    name = 'tutu'
    allowed_domains = ['lab.scrapyd.cn']
    start_urls = [
        'http://lab.scrapyd.cn/archives/55.html'
    ]
    def parse(self, response):
        item = TutuItem()
        imgurls = response.css(".post img::attr(src)").extract()
        item['imgurl'] = imgurls
        yield item
"tutu_spider.py" 14L, 345C written
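Analysis shows that the quotes inside the css call are not the cause: Python treats single-quoted and double-quoted strings identically. The actual error is the chained assignment on line 12 of the original file, imgurls = ['imgurl'] = response.css(...).extract(), where ['imgurl'] is a list containing a string literal and therefore cannot be an assignment target.
The fix is to assign the extracted URLs to imgurls first and then store them in the item with item['imgurl'] = imgurls, as in the corrected file above.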
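The diagnosis can be checked without Scrapy. The following is a minimal sketch, not part of the original spider: it hands three variants of the statement to Python's built-in compile() and records which ones parse. The name extract is only a placeholder standing in for response.css(...).extract(), and compiles is a hypothetical helper introduced for illustration.

```python
# Sketch: check which variants of the assignment Python can parse.
# `extract` is a placeholder name (assumption); it is never called,
# because compile() only parses the statement.

broken = "imgurls = ['imgurl'] = extract('.post img::attr(src)')"
fixed_single_quotes = "imgurls = extract('.post img::attr(src)')"
fixed_double_quotes = 'imgurls = extract(".post img::attr(src)")'


def compiles(source):
    """Return True if the statement parses, False if it raises SyntaxError."""
    try:
        compile(source, "<sketch>", "exec")
        return True
    except SyntaxError:
        return False


results = {
    "chained assignment to ['imgurl']": compiles(broken),
    "single quotes, plain assignment": compiles(fixed_single_quotes),
    "double quotes, plain assignment": compiles(fixed_double_quotes),
}

for label, ok in results.items():
    print("%s: %s" % (label, "parses" if ok else "SyntaxError"))
```

Only the chained assignment is rejected; both quote styles parse, which suggests the fix lies in removing ['imgurl'] from the left-hand side rather than in changing the quotes.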