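Scrapy error

This post describes an error encountered while scraping web pages with Scrapy and how it was resolved. The error was caused by the callback passed to a Request, and it was fixed by changing how the callback function was referenced.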


Error when scraping data with Scrapy

Traceback (most recent call last):
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 1386, in _inlineCallbacks
  result = g.send(result)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\downloader\middleware.py", line 43, in process_request
  defer.returnValue((yield download_func(request=request,spider=spider)))
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 1363, in returnValue
  raise _DefGen_Return(val)
twisted.internet.defer._DefGen_Return: <200 http://ios.jobbole.com/all-posts/page/2/>

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\software\Python\Python35\lib\site-packages\scrapy\utils\defer.py", line 45, in mustbe_deferred
  result = f(*args, **kw)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\spidermw.py", line 49, in process_spider_input
  return scrape_func(response, request, spider)
  File "C:\software\Python\Python35\lib\site-packages\scrapy\core\scraper.py", line 146, in call_spider
  dfd.addCallbacks(request.callback or spider.parse, request.errback)
  File "C:\software\Python\Python35\lib\site-packages\twisted\internet\defer.py", line 303, in addCallbacks
  assert callable(callback)
AssertionError

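After some thought, the assert callable(callback) line at the bottom of the traceback suggested that the problem lay with the callback function handed to the Request. Inspecting the spider's source code:
  def parse(self, response):
      selector = Selector(response)
      # Collect the article links
      article_urls = selector.xpath('//a[@class="archive-title"]/@href').extract()
      for article_url in article_urls:
          yield Request(url=article_url, callback=self.parse_content)
      # Follow the link to the next page
      next_page_url = selector.xpath('//a[contains(@class, "next")]/@href').extract()
      if next_page_url:
          yield Request(url=next_page_url[0], callback="parse")  # bug: should be callback=self.parse
      else:
          print("Already on the last page")
The first Request worked, while the second one, which requests the next page, never took effect, so I guessed that this was where the problem was. That Request passes the string "parse" as its callback, and a string is not callable, which is exactly what the assertion checks. After changing callback="parse" to callback=self.parse, the problem was solved.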
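The difference between the two forms can be reproduced without Scrapy at all. The sketch below is only an illustration using the standard library: DemoSpider, add_callback, and resolve_callback are hypothetical names made up for this example, not Scrapy or Twisted APIs. add_callback simply repeats the assert callable(callback) check shown in the traceback.

```python
# Hypothetical stand-ins for illustration only; not Scrapy or Twisted APIs.
class DemoSpider:
    def parse(self, response):
        return "parsed " + response


def add_callback(callback):
    # Same check as the one at the bottom of the traceback.
    assert callable(callback)
    return callback


def resolve_callback(spider, callback):
    """Turn a method name into the bound method; pass callables through."""
    if isinstance(callback, str):
        callback = getattr(spider, callback)
    return callback


spider = DemoSpider()

# A bound method is callable, so the check passes.
accepted = add_callback(spider.parse)

# A plain string is not callable, so the same check raises AssertionError.
try:
    add_callback("parse")
    string_accepted = True
except AssertionError:
    string_accepted = False

# Looking the name up on the spider first gives a callable again.
resolved = resolve_callback(spider, "parse")
```

In a real spider the simplest fix is the one described above: pass the bound method self.parse directly. A lookup helper like resolve_callback is only an assumption about how a method name could be converted if a string is all that is available.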


