1. Error 1: wrong URL
Crawled (200) <GET http://www.itcast.cn/robots.txt> (referer: None)
DEBUG: Crawled (404) <GET http://www.itcast.cn/channel/teacher.shtml/> (referer: None)
Solution: copy the complete URL exactly as it appears in the browser address bar
start_urls = ['http://www.itcast.cn/channel/teacher.shtml/']        # original URL (404s)
start_urls = ['http://www.itcast.cn/channel/teacher.shtml#ajavaee'] # corrected URL
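Why the fix works: the stray trailing slash makes the server resolve a different path (hence the 404), while a `#fragment` is never sent to the server at all, so the corrected URL requests plain `/channel/teacher.shtml`. A quick way to see the difference with the standard library:

```python
from urllib.parse import urlparse

bad = "http://www.itcast.cn/channel/teacher.shtml/"
good = "http://www.itcast.cn/channel/teacher.shtml#ajavaee"

# The trailing slash is part of the path the server sees:
print(urlparse(bad).path)       # /channel/teacher.shtml/

# The fragment is stripped client-side; the server only sees the path:
print(urlparse(good).path)      # /channel/teacher.shtml
print(urlparse(good).fragment)  # ajavaee
```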
2. Error 2: [scrapy.spidermiddlewares.offsite] DEBUG: Filtered offsite request to
Solution: pass dont_filter=True when yielding the Request
yield scrapy.Request(
    item["s_href"],
    callback=self.parse_book_list,
    meta={"item": deepcopy(item)},
    dont_filter=True,  # bypass the offsite/duplicate filter for this request
)
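Background on why this message appears: Scrapy's OffsiteMiddleware drops any request whose hostname is not covered by the spider's allowed_domains. A minimal sketch of that check (simplified; real Scrapy also handles ports and compiles a regex, and the example hostnames here are only illustrative):

```python
from urllib.parse import urlparse

def is_offsite(url, allowed_domains):
    """Return True if the URL's host is neither an allowed domain
    nor a subdomain of one (the request would be filtered)."""
    host = urlparse(url).hostname or ""
    return not any(host == d or host.endswith("." + d) for d in allowed_domains)

# A subdomain of an allowed domain passes:
print(is_offsite("http://book.itcast.cn/list", ["itcast.cn"]))  # False
# A foreign host would be filtered unless dont_filter=True is set:
print(is_offsite("http://www.example.com/x", ["itcast.cn"]))    # True
```

So besides dont_filter=True, the cleaner fix when the target host is legitimate is to add its domain to allowed_domains; dont_filter=True also disables duplicate-request filtering for that request.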