Scrapy Source Code Analysis 17: Requests and Responses VI (Finally Done)

License: CC BY-SA 4.0

Tags: #python

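This article walks through the Response object in the Scrapy framework: its constructor, its attributes such as url, status, headers and body, and its methods such as follow_all() and replace(). A Response object represents an HTTP response, which is usually downloaded by the Downloader and then handed to the spider for processing. TextResponse, HtmlResponse and XmlResponse are subclasses of Response; TextResponse adds text-encoding support, while HtmlResponse and XmlResponse add encoding auto-detection for their respective formats. The follow_all() method follows the links in a page and generates multiple Request instances to crawl them.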

2021SC@SDUSC

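Finally, to wrap up the Software Engineering Applications course, I stayed up late and pushed through the Response part of the code. The analysis below follows the official documentation: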

Response objects:

Class: scrapy.http.Response(*args, **kwargs) (the source code is attached at the end)

A Response object represents an HTTP response, which is usually downloaded (by the Downloader) and fed to the Spiders for processing.

In other words, a spider normally does not build Response objects itself: it receives them in its callbacks after the Downloader has fetched the page.

Parameters

url (str) – the URL of this response
status (int) – the HTTP status of the response. Defaults to 200.
headers (dict) – the headers of this response. The dict values can be strings (for single valued headers) or lists (for multi-valued headers).
body (bytes) – the response body. To access the decoded text as a string, use response.text from an encoding-aware Response subclass, such as TextResponse.
flags (list) – is a list containing the initial values for the Response.flags attribute. If given, the list will be shallow copied.
request (scrapy.http.Request) – the initial value of the Response.request attribute. This represents the Request that generated this response.
certificate (twisted.internet.ssl.Certificate) – an objec