Problem
Traceback (most recent call last):
  File "/Users/pasca/Desktop/github/test_spider/zhihu_img_spider.py", line 55, in <module>
    data = get_page(url)
  File "/Users/pasca/Desktop/github/test_spider/zhihu_img_spider.py", line 31, in get_page
    data = response.json()
  File "/Users/pasca/anaconda3/lib/python3.7/site-packages/requests/models.py", line 897, in json
    return complexjson.loads(self.text, **kwargs)
  File "/Users/pasca/anaconda3/lib/python3.7/json/__init__.py", line 348, in loads
    return _default_decoder.decode(s)
  File "/Users/pasca/anaconda3/lib/python3.7/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/Users/pasca/anaconda3/lib/python3.7/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)
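After some searching, it turned out that the response body was not valid JSON, so response.json() could not parse it directly.
Solution
The annoying part: the cause was one extra entry in the request headers: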
'accept-encoding': 'gzip, deflate, br',
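A minimal sketch of the fix, assuming the culprit is the br (Brotli) entry. The helper name strip_brotli and the user-agent value are hypothetical, and the body of get_page is a guess that only mirrors the function name in the traceback, not the original code.

```python
def strip_brotli(headers):
    """Return a copy of headers with 'br' removed from accept-encoding."""
    cleaned = dict(headers)
    for key in list(cleaned):
        if key.lower() != "accept-encoding":
            continue
        # Keep every encoding except br (ignoring any ";q=..." weight).
        kept = [e.strip() for e in cleaned[key].split(",")
                if e.split(";")[0].strip().lower() != "br"]
        if kept:
            cleaned[key] = ", ".join(kept)
        else:
            # Nothing left: drop the header and let requests send its default.
            del cleaned[key]
    return cleaned


def get_page(url, headers):
    # Assumed body; the original get_page is not shown in the post.
    import requests  # third-party; imported here so strip_brotli has no dependencies

    response = requests.get(url, headers=strip_brotli(headers), timeout=10)
    response.raise_for_status()
    return response.json()


headers = {
    "user-agent": "Mozilla/5.0",  # placeholder value
    "accept-encoding": "gzip, deflate, br",
}
print(strip_brotli(headers)["accept-encoding"])  # gzip, deflate
```

Simply deleting the accept-encoding line from the headers dict should work just as well, since requests then falls back to its own default value.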
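Two headers are involved:
content-encoding, in the response headers
accept-encoding, in the request headers
content-encoding indicates which compression method the server used to send the data to you.
accept-encoding tells the server, when you send the request, which compression formats you are able to decompress. By listing br, the request claimed Brotli support, which requests only has when an optional Brotli package is installed. Without it, the body most likely arrives still compressed, and response.json() fails at the very first character. Removing br from the header (or removing the header altogether) avoids the problem.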
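The effect of an undecoded content-encoding can be reproduced offline. The sketch below uses gzip from the standard library as a stand-in for Brotli (an assumption made only because Brotli is not in the standard library); the payload is made up for illustration.

```python
import gzip
import json

# Made-up JSON payload standing in for the real API response.
payload = json.dumps({"ok": True}).encode("utf-8")

# What the server sends when it applies a content-encoding (gzip here).
compressed = gzip.compress(payload)

# If the client does not decompress, the JSON parser receives garbage text.
undecoded_text = compressed.decode("utf-8", errors="replace")
try:
    json.loads(undecoded_text)
    parsed_without_decoding = True
except json.JSONDecodeError as exc:
    parsed_without_decoding = False
    print("undecoded body:", exc)

# After decompression the same bytes parse normally.
data = json.loads(gzip.decompress(compressed).decode("utf-8"))
print(data)
```

When debugging the real request, comparing response.headers.get('content-encoding') with the first few bytes of response.content is a quick way to check whether the body was actually decoded.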