If a website returns response headers like the following:
Content-Type: text/html
Connection: close
Date: Sat, 30 Jul 2011 18:06:13 GMT
Server: SWS
Vary: Accept-Encoding,X-Up-Calling-Line-id,X-Source-ID,X-Up-Bearer-Type
Cache-Control: max-age=70
Expires: Sat, 30 Jul 2011 18:07:23 GMT
Last-Modified: Sat, 30 Jul 2011 18:05:22 GMT
Content-Encoding: gzip
Content-Length: 70442
FSS-Cache: HIT from 31589010.39519058.42621963
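The Content-Encoding: gzip header means the body is gzip-compressed, so you can decompress it before decoding, like this: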
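import gzip
import urllib.request as ur

# urlopen() does not decompress the body, so read the raw gzip bytes first.
raw = ur.urlopen('http://www.sohu.com').read()
# Decompress, then decode; 'gbk' is the charset assumed for this page.
contents = gzip.decompress(raw).decode('gbk')
print(contents)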
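
The snippet above assumes the server always sends gzip. A slightly more defensive variant, sketched here under a few assumptions, checks the Content-Encoding header first and only decompresses when it says gzip. The helper names decode_body and fetch are hypothetical and not part of the original answer, and the default 'gbk' charset is carried over from the example rather than detected from the response.

```python
import gzip
import urllib.request as ur


def decode_body(raw, content_encoding, charset="gbk"):
    """Decompress raw only if the header labels it as gzip, then decode.

    decode_body is a hypothetical helper; charset defaults to 'gbk' only
    because the original example used it.
    """
    if content_encoding and "gzip" in content_encoding.lower():
        raw = gzip.decompress(raw)
    # errors="replace" keeps going if a few bytes do not match the charset.
    return raw.decode(charset, errors="replace")


def fetch(url, charset="gbk"):
    """Fetch url and return its text (hypothetical helper)."""
    # Ask for gzip explicitly; urllib does not decompress automatically.
    req = ur.Request(url, headers={"Accept-Encoding": "gzip"})
    with ur.urlopen(req) as resp:
        encoding = resp.headers.get("Content-Encoding")
        return decode_body(resp.read(), encoding, charset)
```

Calling fetch('http://www.sohu.com') would then behave like the original code when the response is gzip-compressed, and fall back to a plain decode when it is not.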