Differences between urllib and requests for sending HTTP requests in Python

Article link: https://blog.youkuaiyun.com/Rachel0417/article/details/113365364

Code used to fetch web pages in this exercise

urllib:

import urllib.request



url='http://wz.lanzh.95306.cn/mainPageNoticeList.do?method=init&id=2000001&cur=1'

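# When scraping a page, the HTTP request usually needs extra information such as a User-Agent, cookies, or a proxy server. These are often required to get past a site's anti-scraping checks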

headers={
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Safari/537.36',
    # Host must match the host in url; urllib fills it in automatically if omitted
    'Host':'wz.lanzh.95306.cn'
}

# First create a Request object that carries the URL and the headers

req=urllib.request.Request(url=url,headers=headers)

# Pass it to urllib.request.urlopen() to send the HTTP request. urlopen() returns an http.client.HTTPResponse object, whose body has to be read with read()

rsp=urllib.request.urlopen(req)

# read() returns the page body as bytes; decode() converts it to str, after which Chinese characters display correctly. 'ignore' drops any bytes that are not valid UTF-8

html=rsp.read().decode('utf-8','ignore')
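The build-then-send flow above can be inspected without touching the network. The following is a minimal sketch, not the author's code: it only constructs Request objects and looks at them. The shortened User-Agent string and the form field `cur` are assumptions made for illustration.

```python
import urllib.parse
import urllib.request

URL = "http://wz.lanzh.95306.cn/mainPageNoticeList.do?method=init&id=2000001&cur=1"
HEADERS = {"User-Agent": "Mozilla/5.0"}  # shortened UA string, for illustration only

# Step 1: build the request objects. Nothing is sent yet.
get_req = urllib.request.Request(url=URL, headers=HEADERS)

# Supplying a bytes body switches the default method from GET to POST.
# The form field below is hypothetical.
form = urllib.parse.urlencode({"cur": "1"}).encode("ascii")
post_req = urllib.request.Request(url=URL, data=form, headers=HEADERS)

print(get_req.get_method())              # GET
print(post_req.get_method())             # POST
print(get_req.host)                      # wz.lanzh.95306.cn
print(get_req.get_header("User-agent"))  # header names are stored capitalized

# Step 2 (not executed here): urllib.request.urlopen(get_req) would send it.
```

Because the Request object exists before anything is sent, the method, host, and headers can be checked or adjusted first, which is the main practical difference from the one-call style of requests.get().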

requests:

import requests



url = "https://www.iqiyi.com/"

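# requests builds and sends a GET or POST request in a single call, whereas with urllib.request you first build a Request object (needed for custom headers) and then send it with urlopen()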

header = {"user-agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36"}

# requests.get() takes the URL and optional parameters and returns a Response object

get_response = requests.get(url,headers=header,params=None)

# Build and send a POST request

post_response=requests.post(url,headers=header,data=None,json=None)

# Printing a Response object shows its status code

print(post_response)

# Output: <Response [405]> (405 means Method Not Allowed: this URL does not accept POST)

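# .text is a property (not a method) that returns the body as str, decoded with the encoding requests infers from the HTTP response headers, normally the charset declared in Content-Type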

print(get_response.text)

# Output: the page's HTML source, as a str

print(get_response.content)

# Output: b'<!doctype html>\n<html data-n-head-ssr>\n  <head >\n    <title>\xe7\x8..... These are the raw bytes of the page; they must be decoded to get the same text that get_response.text returns

print(get_response.json)

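# Output: <bound method Response.json of <Response [200]>>. This is the method object itself, not JSON data, because json was not called. get_response.json() parses the body as JSON and raises an exception when the body is not JSON, as with this HTML page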
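For comparison, the urllib counterparts of .content, .text, and .json() can be sketched as follows. This is an illustration under stated assumptions, not part of the original exercise: a data: URL stands in for a real JSON endpoint so the snippet runs without network access, and the payload is made up. The same read/decode/parse steps apply to the object returned by urlopen() for an HTTP URL.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical JSON payload wrapped in a data: URL (no network needed).
payload = json.dumps({"status": "ok", "title": "中文标题"}, ensure_ascii=False)
url = "data:application/json;charset=utf-8," + urllib.parse.quote(payload)

with urllib.request.urlopen(url) as rsp:
    raw = rsp.read()  # bytes, comparable to Response.content
    # Charset declared in Content-Type, comparable to Response.encoding
    charset = rsp.headers.get_content_charset() or "utf-8"

text = raw.decode(charset, "ignore")  # str, comparable to Response.text
data = json.loads(text)               # parsed object, comparable to calling Response.json()

print(type(raw).__name__, type(text).__name__, type(data).__name__)
print(data["title"])
```

With urllib each conversion is an explicit step, while requests exposes the same three views of the body directly on the Response object.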

Reference blogs:

https://blog.youkuaiyun.com/weixin_42213622/article/details/105852794

https://blog.youkuaiyun.com/qq_38783948/article/details/88239109

https://blog.youkuaiyun.com/ytraister/article/details/106376388

The underlying principles are the same; only practice and experimentation make them stick!