Syntax
Methods
response = requests.get(url[, params=None, **kwargs]): sends a GET request. params: the query-string parameters, passed as a dict.
response = requests.post(url[, data=None, **kwargs]): sends a POST request. data: the form data sent in the request body, passed as a dict.
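To see what params and data turn into on the wire, the following minimal sketch prepares a GET and a POST request without sending them. The endpoint https://example.com/search and the field values are placeholders chosen for illustration; they do not come from the examples in this post.

```python
import requests

# Hypothetical endpoint, used only to show how the arguments are encoded.
url = 'https://example.com/search'

# Build the requests without sending them, so no network access is needed.
get_req = requests.Request('GET', url, params={'wd': 'python', 'page': 2}).prepare()
post_req = requests.Request('POST', url, data={'name': 'alice', 'pass': '111111'}).prepare()

print(get_req.url)                       # params end up in the query string
print(post_req.body)                     # data ends up in the request body
print(post_req.headers['Content-Type'])  # form encoding is declared here
```

Preparing a request this way is a convenient check when a server rejects a call and it is unclear whether the arguments were encoded as intended.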
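The response object
response.text: the body decoded to a string, using the encoding that requests infers from the response. When that guess is wrong the text comes out garbled. Two fixes:
# Option 1: decode manually
response.content.decode('utf-8')
# Option 2: let requests detect the encoding from the body
response.encoding = response.apparent_encoding
response.json(): parses the body as JSON and returns the resulting Python object
response.content: the raw body exactly as received, with no decoding applied; its type is bytes
response.status_code: the HTTP status code of the response
response.url: the URL that was actually requested
response.encoding: the encoding used to decode response.text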
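The garbled-text problem with response.text can be reproduced without any network call. The sketch below uses a plain bytes value as a stand-in for response.content (the sample string is made up) and decodes it once with the wrong encoding and once with the right one.

```python
# Stand-in for response.content: the UTF-8 bytes a server might return.
raw = '中文页面'.encode('utf-8')

# What response.text looks like when the wrong encoding is assumed.
# requests falls back to ISO-8859-1 for text/* responses that declare no charset.
garbled = raw.decode('iso-8859-1')

# Manual decoding with the correct encoding recovers the original text.
fixed = raw.decode('utf-8')

print(repr(garbled))
print(fixed)
```

Setting response.encoding before reading response.text has the same effect as the manual decode, and keeps the rest of the code working with response.text.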
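HTTP requests
GET requests
import requests
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64)...'}
url = 'https://www.baidu.com/s?wd={}'.format('美君') # Chinese characters in the URL are percent-encoded automatically
response = requests.get(url, headers=headers)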
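An alternative to formatting the keyword into the URL by hand is to pass it through params and let requests build the query string. The sketch below only prepares the request, so the resulting URL can be inspected without contacting the server; it compares the result with urllib.parse.quote from the standard library.

```python
from urllib.parse import quote

import requests

keyword = '美君'

# Prepare (but do not send) the request so the final URL can be inspected.
prepared = requests.Request(
    'GET', 'https://www.baidu.com/s', params={'wd': keyword}
).prepare()

print(prepared.url)
print(quote(keyword))  # the same percent-encoded form of the keyword
```

Using params also handles characters such as & or = inside a value, which string formatting would leave unescaped.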
POST requests
JSON request body
url = 'https://www.iqianyue.com/mypost'
body = {'name':'林在超','pass':'111111'} # sent as a JSON body
response = requests.post(url, headers=headers, json=body)
Form data
url = 'https://www.iqianyue.com/mypost'
data = {'name':'林在超','pass':'111111'} # sent as URL-encoded form data
response = requests.post(url, headers=headers, data=data)
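The difference between json= and data= is easiest to see by preparing both requests and comparing the Content-Type header and body that would be sent. This is a minimal offline sketch using the same URL and fields as above; nothing is actually posted.

```python
import json
from urllib.parse import parse_qs

import requests

url = 'https://www.iqianyue.com/mypost'
payload = {'name': '林在超', 'pass': '111111'}

# Prepare both variants without sending them.
as_json = requests.Request('POST', url, json=payload).prepare()
as_form = requests.Request('POST', url, data=payload).prepare()

print(as_json.headers['Content-Type'])
print(as_form.headers['Content-Type'])

# Both bodies decode back to the same fields; only the wire format differs.
print(json.loads(as_json.body))
print(parse_qs(as_form.body))
```

Which one to use depends on what the server expects: an HTML login form usually wants form data, while a JSON API usually wants a JSON body.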
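Using a proxy IP
url = 'https://www.baidu.com'
ip = '123.54.194.96:38661'
# a proxy is chosen by the URL's scheme, so an https URL needs an 'https' entry
proxy = {'http': 'http://' + ip, 'https': 'http://' + ip}
response = requests.get(url, proxies=proxy)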
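Because requests picks the proxy by the scheme of the target URL, a proxies dict that only has an 'http' key is ignored for https URLs. The helper below, build_proxies, is a hypothetical convenience function (not part of requests) that fills in both keys from one address; the request itself is left commented out since it needs a live proxy.

```python
def build_proxies(address, scheme='http'):
    """Return a proxies dict that routes both http and https URLs through one proxy."""
    proxy_url = '{}://{}'.format(scheme, address)
    return {'http': proxy_url, 'https': proxy_url}


proxies = build_proxies('123.54.194.96:38661')
print(proxies)

# Usage (needs network access and a working proxy):
# import requests
# response = requests.get('https://www.baidu.com', proxies=proxies, timeout=5)
```

Passing a timeout is advisable with free proxies, which are often slow or dead.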
Using a session
url = "http://www.renren.com/PLog.do"
data = {'email':'96013807@qq.com','password':'pythonspider'}
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64)...'}
session = requests.Session()
session.post(url,data=data,headers=headers)
# The session keeps the login cookies, so this request visits the profile page as a logged-in user
response = session.get('http://www.renren.com/880151247/profile')
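What makes the session work is that cookies and headers stored on it are attached to every later request. The sketch below shows this offline: the cookie name sessionid, its value, and the example.com URL are placeholders standing in for whatever a real login response would set.

```python
import requests

session = requests.Session()

# Headers set on the session are sent with every request made through it.
session.headers.update({'User-Agent': 'Mozilla/5.0'})

# Stand-in for the cookie a successful login response would set.
session.cookies.set('sessionid', 'abc123')

# Prepare (but do not send) a follow-up request through the session.
prepared = session.prepare_request(
    requests.Request('GET', 'https://example.com/profile')
)

print(prepared.headers['User-Agent'])
print(prepared.headers['Cookie'])
```

With plain requests.get calls, each request starts with an empty cookie jar, which is why a login followed by a separate requests.get would still look logged out.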
Scraping JSON data
# Example: download user avatars from NOW Live (now.qq.com)
num = 0
url = '''https://now.qq.com/cgi-bin/now/pc/firstpage/topic_anchor_list?start=1&count=20&topic=%E6%96%B0%E4%BA%BA'''
headers = {'User-Agent':'Mozilla/5.0 (Windows NT 10.0; WOW64)...'}
response = requests.get(url, headers=headers)
json_data = response.json()
datas = json_data['result']['data']
for data in datas:
    img_url = data['room_img_url']
    img_data = requests.get(img_url, headers=headers).content
    # the ./resources directory must already exist
    with open('./resources/{0}.jpg'.format(num), 'wb') as fp:
        fp.write(img_data)
    num += 1
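Indexing straight into json_data['result']['data'] raises KeyError as soon as the API returns an error payload or an entry lacks an image. One possible way to make the parsing step more forgiving is sketched below; extract_image_urls is a hypothetical helper, and the sample payload is invented to mirror the structure used above rather than copied from a real response.

```python
def extract_image_urls(payload, key='room_img_url'):
    """Collect image URLs from a payload shaped like {'result': {'data': [...]}}."""
    items = payload.get('result', {}).get('data', [])
    # Skip entries where the key is missing or empty.
    return [item[key] for item in items if item.get(key)]


# Hypothetical sample mirroring the structure used above; not real API output.
sample = {
    'result': {
        'data': [
            {'room_img_url': 'https://example.com/a.jpg'},
            {'room_img_url': ''},
            {'nickname': 'no image here'},
            {'room_img_url': 'https://example.com/b.jpg'},
        ]
    }
}

urls = extract_image_urls(sample)
print(urls)
```

The download loop can then iterate over the returned list, for example with enumerate to number the files instead of a manual counter.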
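This article shows how to use Python's requests library to send GET and POST requests, covering parameter passing, response handling, proxy IPs, session management, and an example of scraping JSON data.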