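Commonly used attributes and methods of the response object r:
r.text the response body as a string
r.content the response body as bytes
r.encoding get or set the encoding used to decode r.text
r.status_code the HTTP status code
r.headers the response headers
r.url the final URL of the request (after any redirects)
r.json() the response body parsed as JSON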
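To see these attributes side by side without depending on an external site, here is a minimal sketch. The local server, the DemoHandler class, and the /demo path are all assumptions made up for this illustration; only the attributes on r come from requests itself.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class DemoHandler(BaseHTTPRequestHandler):
    """Hypothetical local endpoint that always answers with a small JSON body."""

    def do_GET(self):
        body = json.dumps({"animal": "lion"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo output quiet


# Port 0 lets the operating system pick a free port.
server = HTTPServer(("127.0.0.1", 0), DemoHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()

r = requests.get(f"http://127.0.0.1:{server.server_port}/demo", timeout=5)
server.shutdown()
server.server_close()

print(r.status_code)                  # integer status code
print(r.url)                          # URL the response came from
print(r.encoding)                     # charset taken from the Content-Type header
print(r.headers["Content-Type"])      # header lookup is case-insensitive
print(type(r.content), type(r.text))  # bytes versus str
print(r.json())                       # body parsed into a Python dict
```

Against a real site the same attributes apply; only the values differ.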
Anti-scraping measures:
https://editor.youkuaiyun.com/md/?articleId=104331020
1. Sending GET requests:
(1) GET request without parameters:
Function used: requests.get(url, headers=headers)
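import requests
url = 'http://www.baidu.com/'
# Send a browser-like User-Agent so the request is not rejected as a script
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0',
}
r = requests.get(url=url, headers=headers)
# Uncomment to inspect or override the encoding if r.text looks garbled
# print(r.encoding)
# r.encoding = 'utf-8'
print(r.text)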
(2) GET request with parameters:
Function used: requests.get(url, headers=headers, params=params)
import requests
url = 'http://www.baidu.com/s'
# Baidu's search endpoint reads the query from 'wd'
data = {
    'ie': 'utf8',
    'wd': '中国',
}
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0',
}
# params is short for parameters; requests URL-encodes the dict into the query string
r = requests.get(url=url, headers=headers, params=data)
# Write the result to a file
with open('baidu.html', 'wb') as fp:
    fp.write(r.content)
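To check what params actually does to the URL, a request can be prepared without being sent. The following is a small sketch; the example.com endpoint and the q and page parameters are hypothetical and only serve to show the encoding step.

```python
import requests

# Hypothetical endpoint and parameters, used only to show the encoding step.
url = "https://example.com/search"
params = {"q": "中国", "page": 2}

# Build the request object and prepare it; nothing goes over the network.
prepared = requests.Request("GET", url, params=params).prepare()

# Non-ASCII values are percent-encoded as UTF-8 and appended as a query string.
print(prepared.url)
# https://example.com/search?q=%E4%B8%AD%E5%9B%BD&page=2
```

Printing prepared.url this way is a convenient debugging step when a site does not return the expected results.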
2. Sending POST requests:
Function used: requests.post(url, headers=headers, data=data)
import requests
url = 'https://cn.bing.com/ttranslatev3?isVertical=1&&IG=4E337E0BDDC94836AC2657E04CB61EE5&IID=translator.5028.4'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0',
}
# Form fields sent in the request body
formdata = {
    'fromLang': 'auto-detect',
    'text': 'lion',
    'to': 'zh-Hans',
}
r = requests.post(url=url, headers=headers, data=formdata)
# The response body is JSON, so parse it with r.json()
print(r.json())
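It can help to see what the data argument turns into on the wire. The sketch below prepares two requests without sending them: one with data= and, for contrast, one with json=. The example.com URL is a hypothetical stand-in; the form fields are the ones used above.

```python
import requests

url = "https://example.com/translate"  # hypothetical endpoint
payload = {"fromLang": "auto-detect", "text": "lion", "to": "zh-Hans"}

# data= encodes the dict as an HTML form body.
form_req = requests.Request("POST", url, data=payload).prepare()
print(form_req.headers["Content-Type"])  # application/x-www-form-urlencoded
print(form_req.body)                     # fromLang=auto-detect&text=lion&to=zh-Hans

# json= serializes the same dict as a JSON document instead.
json_req = requests.Request("POST", url, json=payload).prepare()
print(json_req.headers["Content-Type"])  # application/json
```

Which of the two a site expects depends on the site; the browser's network panel usually shows the format of the original request.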
3. Switching proxies:
Function used: requests.get(url, headers=headers, proxies=proxies)
or: requests.post(url, headers=headers, data=data, proxies=proxies)
where proxies is passed as a dictionary
import requests
# Replace the placeholders with the address and port of a real proxy
proxies = {
    'http': 'http://xxx.xxx.xxx.xxx:xxx',
    'https': 'https://xxx.xxx.xxx.xxx:xxx',
}
url = 'https://www.baidu.com/s?wd=ip&ie=utf-8'
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:72.0) Gecko/20100101 Firefox/72.0',
}
r = requests.get(url=url, headers=headers, proxies=proxies)
# Save the page that reports the IP address the site sees
with open('ip.html', 'wb') as fp:
    fp.write(r.content)
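The keys of the proxies dictionary are matched against the scheme of the target URL. The sketch below illustrates this with select_proxy, a helper in requests.utils that performs the lookup; calling it directly is only for illustration here, and the proxy addresses are placeholder assumptions.

```python
from requests.utils import select_proxy

# Placeholder proxy addresses (assumptions); substitute a proxy you control.
proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:1080",
}

# An http:// URL picks the entry stored under the "http" key.
print(select_proxy("http://www.baidu.com/", proxies))

# An https:// URL picks the entry stored under the "https" key.
print(select_proxy("https://www.baidu.com/s?wd=ip", proxies))

# A scheme with no matching key gets no proxy.
print(select_proxy("ftp://example.com/file", proxies))
```

So a dictionary that only defines 'http' leaves https:// requests going out directly, which is an easy mistake to miss.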
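4. Creating a session:
Function used: s = requests.Session()
First create a session with s = requests.Session(), then replace requests.get() with s.get().
A session keeps cookies across requests, so it is useful for CAPTCHA logins, staying logged in, and other tasks that need to preserve identity.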
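The following sketch shows, without sending anything, how state stored on a session is attached to each request made through it. The User-Agent string, the sessionid cookie, and the example.com URL are hypothetical; in a real login the cookie would be set by the server's response rather than by hand.

```python
import requests

s = requests.Session()

# Headers set on the session are sent with every request made through it.
s.headers.update({"User-Agent": "demo-client/0.1"})  # hypothetical value

# Stand-in for a cookie the server would set after a successful login.
s.cookies.set("sessionid", "abc123")  # hypothetical name and value

# Prepare (but do not send) a request so the merged headers can be inspected.
prepared = s.prepare_request(requests.Request("GET", "https://example.com/profile"))

print(prepared.headers["User-Agent"])  # demo-client/0.1
print(prepared.headers["Cookie"])      # sessionid=abc123
```

With plain requests.get() each call starts from scratch, so the cookie from a login response would not be sent on the next request.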
5. Exception handling:
Exception used: requests.HTTPError
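import requests
url = 'http://www.maodan.com/'
try:
    r = requests.get(url=url)
    # requests.get() does not raise on 4xx/5xx by itself; raise_for_status() does
    r.raise_for_status()
    print(r.text)
except requests.HTTPError as e:
    # The server responded, but with an error status code
    print(e)
except requests.RequestException as e:
    # Connection failures, timeouts, and other request errors
    print(e)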
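To make the difference between an error status and a failed connection concrete, here is a rough sketch that triggers both against a throwaway local server. The NotFoundHandler class, the describe helper, and the /missing path are assumptions invented for this illustration.

```python
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

import requests


class NotFoundHandler(BaseHTTPRequestHandler):
    """Hypothetical local server that answers every GET with 404."""

    def do_GET(self):
        self.send_error(404)

    def log_message(self, *args):
        pass  # keep the demo output quiet


def describe(url):
    """Return a short description of what happened when fetching url."""
    try:
        r = requests.get(url, timeout=5)
        r.raise_for_status()  # turns 4xx/5xx responses into HTTPError
    except requests.HTTPError as e:
        return f"bad status: {e.response.status_code}"
    except requests.RequestException as e:
        # Base class: also covers connection errors and timeouts.
        return f"request failed: {type(e).__name__}"
    return "ok"


server = HTTPServer(("127.0.0.1", 0), NotFoundHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
port = server.server_port

# The server answers, but with an error status.
status_result = describe(f"http://127.0.0.1:{port}/missing")

server.shutdown()
server.server_close()

# Nothing is listening on the port any more, so the connection itself fails.
refused_result = describe(f"http://127.0.0.1:{port}/missing")

print(status_result)
print(refused_result)
```

Catching only requests.HTTPError would let the second case propagate, since connection errors are not a subclass of it; requests.RequestException covers both.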