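[Quick Start] Requests: A Walkthrough of Python's Most Elegant HTTP Client Library
[Free download link] requests project page: https://gitcode.com/gh_mirrors/req/requests
Tired of wrestling with HTTP requests in Python? Between complex network calls, authentication, and session management, it is easy to feel stuck. Requests changed that: its clean, elegant API has made it the go-to tool for Python developers working with HTTP.
In this article you will learn:
- ✅ The core features and strengths of Requests
- ✅ A complete usage guide, from the basics to advanced features
- ✅ Practical code examples and best practices
- ✅ Solutions to common problems and performance tuning tips
- ✅ How Requests compares with other HTTP libraries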
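Why choose Requests?
Requests is one of the most popular HTTP client libraries in the Python ecosystem, with more than 30 million downloads per week and over one million dependent repositories. Its design philosophy is "HTTP for Humans": sending an HTTP request should be simple and intuitive.
Core strengths compared
| Feature | Requests | urllib3 | httpx | aiohttp |
|---|---|---|---|---|
| API simplicity | ⭐⭐⭐⭐⭐ | ⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |
| Synchronous support | ✅ | ✅ | ✅ | ❌ |
| Asynchronous support | ❌ | ❌ | ✅ | ✅ |
| Connection pooling | ✅ | ✅ | ✅ | ✅ |
| Automatic decoding | ✅ | ❌ | ✅ | ✅ |
| File upload | ✅ | ⚠️ | ✅ | ✅ |
| Community ecosystem | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ |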
Quick start: installation and basic usage
Installing Requests
pip install requests
Your first HTTP request
import requests
# Send a GET request
response = requests.get('https://api.github.com/events')
# Check the status code
print(f"Status code: {response.status_code}")
# Read the response body
print(f"Response body: {response.text[:100]}...")
# Parse the JSON data
data = response.json()
print(f"Number of events: {len(data)}")
Core features in depth
1. Using the HTTP methods
import requests
# GET request: fetch a resource
response = requests.get('https://httpbin.org/get')
# POST request: create a resource
data = {'key': 'value'}
response = requests.post('https://httpbin.org/post', data=data)
# PUT request: update a resource
response = requests.put('https://httpbin.org/put', data=data)
# DELETE request: delete a resource
response = requests.delete('https://httpbin.org/delete')
# HEAD request: fetch the headers only
response = requests.head('https://httpbin.org/get')
# OPTIONS request: ask which methods the server supports
response = requests.options('https://httpbin.org/get')
2. Passing parameters and URL encoding
import requests
# Query parameters are encoded automatically
params = {
    'key1': 'value1',
    'key2': 'value2',
    'key3': ['value3', 'value4']  # multi-valued parameters are supported
}
response = requests.get('https://httpbin.org/get', params=params)
print(f"Final URL: {response.url}")
# Output: https://httpbin.org/get?key1=value1&key2=value2&key3=value3&key3=value4
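To see how `params` is encoded without sending anything over the network, one option is to build a `PreparedRequest` and inspect its URL. The sketch below is illustrative only; `build_url` is a hypothetical helper name, not part of Requests.

```python
import requests


def build_url(base_url, params):
    """Return the final URL Requests would use for a GET with these params."""
    # Request(...).prepare() runs the same encoding step as requests.get(),
    # but stops before any network I/O happens.
    prepared = requests.Request("GET", base_url, params=params).prepare()
    return prepared.url


# List values become repeated keys, spaces are form-encoded as "+",
# and keys whose value is None are dropped entirely.
print(build_url("https://httpbin.org/get", {"key3": ["value3", "value4"]}))
print(build_url("https://httpbin.org/get", {"q": "hello world", "skip": None}))
```

Because nothing is sent, this is a convenient way to unit-test query-string construction.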
3. Request and response processing flow
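The flow can be sketched with the lower-level objects that `requests.get()` relies on internally: a `Request` describes what to send, the session turns it into a `PreparedRequest` (merging in session-level settings such as headers), and `Session.send()` transmits it and returns a `Response`. The helper names `build_prepared` and `send_prepared` below are assumptions made for illustration.

```python
import requests


def build_prepared(session, method, url, **kwargs):
    """Step 1 and 2: describe the request, then let the session prepare it."""
    request = requests.Request(method, url, **kwargs)
    # prepare_request() merges session headers, cookies, and auth into the request.
    return session.prepare_request(request)


def send_prepared(session, prepared, timeout=10):
    """Step 3 and 4: send the prepared request and validate the response."""
    response = session.send(prepared, timeout=timeout)
    response.raise_for_status()
    return response


session = requests.Session()
session.headers.update({"User-Agent": "my-app/1.0.0"})

prepared = build_prepared(
    session, "GET", "https://httpbin.org/get", params={"key1": "value1"}
)
# Everything below is known before a single byte goes over the wire.
print(prepared.method, prepared.url)
print(prepared.headers["User-Agent"])
```

Calling `send_prepared(session, prepared)` would perform the actual network round trip; it is left out here so the example runs offline.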
4. Handling response content
import requests
response = requests.get('https://api.github.com/events')
# Text content (decoded automatically)
print(response.text)
# Binary content
print(response.content)
# JSON content (parsed automatically)
data = response.json()
print(data)
# Response headers (case-insensitive)
print(response.headers['content-type'])
print(response.headers.get('Content-Type'))
# Check the status code
if response.status_code == 200:
    print("Request succeeded")
else:
    response.raise_for_status()  # raises HTTPError for 4xx/5xx responses only
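`raise_for_status()` raises only for 4xx and 5xx responses, so it is usually paired with an `except` clause rather than a manual status comparison. The following is a minimal sketch of that pattern; `classify` and `fake_response` are hypothetical helpers, and the responses are built by hand so the example needs no network access.

```python
import requests


def classify(response):
    """Map a response to 'ok', 'client error', or 'server error'."""
    try:
        # Raises requests.exceptions.HTTPError only for 4xx and 5xx codes.
        response.raise_for_status()
    except requests.exceptions.HTTPError as exc:
        status = exc.response.status_code
        return "client error" if status < 500 else "server error"
    return "ok"


def fake_response(status_code):
    """Hypothetical helper: build a bare Response without any network call."""
    response = requests.Response()
    response.status_code = status_code
    response.url = "https://example.invalid/"
    return response


for code in (204, 404, 503):
    print(code, classify(fake_response(code)))
```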
Advanced features in detail
1. Session management (Session)
import requests
# Create a session object
with requests.Session() as session:
    # Set headers shared by every request
    session.headers.update({'User-Agent': 'my-app/1.0.0'})
    # First request (the cookie is stored on the session)
    response1 = session.get('https://httpbin.org/cookies/set/sessioncookie/123456789')
    # Second request (the cookie is sent automatically)
    response2 = session.get('https://httpbin.org/cookies')
    print(response2.json())
    # Output: {'cookies': {'sessioncookie': '123456789'}}
2. Authentication
import requests
from requests.auth import HTTPBasicAuth, HTTPDigestAuth
# Basic authentication
response = requests.get(
    'https://httpbin.org/basic-auth/user/pass',
    auth=('user', 'pass')  # shorthand form
)
# Or use the HTTPBasicAuth class
auth = HTTPBasicAuth('user', 'pass')
response = requests.get('https://httpbin.org/basic-auth/user/pass', auth=auth)
# Digest authentication
response = requests.get(
    'https://httpbin.org/digest-auth/auth/user/pass',
    auth=HTTPDigestAuth('user', 'pass')
)
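Besides the built-in classes, the `auth` argument accepts any subclass of `requests.auth.AuthBase`. A minimal sketch of a custom handler follows; the `BearerAuth` class and the token value are assumptions made for illustration, and the requests are only prepared, never sent.

```python
import requests
from requests.auth import AuthBase


class BearerAuth(AuthBase):
    """Hypothetical handler that adds an 'Authorization: Bearer ...' header."""

    def __init__(self, token):
        self.token = token

    def __call__(self, request):
        # Requests calls this with the PreparedRequest while preparing it.
        request.headers["Authorization"] = f"Bearer {self.token}"
        return request


prepared = requests.Request(
    "GET", "https://httpbin.org/bearer", auth=BearerAuth("example-token")
).prepare()
print(prepared.headers["Authorization"])  # Bearer example-token

# The ('user', 'pass') shorthand is turned into a Basic auth header.
basic = requests.Request(
    "GET", "https://httpbin.org/basic-auth/user/pass", auth=("user", "pass")
).prepare()
print(basic.headers["Authorization"])  # Basic dXNlcjpwYXNz
```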
3. File upload and download
import requests
# File upload (the with block closes the file afterwards)
with open('report.xls', 'rb') as f:
    response = requests.post('https://httpbin.org/post', files={'file': f})
# Uploading multiple files
with open('foo.png', 'rb') as foo, open('bar.png', 'rb') as bar:
    multiple_files = [
        ('images', ('foo.png', foo, 'image/png')),
        ('images', ('bar.png', bar, 'image/png'))
    ]
    response = requests.post('https://httpbin.org/post', files=multiple_files)
# Streaming download of a large file (the with block releases the connection)
with requests.get('https://httpbin.org/stream/100', stream=True) as response:
    with open('large_file.txt', 'wb') as f:
        for chunk in response.iter_content(chunk_size=8192):
            if chunk:
                f.write(chunk)
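To check what `files=` actually produces, the request can be prepared and inspected before it is sent. This is only a sketch: `prepare_upload` is a hypothetical helper, and an in-memory `io.BytesIO` with made-up content stands in for a real file.

```python
import io

import requests


def prepare_upload(url, field, filename, fileobj, content_type):
    """Build (but do not send) a multipart/form-data POST for one file."""
    files = {field: (filename, fileobj, content_type)}
    return requests.Request("POST", url, files=files).prepare()


# An in-memory stand-in for open('report.xls', 'rb'); the content is made up.
payload = io.BytesIO(b"hello upload")
prepared = prepare_upload(
    "https://httpbin.org/post", "file", "report.txt", payload, "text/plain"
)

# Requests picks the multipart boundary and sets the Content-Type header itself.
print(prepared.headers["Content-Type"].split(";")[0])
print(b'filename="report.txt"' in prepared.body)
print(b"hello upload" in prepared.body)
```

Note that the whole file is read into memory to build the body, which is worth keeping in mind for very large uploads.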
4. Timeouts and retries
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry
# Simple timeout
try:
    response = requests.get('https://httpbin.org/delay/5', timeout=2.5)
except requests.exceptions.Timeout:
    print("Request timed out")
# Advanced retry configuration
retry_strategy = Retry(
    total=3,
    backoff_factor=0.1,
    status_forcelist=[429, 500, 502, 503, 504],
    # allowed_methods requires urllib3 >= 1.26; retry POST only if the endpoint is idempotent
    allowed_methods=["GET", "POST"]
)
adapter = HTTPAdapter(max_retries=retry_strategy)
session = requests.Session()
session.mount("https://", adapter)
session.mount("http://", adapter)
# Once the retries are exhausted this raises requests.exceptions.RetryError
response = session.get("https://httpbin.org/status/500")
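Requests applies no timeout unless one is passed, so a common pattern is to combine the retry configuration with an adapter that supplies a default. The sketch below is one way to do that under stated assumptions: `TimeoutAdapter` and `make_session` are hypothetical names, and the `(3.05, 10)` connect/read timeout is an arbitrary illustrative value.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry


class TimeoutAdapter(HTTPAdapter):
    """Hypothetical adapter that applies a default timeout when none is given."""

    def __init__(self, *args, default_timeout=(3.05, 10), **kwargs):
        # (connect timeout, read timeout) in seconds; values are illustrative.
        self.default_timeout = default_timeout
        super().__init__(*args, **kwargs)

    def send(self, request, **kwargs):
        # Session.send() passes timeout=None when the caller did not set one.
        if kwargs.get("timeout") is None:
            kwargs["timeout"] = self.default_timeout
        return super().send(request, **kwargs)


def make_session(total_retries=3, backoff_factor=0.5):
    """Return a Session with retries and a default timeout on both schemes."""
    retry = Retry(
        total=total_retries,
        backoff_factor=backoff_factor,
        status_forcelist=[429, 500, 502, 503, 504],
    )
    adapter = TimeoutAdapter(max_retries=retry)
    session = requests.Session()
    session.mount("https://", adapter)
    session.mount("http://", adapter)
    return session


session = make_session()
adapter = session.get_adapter("https://httpbin.org/status/500")
print(type(adapter).__name__, adapter.max_retries.total, adapter.default_timeout)
```

A per-request `timeout=` argument still takes precedence, because the adapter only fills in the value when it is `None`.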
Case study: building a complete API client
import requests
from typing import Dict, Any
class GitHubAPI:
    def __init__(self, token: str):
        self.base_url = "https://api.github.com"
        self.session = requests.Session()
        self.session.headers.update({
            'Authorization': f'token {token}',
            'Accept': 'application/vnd.github.v3+json',
            'User-Agent': 'My-GitHub-App/1.0.0'
        })
    def get_user_info(self, username: str) -> Dict[str, Any]:
        """Fetch a user's profile."""
        url = f"{self.base_url}/users/{username}"
        response = self.session.get(url)
        # Raise for 4xx/5xx instead of silently returning None
        response.raise_for_status()
        return response.json()
    def create_repo(self, name: str, description: str = "", private: bool = False) -> Dict[str, Any]:
        """Create a repository."""
        url = f"{self.base_url}/user/repos"
        data = {
            'name': name,
            'description': description,
            'private': private,
            'auto_init': True
        }
        response = self.session.post(url, json=data)
        response.raise_for_status()
        return response.json()
    def list_issues(self, owner: str, repo: str, state: str = 'open') -> list:
        """List issues (first page of results only)."""
        url = f"{self.base_url}/repos/{owner}/{repo}/issues"
        params = {'state': state}
        response = self.session.get(url, params=params)
        response.raise_for_status()
        return response.json()
# Usage example
if __name__ == "__main__":
    # Replace with your own GitHub token
    github = GitHubAPI("your_github_token")
    # Fetch user information
    user_info = github.get_user_info("octocat")
    print(f"User: {user_info['login']}")
    # Create a repository
    repo = github.create_repo("my-new-repo", "A test repository")
    print(f"Created repository: {repo['html_url']}")
    # List issues
    issues = github.list_issues("octocat", "hello-world")
    print(f"Found {len(issues)} issues")
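GitHub's list endpoints are paginated, so a single GET to the issues URL returns one page at a time. One way to collect every page, sketched here as an assumption rather than as part of the client above, is to follow the `next` entry that Requests exposes through `response.links`. The `iter_paginated` helper is a hypothetical name.

```python
from typing import Any, Dict, Iterator, Optional

import requests


def iter_paginated(
    session: requests.Session,
    url: Optional[str],
    params: Optional[Dict[str, Any]] = None,
) -> Iterator[Any]:
    """Yield items from every page, following the Link header's rel="next" URL."""
    while url:
        response = session.get(url, params=params, timeout=10)
        response.raise_for_status()
        yield from response.json()
        # response.links is Requests' parsed view of the Link response header.
        url = response.links.get("next", {}).get("url")
        # The "next" URL already carries its own query string.
        params = None


# Example (performs real HTTP requests, so it is left commented out):
# with requests.Session() as session:
#     url = "https://api.github.com/repos/octocat/hello-world/issues"
#     for issue in iter_paginated(session, url, {"state": "open", "per_page": 100}):
#         print(issue["title"])
```

Using a generator keeps memory use flat and lets the caller stop early without fetching the remaining pages.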
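Performance tuning and best practices
1. Connection pool management
import requests
from requests.adapters import HTTPAdapter
# Configure the connection pool
session = requests.Session()
# Create an adapter with custom pool settings
adapter = HTTPAdapter(
    pool_connections=10,  # number of per-host connection pools to cache
    pool_maxsize=100,  # maximum connections kept in each pool
    max_retries=3,  # number of retries
    pool_block=True  # block when the pool is exhausted
)
# Mount the adapter
session.mount('http://', adapter)
session.mount('https://', adapter)
# Use the configured session
for i in range(20):
    response = session.get(f'https://httpbin.org/get?request={i}')
    print(f"Request {i}: {response.status_code}")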
2. Batch request processing
import requests
from concurrent.futures import ThreadPoolExecutor, as_completed
def fetch_url(url):
    """Fetch a single URL."""
    try:
        response = requests.get(url, timeout=10)
        return url, response.status_code, response.text[:100]
    except Exception as e:
        return url, None, str(e)
# URLs to process as a batch
urls = [
    'https://httpbin.org/get',
    'https://httpbin.org/ip',
    'https://httpbin.org/user-agent',
    'https://httpbin.org/headers'
]
# Send the requests concurrently with a thread pool
with ThreadPoolExecutor(max_workers=4) as executor:
    future_to_url = {executor.submit(fetch_url, url): url for url in urls}
    for future in as_completed(future_to_url):
        url = future_to_url[future]
        try:
            result = future.result()
            print(f"URL: {result[0]}, status code: {result[1]}")
        except Exception as exc:
            print(f'{url} raised an exception: {exc}')
3. Error handling and retry strategy
import requests
import time
from typing import Callable, Sequence
def retry_request(
    request_func: Callable[[], requests.Response],
    max_retries: int = 3,
    backoff_factor: float = 0.1,
    retry_codes: Sequence[int] = (429, 500, 502, 503, 504)  # tuple avoids a mutable default
) -> requests.Response:
    """Call request_func, retrying on selected status codes and network errors."""
    for attempt in range(max_retries):
        try:
            response = request_func()
            # Retry if the status code is retryable and attempts remain
            if response.status_code in retry_codes:
                if attempt < max_retries - 1:
                    sleep_time = backoff_factor * (2 ** attempt)
                    time.sleep(sleep_time)
                    continue
            return response
        except (requests.exceptions.ConnectionError,
                requests.exceptions.Timeout):
            if attempt < max_retries - 1:
                sleep_time = backoff_factor * (2 ** attempt)
                time.sleep(sleep_time)
                continue
            else:
                raise
    raise requests.exceptions.RetryError("Maximum number of retries exceeded")
# Usage example
def make_request():
    return requests.get('https://httpbin.org/status/500')
try:
    response = retry_request(make_request, max_retries=5)
    print(f"Final status code: {response.status_code}")
except Exception as e:
    print(f"Request failed: {e}")
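The wait between attempts in `retry_request` grows as `backoff_factor * 2 ** attempt`. As a quick worked example of that arithmetic, the hypothetical `backoff_schedule` helper below lists the sleep times for a given number of attempts.

```python
def backoff_schedule(max_retries, backoff_factor=0.1):
    """Sleep times in seconds between attempts, mirroring retry_request."""
    # There is no sleep after the final attempt, hence max_retries - 1 entries.
    return [backoff_factor * (2 ** attempt) for attempt in range(max_retries - 1)]


schedule = backoff_schedule(5)
print(schedule)  # [0.1, 0.2, 0.4, 0.8]
print(f"Total wait: about {sum(schedule):.1f} s")
```

With `max_retries=5` and the default factor, the usage example above therefore spends about 1.5 seconds sleeping in the worst case, on top of the time taken by the requests themselves.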
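Common problems and solutions
1. SSL certificate verification problems
import requests
# Disable SSL verification (not recommended in production)
response = requests.get('https://expired.badssl.com/', verify=False)
# Use a custom CA certificate
response = requests.get('https://example.com', verify='/path/to/cert.pem')
# Disable verification for every request made through a session
session = requests.Session()
session.verify = False  # applies to the whole session
# Or disable it for a single request
response = session.get('https://example.com', verify=False)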
2. Proxy settings
import requests
# HTTP proxies
proxies = {
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:1080',
}
# SOCKS proxies (requires: pip install "requests[socks]"); this replaces the dict above
proxies = {
    'http': 'socks5://user:pass@host:port',
    'https': 'socks5://user:pass@host:port'
}
response = requests.get('http://example.org', proxies=proxies)
# Session-level proxy settings
session = requests.Session()
session.proxies.update(proxies)
3. Custom headers and User-Agent
import requests
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36',
    'Accept': 'application/json',
    'Accept-Language': 'en-US,en;q=0.9',
    'Referer': 'https://example.com',
}
# Headers for a single request
response = requests.get('https://httpbin.org/headers', headers=headers)
# Session-level headers
session = requests.Session()
session.headers.update(headers)
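When both levels are used, per-request headers are merged over the session headers: a request value overrides the session value, and a value of `None` removes the header for that request. A small offline sketch of this merge follows; the header values are illustrative.

```python
import requests

session = requests.Session()
session.headers.update({
    "User-Agent": "my-app/1.0.0",
    "Accept": "application/json",
    "Referer": "https://example.com",
})

# Override Accept and drop Referer for this one request only.
request = requests.Request(
    "GET",
    "https://httpbin.org/headers",
    headers={"Accept": "text/html", "Referer": None},
)
prepared = session.prepare_request(request)

print(prepared.headers["User-Agent"])  # inherited from the session
print(prepared.headers["Accept"])      # overridden by the request
print("Referer" in prepared.headers)   # removed for this request
```

The session's own headers are left untouched, so later requests still send `Referer` and `Accept: application/json`.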
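Summary and outlook
With its clean API, rich feature set, and solid performance, Requests has become the de facto standard for HTTP client development in Python. After working through this article, you should be comfortable with:
- Basic usage: from simple GET requests to more involved API interactions
- Advanced features: session management, authentication, file handling, and more
- Performance tuning: connection pooling, batch processing, and retries on error
- Practical application: building a complete API client
Requests currently supports synchronous requests only, which is enough for most use cases. For workloads that need asynchronous I/O, consider a library such as httpx or aiohttp.
Its simplicity and ease of use make Requests an excellent starting point for learning HTTP client development, and both beginners and experienced developers can benefit from it. Start using Requests now and make your HTTP code more elegant and efficient.
Suggested next steps:
- Read the official documentation in depth to learn about more advanced features
- Try building your own REST API client with Requests
- Learn how to write a custom authentication handler
- Explore using Requests together with other libraries such as BeautifulSoup and Pandas
In real projects, always follow best practices and handle error cases properly so that your code stays robust and maintainable.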
Authoring statement: parts of this article were generated with AI assistance (AIGC) and are for reference only.



