Practical Python Download Techniques: Technical Article Outline
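File download basics
Use the requests library to download a small file and save it to the local file system. response.content holds the entire body in memory, so this approach suits small files only.
Example code:
import requests

url = 'https://example.com/file.zip'
response = requests.get(url, timeout=10)
response.raise_for_status()  # fail early on 4xx/5xx instead of saving an error page
with open('file.zip', 'wb') as f:
    f.write(response.content)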
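The example above hardcodes the local name 'file.zip'. As a minimal sketch, one way to derive the name from the URL is shown below. The helper filename_from_url and its fallback value 'download.bin' are hypothetical names introduced for illustration; they are not part of requests.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def filename_from_url(url: str, default: str = 'download.bin') -> str:
    """Return the last path segment of a URL, or `default` if there is none."""
    # urlsplit drops the query string and fragment from the path
    path = unquote(urlsplit(url).path)
    name = posixpath.basename(path)
    # Reject empty names and directory references such as '.' or '..'
    if name in ('', '.', '..'):
        return default
    return name
```

This only looks at the URL. A server may suggest a different name through the Content-Disposition header, which this sketch ignores.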
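Chunked download of large files
Pass stream=True and iterate with iter_content so the response body is written to disk in chunks instead of being loaded into memory all at once.
Example code:
chunk_size = 8192
with requests.get(url, stream=True, timeout=10) as response:
    response.raise_for_status()
    with open('large_file.zip', 'wb') as f:
        for chunk in response.iter_content(chunk_size):
            f.write(chunk)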
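If a chunked download is interrupted, a truncated file is left under the final name. One possible mitigation, sketched here under the assumption that the temporary file and the destination are on the same filesystem, is to write to a temporary file and rename it only after all chunks have arrived. The function name write_chunks_atomically is hypothetical.

```python
import os
import tempfile
from typing import Iterable


def write_chunks_atomically(chunks: Iterable[bytes], dest_path: str) -> int:
    """Write chunks to a temporary file, then move it to dest_path.

    `chunks` can be any iterable of bytes, for example the result of
    response.iter_content(8192). Returns the number of bytes written.
    """
    dest_dir = os.path.dirname(os.path.abspath(dest_path))
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix='.part')
    written = 0
    try:
        with os.fdopen(fd, 'wb') as f:
            for chunk in chunks:
                if chunk:  # skip empty chunks
                    f.write(chunk)
                    written += len(chunk)
        # os.replace overwrites dest_path in a single step
        os.replace(tmp_path, dest_path)
    except BaseException:
        # Remove the partial file if anything went wrong
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    return written
```

With a streamed response this could be called as write_chunks_atomically(response.iter_content(chunk_size), 'large_file.zip').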
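Progress bar
Combine a streamed download with the tqdm library to visualize progress. If the server does not send a Content-Length header, the total is unknown and tqdm shows only a running byte count.
Example code:
from tqdm import tqdm

with requests.get(url, stream=True, timeout=10) as response:
    response.raise_for_status()
    # Content-Length may be missing; None tells tqdm the total is unknown
    total_size = int(response.headers.get('content-length', 0)) or None
    with open('file.zip', 'wb') as f, tqdm(total=total_size, unit='B', unit_scale=True) as progress_bar:
        for chunk in response.iter_content(chunk_size):
            f.write(chunk)
            progress_bar.update(len(chunk))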
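Resumable downloads
Check the size of the partially downloaded local file and request only the remaining bytes with a Range header. Append only when the server answers 206 Partial Content; a 200 response means the Range header was ignored and the whole file is being sent again.
Example code:
import os

downloaded = os.path.getsize('file.zip') if os.path.exists('file.zip') else 0
headers = {'Range': f'bytes={downloaded}-'} if downloaded else {}
with requests.get(url, headers=headers, stream=True, timeout=10) as response:
    response.raise_for_status()
    # 206 = the server honored the range, so append; otherwise start over
    mode = 'ab' if response.status_code == 206 else 'wb'
    with open('file.zip', mode) as f:
        for chunk in response.iter_content(chunk_size):
            f.write(chunk)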
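The resume decision depends on how the server answers the Range request. The following sketch separates that decision into two small helpers so it can be checked without a network connection. Both function names are hypothetical, and treating 416 as "nothing left to write" is an assumption that holds when the local file is already complete.

```python
import os
from typing import Dict, Optional


def build_range_header(path: str) -> Dict[str, str]:
    """Return a Range header asking for the bytes after the local file's end."""
    size = os.path.getsize(path) if os.path.exists(path) else 0
    return {'Range': f'bytes={size}-'} if size > 0 else {}


def choose_write_mode(status_code: int) -> Optional[str]:
    """Map the response status to an open() mode, or None to write nothing."""
    if status_code == 206:  # Partial Content: the server honored the range
        return 'ab'
    if status_code == 200:  # full body: the Range header was ignored
        return 'wb'
    if status_code == 416:  # Range Not Satisfiable: offset at or past the end
        return None
    raise ValueError(f'unexpected status code: {status_code}')
```

A caller would pass build_range_header('file.zip') as the headers argument and check choose_write_mode(response.status_code) before opening the file.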
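Multithreaded download
Use concurrent.futures to fetch byte ranges in parallel, which can speed up downloads when the server supports Range requests (that is, answers them with 206). Range end offsets are inclusive, and the parts must be written back in their original order.
Example code:
import concurrent.futures
import requests

part_size = 1024 * 1024  # 1 MiB per request; very small parts cause too many requests
total_size = int(requests.head(url, allow_redirects=True, timeout=10).headers['content-length'])

def download_chunk(start):
    end = min(start + part_size, total_size) - 1  # Range end is inclusive
    headers = {'Range': f'bytes={start}-{end}'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()
    return response.content

with concurrent.futures.ThreadPoolExecutor(max_workers=4) as executor, open('file.zip', 'wb') as f:
    # executor.map yields results in submission order, unlike as_completed
    for data in executor.map(download_chunk, range(0, total_size, part_size)):
        f.write(data)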
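Two details are easy to get wrong when splitting a file into ranges: the end offset of an HTTP Range is inclusive, and the parts have to be reassembled in order. The sketch below illustrates both with a pluggable fetch function, so the logic can be exercised without a server. The names split_ranges, fetch_all and fetch_range are hypothetical.

```python
import concurrent.futures
from typing import Callable, List, Tuple


def split_ranges(total_size: int, part_size: int) -> List[Tuple[int, int]]:
    """Split [0, total_size) into inclusive (start, end) byte ranges."""
    if part_size <= 0:
        raise ValueError('part_size must be positive')
    return [
        (start, min(start + part_size, total_size) - 1)
        for start in range(0, total_size, part_size)
    ]


def fetch_all(
    fetch_range: Callable[[int, int], bytes],
    total_size: int,
    part_size: int,
    max_workers: int = 4,
) -> bytes:
    """Fetch all ranges in parallel and join them in their original order."""
    ranges = split_ranges(total_size, part_size)
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        # executor.map preserves the order of `ranges` in its results
        parts = executor.map(lambda r: fetch_range(r[0], r[1]), ranges)
        return b''.join(parts)
```

In a real download, fetch_range would issue a request with the header Range: bytes={start}-{end} and return the response body. This version keeps every part in memory, so for very large files writing each part to its offset in the output file would be a better fit.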
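Error handling and retries
Add a timeout, a retry loop with exponential backoff, and exception handling to make downloads more robust.
Example code:
from time import sleep
import requests

max_retries = 3
for attempt in range(max_retries):
    try:
        response = requests.get(url, timeout=10)
        response.raise_for_status()  # treat HTTP error statuses as failures too
        break
    except requests.RequestException:
        if attempt == max_retries - 1:
            raise  # out of retries, re-raise the last error
        sleep(2 ** attempt)  # back off 1 s, then 2 s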
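The retry loop can also be packaged as a reusable helper. This is a minimal sketch rather than a definitive implementation: retry_with_backoff and its parameters are hypothetical names, and the default of retrying on OSError is an assumption chosen because requests.RequestException derives from it. The sleep function is injectable so the behavior can be tested without waiting.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar('T')


def retry_with_backoff(
    func: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (OSError,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func until it succeeds, doubling the delay after each failure."""
    for attempt in range(max_retries):
        try:
            return func()
        except retry_on:
            if attempt == max_retries - 1:
                raise  # out of retries, re-raise the last error
            sleep(base_delay * 2 ** attempt)
    raise ValueError('max_retries must be at least 1')
```

A download could then be wrapped as retry_with_backoff(lambda: requests.get(url, timeout=10)).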
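Proxy settings
Route requests through a proxy server to reach resources that are not directly accessible.
Example code:
# Keys are the scheme of the target URL; values are the proxy address.
# Most HTTP proxies are reached over plain http://, even when tunnelling HTTPS traffic.
proxies = {'http': 'http://proxy.example.com:8080', 'https': 'http://proxy.example.com:8080'}
response = requests.get(url, proxies=proxies, timeout=10)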
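File verification
Verify the integrity of the downloaded file by comparing its hash (MD5 or SHA256) with the published value. The example uses SHA256 and needs Python 3.8+ for the := operator.
Example code:
import hashlib

def verify_file(file_path, expected_hash):
    sha256 = hashlib.sha256()
    with open(file_path, 'rb') as f:
        # Read in 8 KiB chunks so large files are never fully loaded into memory
        while chunk := f.read(8192):
            sha256.update(chunk)
    # hexdigest() is lowercase, so normalize the expected value before comparing
    return sha256.hexdigest() == expected_hash.strip().lower()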
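The text mentions both MD5 and SHA256, while the function above is fixed to SHA256. A possible generalization, sketched here with the hypothetical names file_digest and verify_file_with, selects the algorithm by name through hashlib.new. It avoids the := operator, so it also runs on Python versions before 3.8.

```python
import hashlib


def file_digest(file_path: str, algorithm: str = 'sha256', chunk_size: int = 8192) -> str:
    """Return the hex digest of a file, read in chunks."""
    digest = hashlib.new(algorithm)  # raises ValueError for unknown names
    with open(file_path, 'rb') as f:
        # iter() with a sentinel stops when read() returns empty bytes
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file_with(file_path: str, expected_hash: str, algorithm: str = 'sha256') -> bool:
    """Compare a file's digest with an expected hex string, ignoring case."""
    return file_digest(file_path, algorithm) == expected_hash.strip().lower()
```

For example, verify_file_with('file.zip', expected, 'md5') would check an MD5 checksum. MD5 can still detect accidental corruption but is not suitable for guarding against deliberate tampering.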




