To download a file as a stream, just pass stream=True to the request in the requests library.
Let's grab a video URL and try it out:
    # -*- coding: utf-8 -*-
    import requests


    def download_file(url, path):
        # stream=True defers fetching the body until it is iterated over
        with requests.get(url, stream=True) as r:
            chunk_size = 1024
            # total size reported by the server (not used yet)
            content_size = int(r.headers['content-length'])
            print('Download started')
            with open(path, "wb") as f:
                for chunk in r.iter_content(chunk_size=chunk_size):
                    f.write(chunk)


    if __name__ == '__main__':
        url = '(in the original post...)'
        path = '(anywhere you like)'
        download_file(url, path)
And got smacked right away:
    AttributeError: __exit__
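So even the documentation lies?!
It looks like the response object doesn't implement the __exit__ method that a context manager needs. Since the only goal is to make sure r gets closed at the end so the connection goes back to the pool, the closing helper from contextlib will do: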
    # -*- coding: utf-8 -*-
    import requests
    from contextlib import closing


    def download_file(url, path):
        # closing() calls r.close() when the block exits, even on error
        with closing(requests.get(url, stream=True)) as r:
            chunk_size = 1024
            content_size = int(r.headers['content-length'])
            print('Download started')
            with open(path, "wb") as f:
                for chunk in r.iter_content(chunk_size=chunk_size):
                    f.write(chunk)
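To see why the closing wrapper helps, here is a small sketch that needs no network. FakeResponse is a hypothetical stand-in (not part of requests) for any object that has close() but no __enter__/__exit__; the helper function names are made up for illustration too.

```python
from contextlib import closing


class FakeResponse(object):
    """Hypothetical stand-in: has close(), but no __enter__/__exit__."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def use_directly(resp):
    """Try `with resp:`; return the exception type name, or None on success."""
    try:
        with resp:
            pass
    except (AttributeError, TypeError) as exc:
        # Older interpreters raise AttributeError here,
        # Python 3.11+ raises TypeError instead.
        return type(exc).__name__
    return None


def use_with_closing(resp):
    """Wrap resp in closing(); close() runs when the block exits."""
    with closing(resp) as r:
        inside = r.closed  # still open inside the block
    return inside, resp.closed


if __name__ == '__main__':
    print(use_directly(FakeResponse()))
    print(use_with_closing(FakeResponse()))
```

closing() simply hands back the wrapped object and calls its close() on the way out, including when the block raises. As far as I know, newer requests releases make Response a context manager on its own, so the first version of the script should work there without the wrapper.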
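The program runs fine now, but as I stare at the file its size doesn't seem to change. How much has actually been downloaded? Better to get each downloaded piece onto disk promptly, which should save a bit of memory too: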
    # -*- coding: utf-8 -*-
    import requests
    from contextlib import closing
    import os


    def download_file(url, path):
        with closing(requests.get(url, stream=True)) as r:
            chunk_size = 1024
            content_size = int(r.headers['content-length'])
            print('Download started')
            with