
Web Crawlers
笑笑曦
Scraping Text
import requests
from bs4 import BeautifulSoup
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
header = { 'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, ...
Original · 2018-02-25 18:09:02 · 281 views · 0 comments
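The excerpt is cut off before any parsing happens, so the post's actual extraction logic is not visible here. As a rough sketch of the general idea (pulling the visible text out of a page), the following swaps Python's standard-library `html.parser` in for BeautifulSoup so that it runs without extra installs. It assumes Python 3, and `TextExtractor`, `extract_text`, and the sample markup are hypothetical names and data, not taken from the post.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        # Keep non-empty text that is not inside a skipped element.
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML string, one fragment per line."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Hypothetical sample page, used only to exercise the sketch.
sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>Title</h1><p>First <b>paragraph</b>.</p>"
    "<script>var x = 1;</script></body></html>"
)
print(extract_text(sample))
```

In a real crawler the HTML string would come from the HTTP response body; with BeautifulSoup installed, its own text-extraction helpers would likely replace this class entirely.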
Scraping Images
import requests
from bs4 import BeautifulSoup
from multiprocessing import Pool
import datetime
import os
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
starttime=datetime.datetime.now()
all_...
Original · 2018-02-25 18:11:11 · 266 views · 0 comments
Scraping Images and Saving Them by Category
import requests
from bs4 import BeautifulSoup
import os
import sys
#http://blog.youkuaiyun.com/baidu_35085676/article/details/68958267
# this statement is needed on Android
reload(sys)
sys.setdefaultencoding('utf-8')
if(os.name == '...
Reposted · 2018-04-18 21:59:19 · 573 views · 0 comments
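The excerpt ends before the saving logic, so how the post lays out its folders is unknown. As a minimal sketch of saving downloaded images into one folder per category, assuming Python 3 and using a hypothetical helper `save_image` with made-up category and file names:

```python
import os
import tempfile


def save_image(root, category, filename, data):
    """Write image bytes to <root>/<category>/<filename> and return the path."""
    # Keep only the last path component so a crafted name cannot leave the folder.
    safe_category = os.path.basename(category.strip()) or "uncategorized"
    safe_name = os.path.basename(filename.strip()) or "unnamed"
    folder = os.path.join(root, safe_category)
    os.makedirs(folder, exist_ok=True)  # no error if the folder already exists
    path = os.path.join(folder, safe_name)
    with open(path, "wb") as fh:
        fh.write(data)
    return path


# Hypothetical usage: the bytes would normally come from an HTTP response body.
demo_root = tempfile.mkdtemp()
saved = save_image(demo_root, "landscape", "001.jpg", b"fake image bytes")
print(saved)
```

Using `os.path.join` keeps the path separators correct on both Windows and POSIX systems, which avoids hand-written branches on the operating system for path building.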
Synchronously Scraping Image Links from 天虹商城 (Tianhong Mall)
import requests
import time
from bs4 import BeautifulSoup
import re
# check whether the link opens normally
def get_url(url):
    response=requests.get(url)
    if response.status_code==200:
        print('%s' % url)
        p...
Original · 2018-10-18 19:19:57 · 174 views · 0 comments
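Sending Requests with aiohttp for Asynchronous Scraping: A First Try
import aiohttp
import asyncio
import time
from bs4 import BeautifulSoup
import re
import requests
# limit the number of concurrently running coroutines
sema=asyncio.Semaphore(100)
# check whether the link opens normally
async def get_url(url):
    # conn=aiohttp.TCPConn...
Original · 2018-11-29 09:51:28 · 741 views · 0 comments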
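The excerpt shows a semaphore being created but is cut off before it is used. As a minimal sketch of how an `asyncio.Semaphore` caps the number of requests in flight, the following replaces the real aiohttp call with a simulated fetch (`asyncio.sleep`) so that it runs offline. It assumes Python 3.7 or later; `fetch`, `crawl`, the `state` counter, and the example URLs are hypothetical and not from the post.

```python
import asyncio


async def fetch(url, sema, state):
    """Simulated request standing in for an aiohttp GET."""
    async with sema:  # at most `limit` coroutines are inside this block at once
        state["active"] += 1
        state["peak"] = max(state["peak"], state["active"])
        await asyncio.sleep(0.01)  # placeholder for network latency
        state["active"] -= 1
        return url, 200  # hypothetical (url, status) result


async def crawl(urls, limit=100):
    """Run all fetches concurrently, bounded by a semaphore of size `limit`."""
    # Create the semaphore inside the running event loop, not at import time.
    sema = asyncio.Semaphore(limit)
    state = {"active": 0, "peak": 0}
    results = await asyncio.gather(*(fetch(u, sema, state) for u in urls))
    return results, state["peak"]


urls = ["http://example.com/page/%d" % i for i in range(20)]
results, peak = asyncio.run(crawl(urls, limit=5))
print(len(results), "pages fetched, peak concurrency", peak)
```

Note that the semaphore limits coroutines running on a single thread, not threads. Creating it inside `crawl` is a cautious choice: on older Python versions a semaphore built at module level can end up tied to a different event loop than the one `asyncio.run` starts.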