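Copyright notice: this is an original article by the blogger. Please credit the source when reposting: https://blog.youkuaiyun.com/sc2079/article/details/82423865
-Preface
This post covers practical techniques for disguising a web crawler, including user agents, IP proxies, and cookies. It is aimed at beginners. If you spot any shortcomings, please point them out.
PS: This material is compiled from resources found online. If anything infringes your rights, contact me and I will remove it.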
-Environment setup
Runtime environment: Python 3.6, Spyder
Required modules: urllib, requests, random, etc.
-Main text
1. User agents
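import random

def LoadUserAgents(uafile):
    # Read every line of the file and return the entries in random order
    uas = []
    # The file is opened in binary mode, so each entry is a bytes object
    with open(uafile, 'rb') as uaf:
        for ua in uaf.readlines():
            if ua:
                # Drop the first character and the last two characters of each
                # stripped line (presumably quoting or punctuation in the file)
                uas.append(ua.strip()[1:-1 - 1])
    random.shuffle(uas)
    return uas

uas = LoadUserAgents("user_agents.txt")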
Download the user_agents.txt file: link. Once the file path is set correctly, the code above can be run as is.
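The post does not show how the loaded list is attached to a request, so here is a minimal sketch of one way to do it. The helper name `build_request`, the sample user-agent strings, and the example.com URL are all hypothetical and used only for illustration. It uses `urllib.request.Request` from the standard library and only builds the request object without sending it.

```python
import random
import urllib.request

def build_request(url, user_agents):
    """Return a urllib Request carrying a randomly chosen User-Agent header."""
    ua = random.choice(user_agents)
    # LoadUserAgents opens the file in binary mode, so entries may be bytes
    if isinstance(ua, bytes):
        ua = ua.decode("utf-8", errors="ignore")
    return urllib.request.Request(url, headers={"User-Agent": ua})

# Hypothetical sample pool; in practice this would be the `uas` list
# returned by LoadUserAgents("user_agents.txt").
sample_uas = [
    b"Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    b"Mozilla/5.0 (X11; Linux x86_64)",
]

req = build_request("https://example.com/", sample_uas)
# urllib stores header names capitalized as "User-agent"
print(req.get_header("User-agent"))
```

Calling `build_request` once per page means each request can carry a different user agent. The resulting object can then be passed to `urllib.request.urlopen` to actually fetch the page.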
2. Cookies
Cookies = ['_lxsdk_s=b458c64cc9ef92adce92c199a6b7%7C%7C4; __mta=214784936.1512623224439.1512623224439.1512623226558.2; client-id=8d050bd8-51b6-46fc-be96-cb6871d91fe7; ci=89; webloc_geo=31.778615%2C119.958931%2Cwgs84; _lxsdk=1602f5eb154c8-05c63a41825c168-1b451e24-13c680-1602f5eb154c8; uuid=73a65301-f351-4cdd-bfcd-35ae40aea13c',
'_lxsdk_s=55618265813fc43043ce562acf29%7C%7C2; ci=89; __mta=252597174.1512623399814.1512623399814.1512623399814.1; _lxsdk=1602f6161c1c8-034ae66c3842d88-1b451e24-13c680-1602f6161c170; client-id=8ca2fe5f-ba96-41ae-b919-f8d3665e2f10; webloc_geo=31.778620%2C119.958935%2Cwgs84; uuid=ae73c608-8db3-4724-b4bf-91f4a8c701c5',
'_lxsdk_s=4ffb12d5a90c2361745e6d0993a2%7C%7