Abuyun is fairly friendly to crawler beginners: after purchasing, it generates a pass certificate (username) and pass key (password) for you, and you can choose an HTTP tunnel or a SOCKS tunnel, in the Professional, Classic, or Dynamic edition. The integration docs list the connection details for the proxy pool you selected, with examples for PHP and Python.
I used two approaches: a standalone Python script, and the Scrapy framework.
import requests

# Target page to fetch
targetUrl = "http://test.abuyun.com"

# Proxy server
proxyHost = "http-dyn.abuyun.com"
proxyPort = "9020"

# Tunnel credentials: the pass certificate (user) and pass key (password)
proxyUser = "************"
proxyPass = "************"

proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": proxyHost,
    "port": proxyPort,
    "user": proxyUser,
    "pass": proxyPass,
}

proxies = {
    "http": proxyMeta,
    "https": proxyMeta,
}

resp = requests.get(targetUrl, proxies=proxies)
print(resp.status_code)
print(resp.text)
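As a quick sanity check, the proxy-URL interpolation used above can be exercised on its own, without touching the network (the credentials here are placeholders, not real Abuyun keys):

```python
# Build the proxy URL the same way as in the script above,
# using placeholder credentials
proxyMeta = "http://%(user)s:%(pass)s@%(host)s:%(port)s" % {
    "host": "http-dyn.abuyun.com",
    "port": "9020",
    "user": "USER",
    "pass": "KEY",
}
print(proxyMeta)  # http://USER:KEY@http-dyn.abuyun.com:9020
```

If the printed URL does not look like `http://user:pass@host:port`, requests will not authenticate against the tunnel correctly.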
Scrapy framework:
import base64

# Proxy server
proxyServer = "http://http-dyn.abuyun.com:9020"

# Tunnel credentials
proxyUser = "H01234567890123D"
proxyPass = "0123456789012345"

# Python 3: the credentials must be encoded to bytes before base64-encoding
proxyAuth = "Basic " + base64.urlsafe_b64encode(
    (proxyUser + ":" + proxyPass).encode("ascii")).decode("utf8")

# Python 2 equivalent:
# proxyAuth = "Basic " + base64.b64encode(proxyUser + ":" + proxyPass)

class ProxyMiddleware(object):
    def process_request(self, request, spider):
        request.meta["proxy"] = proxyServer
        request.headers["Proxy-Authorization"] = proxyAuth
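For the middleware to take effect, it also has to be registered in the project's settings.py. A minimal sketch, assuming the class lives in a module named myproject.middlewares (adjust the dotted path to your own project):

```python
# settings.py -- enable the proxy middleware; the order value 749 runs it
# just before Scrapy's built-in HttpProxyMiddleware (which sits at 750)
DOWNLOADER_MIDDLEWARES = {
    "myproject.middlewares.ProxyMiddleware": 749,
}
```

Without this entry, Scrapy never calls process_request on the middleware and all requests go out directly, bypassing the proxy.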
-- The above is based on the Abuyun proxy service, with no commercial interest on my part; I find it a fairly convenient and easy-to-use proxy. Hope it helps, thanks. (QQ: 858703032)