requests
Listing the methods requests provides
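One way to see what the library exposes is to inspect the module with `dir()`. The sketch below is only an illustration; the variable names `public_names` and `http_helpers` are made up for this example.

```python
import requests

# Public names exported by the requests package (private and dunder names skipped).
public_names = [name for name in dir(requests) if not name.startswith('_')]
print(public_names)

# The HTTP-verb helpers are plain module-level functions.
verbs = ('get', 'post', 'put', 'delete', 'head', 'options', 'patch')
http_helpers = [name for name in verbs if callable(getattr(requests, name, None))]
print(http_helpers)
```

Running `help(requests.get)` in an interactive session shows the signature and docstring of a single helper.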
Passing parameters
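Query-string parameters are passed as a dict through the `params` argument. The sketch below prepares a request without sending it, so the encoded URL can be inspected offline; the endpoint `https://example.com/search` and the parameter names are assumptions chosen for illustration.

```python
import requests

# Hypothetical endpoint and parameters, used only for illustration.
url = 'https://example.com/search'
params = {'q': 'python', 'page': '2'}

# Preparing the request builds the final URL without touching the network.
prepared = requests.Request('GET', url, params=params).prepare()
print(prepared.url)  # https://example.com/search?q=python&page=2
```

Sending the same request for real would be `requests.get(url, params=params, timeout=10)`.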
Binary content
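For images, archives and other non-text responses, `r.content` gives the body as bytes, while `r.text` would try to decode it to a string. A minimal sketch, assuming a hypothetical helper named `save_binary`:

```python
import requests

def save_binary(url, path, timeout=10):
    """Download url and write the response bytes to path; return the byte count."""
    r = requests.get(url, timeout=timeout)
    r.raise_for_status()        # raise on 4xx/5xx instead of saving an error page
    with open(path, 'wb') as f:  # binary mode, because r.content is bytes
        f.write(r.content)
    return len(r.content)
```

A call such as `save_binary('https://example.com/logo.png', 'logo.png')` would write the image to disk; the URL here is a placeholder.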
JSON handling
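When the server returns JSON, `r.json()` decodes the body into Python objects and raises a `ValueError` subclass if the body is not valid JSON. The helper name `fetch_json` below is an assumption for this sketch, as is the choice to return `None` on a decoding failure.

```python
import requests

def fetch_json(url, timeout=10):
    """Return the decoded JSON body of url, or None if the body is not valid JSON."""
    r = requests.get(url, timeout=timeout)
    r.raise_for_status()
    try:
        return r.json()
    except ValueError:
        # The decoding errors raised by r.json() derive from ValueError.
        return None
```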
Raw response data
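To read the response stream directly, the request has to be made with `stream=True`, after which `r.raw` exposes the underlying file-like object. A minimal sketch, with `read_raw_prefix` as a hypothetical helper name:

```python
import requests

def read_raw_prefix(url, size=10, timeout=10):
    """Return the first `size` bytes read from the raw response stream."""
    # stream=True defers downloading the body so that r.raw can be read directly.
    with requests.get(url, stream=True, timeout=timeout) as r:
        r.raise_for_status()
        return r.raw.read(size)
```

For saving a large download to a file, iterating over `r.iter_content(chunk_size=...)` is usually the more convenient choice than reading `r.raw` by hand.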
Submitting forms
Sending data as a form
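Passing a dict through the `data` argument makes requests form-encode it. The sketch below prepares the request without sending it so the encoding can be checked offline; the URL and field names are assumptions for illustration.

```python
import requests

# Hypothetical endpoint and form fields.
url = 'https://example.com/login'
form = {'username': 'alice', 'password': 'secret'}

# data=<dict> produces an application/x-www-form-urlencoded body.
prepared = requests.Request('POST', url, data=form).prepare()
print(prepared.headers['Content-Type'])  # application/x-www-form-urlencoded
print(prepared.body)                     # username=alice&password=secret
```

The real call would be `requests.post(url, data=form, timeout=10)`.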
Sending data as JSON
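Passing the payload through the `json` argument instead of `data` makes requests serialize it and set the JSON content type. As a sketch under the same assumptions (placeholder URL, made-up payload), prepared offline:

```python
import json

import requests

# Hypothetical endpoint and payload.
url = 'https://example.com/api/users'
payload = {'username': 'alice', 'roles': ['admin', 'dev']}

# json=<object> serializes the payload and sets Content-Type for us.
prepared = requests.Request('POST', url, json=payload).prepare()
print(prepared.headers['Content-Type'])      # application/json
print(json.loads(prepared.body) == payload)  # True
```

The real call would be `requests.post(url, json=payload, timeout=10)`.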
Cookies
Getting cookies
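Cookies set by a response are available on `r.cookies`, a dict-like cookie jar. A minimal sketch, with `get_cookies` as a hypothetical helper name:

```python
import requests

def get_cookies(url, timeout=10):
    """Return the cookies set by the response for url as a plain dict."""
    r = requests.get(url, timeout=timeout)
    # r.cookies is a RequestsCookieJar; get_dict() flattens it to {name: value}.
    return r.cookies.get_dict()
```

When cookies need to persist across several requests, a `requests.Session()` collects them in `session.cookies` automatically.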
Sending cookies
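Cookies are sent by passing a `{name: value}` dict through the `cookies` argument. The sketch below prepares the request offline to show the resulting `Cookie` header; the URL and the cookie name `session_id` are assumptions for illustration.

```python
import requests

# Hypothetical endpoint and cookie.
url = 'https://example.com/profile'
cookies = {'session_id': 'abc123'}

# Each dict entry becomes one name=value pair in the Cookie header.
prepared = requests.Request('GET', url, cookies=cookies).prepare()
print(prepared.headers['Cookie'])  # session_id=abc123
```

The real call would be `requests.get(url, cookies=cookies, timeout=10)`.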
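Logging in with a cookie
import requests

# The cookie string copied from the browser goes into the Cookie header as-is.
# Passing cookies={'cookie': ...} would instead send one cookie literally named "cookie".
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/62.0.3202.62 Safari/537.36',
    'Cookie': 'paste the cookie string copied from the browser here',
}
url = 'https://www.baidu.com/'
r = requests.get(url, headers=headers)
# Save the response body as bytes.
with open('baidu.txt', 'wb') as f:
    f.write(r.content)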
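Logging in with a username and password
import re

import requests
from bs4 import BeautifulSoup  # the 'html5lib' parser used below must be installed

# A Session keeps cookies between requests, so the login state persists.
s = requests.Session()
url_login = 'http://accounts.douban.com/login'
formdata = {
    'redir': 'https://www.douban.com',
    'form_email': 'your registered email',
    'form_password': 'your password',
    'login': '登陆',  # sent as-is; Chinese for "log in"
}
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/44.0.2403.157 Safari/537.36'}

# First attempt: the response may contain a captcha challenge.
r = s.post(url_login, data=formdata, headers=headers)
content = r.text
soup = BeautifulSoup(content, 'html5lib')
captcha = soup.find('img', id='captcha_image')
print(soup)  # debug: dump the parsed login page
if captcha:
    captcha_url = captcha['src']
    # Pull the hidden captcha-id field out of the page as a single string.
    match = re.search(r'<input type="hidden" name="captcha-id" value="(.*?)"/', content)
    captcha_id = match.group(1) if match else ''
    print(captcha_id)
    print(captcha_url)
    # Open captcha_url in a browser and type in what the image shows.
    captcha_text = input('Please input the captcha:')
    formdata['captcha-solution'] = captcha_text
    formdata['captcha-id'] = captcha_id
    # Second attempt, this time with the captcha answer.
    r = s.post(url_login, data=formdata, headers=headers)
# Save the page returned after login.
with open('contacts.txt', 'w', encoding='utf-8') as f:
    f.write(r.text)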