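Differences in the urllib module between Python 2 and Python 3: in Python 3.x, the urllib and urllib2 libraries were merged into a single urllib package, and a few function calls changed slightly along the way. The changes are recorded below:
1. Fetching a web page with the urllib and urllib2 modules in Python 2
Taking a 优快云 login page as an example: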
import urllib
import urllib2
values = {"username":"XXXXXX", "password":"YYYYYY"}
data = urllib.urlencode(values)
url = "https://passport.youkuaiyun.com/account/login?from=http://my.youkuaiyun.com/my/mycsdn"
request = urllib2.Request(url, data)
response = urllib2.urlopen(request)
print response.read()
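2. Fetching a web page with the urllib module in Python 3
In Python 3.x, the urllib and urllib2 libraries were merged into the urllib package. Specifically:
urllib.urlencode() became urllib.parse.urlencode(); the trailing .encode() is needed because POST data must be bytes, not str
urllib2.urlopen() became urllib.request.urlopen()
urllib2.Request() became urllib.request.Request()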
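As a small illustration of the first rename, the sketch below shows what urllib.parse.urlencode() returns in Python 3 and what the extra .encode() call changes. It sends nothing over the network. The explicit "utf-8" argument is an addition for clarity here (it is also the default of str.encode()), and the variable name query is only illustrative.

```python
import urllib.parse

values = {"username": "XXXXXX", "password": "YYYYYY"}

# urlencode() returns a text string (str) in Python 3.
query = urllib.parse.urlencode(values)

# The request body must be bytes, so the string is encoded before use.
data = query.encode("utf-8")

print(type(query).__name__, query)
print(type(data).__name__, data)
```

Passing the un-encoded str as the data argument of urllib.request.urlopen() would raise a TypeError, which is why the .encode() step appears in the Python 3 version.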
import urllib.request
import urllib.parse
values = {"username":"XXXXXX", "password":"YYYYYY"}
data = urllib.parse.urlencode(values).encode()  # POST data must be bytes in Python 3
url = "https://passport.youkuaiyun.com/account/login?from=http://my.youkuaiyun.com/my/mycsdn"
request = urllib.request.Request(url, data)
response = urllib.request.urlopen(request)
print(response.read())
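If the same script has to run under both Python 2 and Python 3, one possible approach is to try the Python 3 imports first and fall back to the Python 2 names. This is a minimal sketch and not part of the original notes: the helper build_post_request and the example.com URL are hypothetical, and the request is only constructed, never sent.

```python
try:
    # Python 3: everything lives under the urllib package.
    from urllib.parse import urlencode
    from urllib.request import Request, urlopen
except ImportError:
    # Python 2: the same names are split across urllib and urllib2.
    from urllib import urlencode
    from urllib2 import Request, urlopen


def build_post_request(url, fields):
    """Build a POST request object (hypothetical helper, sends nothing)."""
    # Encoding to bytes is required on Python 3 and harmless on Python 2.
    data = urlencode(fields).encode("utf-8")
    return Request(url, data)


# Placeholder URL, assumed for illustration only.
request = build_post_request(
    "https://example.com/login",
    {"username": "XXXXXX", "password": "YYYYYY"},
)

# A request that carries data is sent as POST.
print(request.get_method())

# To actually send it: response = urlopen(request); print(response.read())
```

With this shim, the rest of the script can use urlencode, Request, and urlopen without caring which interpreter is running.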