-
urllib
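Installation: none needed. urllib is part of the Python 3 standard library, so pip install urllib is unnecessary (and fails).
Check that it works:
from urllib import request
f = request.urlopen('http://www.baidu.com')
f.readline()
-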
re
-
request
-
requests
-
selenium
-
chromedriver
pip install selenium
Download chromedriver: http://npm.taobao.org/mirrors/chromedriver/
After downloading, add the chromedriver path to the PATH environment variable.
from selenium import webdriver
driver = webdriver.Chrome()  # Chrome must be installed on the system; this launches Chrome from Python
driver.get('http://www.baidu.com')  # open Baidu in Chrome
driver.page_source  # the HTML of the current page
driver.close()  # close the browser
-
phantomjs
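pip install phantomjs
PhantomJS itself is a standalone headless-browser binary, so its executable also has to be on PATH. The project is no longer maintained, and Selenium 4 has removed support for it.
-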
lxml
pip install lxml
-
beautifulsoup
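pip install bs4  (this pulls in the beautifulsoup4 package)
from bs4 import BeautifulSoup
soup = BeautifulSoup(html, 'lxml')  # html is the page source as a string; the 'lxml' parser requires lxml to be installed
-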
pyquery
pip install pyquery
import pyquery
-
pymysql
-
pymongo
pip install pymongo
MongoDB download page: https://www.mongodb.com/download-center/community
Once the .msi file has downloaded (Windows), double-click it to install.
-
redis
For downloading and configuring Redis, see: http://www.runoob.com/redis/redis-install.html
Redis Desktop Manager download: https://github.com/uglide/RedisDesktopManager/releases/tag/0.8.8
-
flask
-
django
-
jupyter
-
redis+flask: maintaining a dynamic proxy pool. The full code is on GitHub: https://github.com/Germey/ProxyPool
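The import test shown for urllib can be applied to every package in the list at once. The following is only a minimal sketch: the names PACKAGES and check_installed are made up for illustration, and the import names for requests, pymysql, flask and django are assumed to be the usual ones, since these notes give no install command for them. It uses importlib.util.find_spec from the standard library, which looks a module up without actually importing it.

```python
import importlib.util

# Import names (not pip names) of packages mentioned in the list.
# Assumption: requests, pymysql, flask and django are installed under
# their usual names; the notes above give no pip command for them.
PACKAGES = [
    "urllib", "re",                      # standard library
    "requests", "selenium", "lxml",
    "bs4",                               # installed via "pip install bs4"
    "pyquery", "pymysql", "pymongo",
    "flask", "django",
]


def check_installed(names):
    """Return a dict mapping each import name to True if Python can find it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Print one line per package so missing ones are easy to spot.
    for name, found in check_installed(PACKAGES).items():
        print(f"{name:10s} {'OK' if found else 'missing'}")
```

Anything reported as missing can then be installed with the matching pip command. Note that chromedriver, MongoDB and Redis are separate programs rather than Python modules, so a check like this does not cover them.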
That is everything I have collected so far; I will add more as new packages come up.
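Since urllib and re both come with Python, the two can be tried out together before installing anything else. The sketch below is one possible way to do it, not a fixed recipe: fetch_title is a hypothetical helper name, the 10-second timeout is an arbitrary choice, and the page is assumed to be UTF-8 encoded.

```python
import re
from urllib import request


def fetch_title(url, timeout=10):
    """Download a page with urllib and extract its <title> text with re.

    Returns None when the page has no <title> element.
    """
    # The with-block closes the connection once the body has been read.
    with request.urlopen(url, timeout=timeout) as f:
        # Assumption: the page is UTF-8; undecodable bytes are replaced.
        html = f.read().decode("utf-8", errors="replace")
    match = re.search(r"<title[^>]*>(.*?)</title>", html,
                      re.IGNORECASE | re.DOTALL)
    return match.group(1).strip() if match else None


# Example call (needs network access):
# print(fetch_title("http://www.baidu.com"))
```

A regular expression is fine for a single tag like this; for anything more structured, the parsers in the list (lxml, beautifulsoup, pyquery) are the better tool.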
Installing crawler-related packages in Python