
Web Crawlers
魏小魏
Crawling Douban movies of each genre with the requests and BeautifulSoup libraries
Code:
# -*- coding: utf-8 -*-
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup
from selenium import webdriver
import re
import requests
import time
import json
import rand...
Original post · 2018-04-30 21:21:26 · 968 reads · 0 comments
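
A rough sketch of the requests + BeautifulSoup pattern the preview hints at (the original post also pulls in selenium for JavaScript-rendered pages): fetch one genre listing page and collect the movie titles on it. The listing URL and the 'a.title' selector are placeholders and would have to be checked against Douban's actual markup.

import requests
from bs4 import BeautifulSoup

# A browser-like User-Agent helps avoid the most basic blocking.
headers = {'User-Agent': 'Mozilla/5.0'}

def fetch_titles(url):
    # Download one listing page and return the movie titles found on it.
    r = requests.get(url, headers=headers, timeout=10)
    r.raise_for_status()
    soup = BeautifulSoup(r.text, 'html.parser')
    # 'a.title' is a placeholder selector; inspect the live page for the real one.
    return [a.get_text(strip=True) for a in soup.select('a.title')]

if __name__ == '__main__':
    # Placeholder genre-listing URL; substitute the genre you want to crawl.
    print(fetch_titles('https://movie.douban.com/tag/剧情'))
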
Crawling job postings from 拉钩网 (Lagou) and 招聘网 by search keyword
Code:
import requests
import time
import random
ip_list = ['117.135.132.107', '121.8.98.196', '194.116.198.212']
# HTTP request headers
headers = {'Accept': 'application/json, text/javascript, */*; q=0.01', 'Accept...
Original post · 2018-04-30 21:24:16 · 681 reads · 0 comments
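
A minimal sketch of the proxy-rotation idea visible in the preview: pick one address from ip_list at random for each request and send it along with the prepared headers. The search URL is hypothetical, real proxies normally need a port, and the listed IPs are unlikely to still be alive.

import random
import requests

# Proxy addresses copied from the post preview; real proxies usually need 'host:port'.
ip_list = ['117.135.132.107', '121.8.98.196', '194.116.198.212']

headers = {
    'Accept': 'application/json, text/javascript, */*; q=0.01',
    'User-Agent': 'Mozilla/5.0',
}

def get_with_random_proxy(url, params=None):
    # Use a different proxy for each request to spread the load across ip_list.
    proxy_ip = random.choice(ip_list)
    proxies = {'http': 'http://' + proxy_ip, 'https': 'http://' + proxy_ip}
    return requests.get(url, params=params, headers=headers,
                        proxies=proxies, timeout=10)

# Hypothetical search URL; the real job-site endpoints expect their own
# query parameters and often cookies as well.
# resp = get_with_random_proxy('https://www.lagou.com/jobs/list_python')
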
Crawling job postings from 招聘网 by search keyword
Code:
import requests
from bs4 import BeautifulSoup
import time
def getHtml(url, code='gbk'):
    try:
        r = requests.get(url)
        r.raise_for_status()
        r.encoding = code
        return...
Original post · 2018-04-30 21:26:03 · 438 reads · 0 comments
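
The preview cuts off inside getHtml; a plausible completion of that helper, plus a hypothetical parse_jobs function, is sketched below. The '.job-title' selector is an assumption, not the site's real class name, and gbk stays as the default encoding because the preview uses it.

import requests
from bs4 import BeautifulSoup

def getHtml(url, code='gbk'):
    # Fetch a page and decode it with the given encoding; return '' on failure.
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
        r.encoding = code
        return r.text
    except requests.RequestException:
        return ''

def parse_jobs(html):
    # Placeholder selector: the real class name depends on the site's markup.
    soup = BeautifulSoup(html, 'html.parser')
    return [tag.get_text(strip=True) for tag in soup.select('.job-title')]
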
Targeted crawling of stock data from Sina Stocks and Baidu Stocks
Code:
import re
import requests
from bs4 import BeautifulSoup
def getHTMLtext(url, code="utf-8"):
    try:
        r = requests.get(url)
        r.raise_for_status()
        r.encoding = code
        ...
Original post · 2018-04-30 21:30:07 · 2023 reads · 0 comments
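
A sketch of the usual "targeted crawl" recipe this post appears to follow: download a stock listing page with getHTMLtext, pull sh/sz stock codes out of the link hrefs with a regular expression, and visit each code's detail page afterwards. The listing URL and page structure are assumptions.

import re
import requests
from bs4 import BeautifulSoup

def getHTMLtext(url, code='utf-8'):
    # Download a page with the expected encoding; return '' on any failure.
    try:
        r = requests.get(url, timeout=10)
        r.raise_for_status()
        r.encoding = code
        return r.text
    except requests.RequestException:
        return ''

def extract_stock_codes(list_url):
    # Collect codes such as sh600000 / sz000001 from the hrefs on a listing page.
    html = getHTMLtext(list_url)
    soup = BeautifulSoup(html, 'html.parser')
    codes = []
    for a in soup.find_all('a'):
        match = re.search(r'[sS][hHzZ]\d{6}', a.get('href', ''))
        if match:
            codes.append(match.group(0))
    return codes
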
Crawling Tieba user comments, usernames, and related information with Python
Code:
# coding: utf-8
# import the requests package
import requests
import urllib
# given a url, return its page source
def get_datasource(url):
    try:
        response = requests.get(url)
        if response.status_code == 200:
            ...
Original post · 2018-04-30 21:32:01 · 3224 reads · 0 comments
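
The preview's get_datasource function can be completed roughly as below, together with a hypothetical comment extractor; the regex is only illustrative, since Tieba's markup changes over time and the class names would need to be re-checked against the live page.

# coding: utf-8
import re
import requests

def get_datasource(url):
    # Return the page source for a url, or None if the request fails.
    try:
        response = requests.get(url, timeout=10)
        if response.status_code == 200:
            return response.text
        return None
    except requests.RequestException:
        return None

def get_comments(html):
    # Illustrative pattern for post bodies; verify the class name before relying on it.
    pattern = re.compile(r'class="d_post_content[^"]*"[^>]*>(.*?)</div>', re.S)
    return [c.strip() for c in pattern.findall(html or '')]
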
Crawling images from the official Flickr site with Python
Code:
import requests
import urllib.request
from bs4 import BeautifulSoup
from selenium import webdriver
import random
from selenium.webdriver.chrome.options import Options
import re
# HTTP request headers
headers = ...
Original post · 2018-04-30 21:33:19 · 7272 reads · 1 comment
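
Flickr builds its photo wall with JavaScript, which is presumably why the post drives Chrome through selenium before parsing; a minimal sketch of that flow is below. It assumes chromedriver is installed and on the PATH, and the image-URL handling is simplified.

import urllib.request
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.options import Options

def fetch_image_urls(page_url):
    # Render the page in headless Chrome, then parse the rendered HTML for <img> tags.
    opts = Options()
    opts.add_argument('--headless')
    driver = webdriver.Chrome(options=opts)
    try:
        driver.get(page_url)
        soup = BeautifulSoup(driver.page_source, 'html.parser')
        return [img['src'] for img in soup.find_all('img') if img.get('src')]
    finally:
        driver.quit()

def download(urls, prefix='flickr_'):
    # Save each image to the working directory with a numbered filename.
    for i, src in enumerate(urls):
        if src.startswith('//'):
            src = 'https:' + src
        urllib.request.urlretrieve(src, '%s%d.jpg' % (prefix, i))
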