
scrapy
Posts by qq123aa2006
Crawling a novel site with Scrapy and saving the results to a database
spider.py (excerpt shown in the post preview):

# -*- coding: utf-8 -*-
import scrapy
import uuid
from datetime import datetime
from novel.items import NovelItem, ChapterItem

class A17kSpider(scrapy.Spider):
    name = '17k'
    allowed_...  # preview truncated here in the original listing

Original post: 2019-04-01
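The listing only shows the first few lines of the spider, so here is a minimal sketch of a spider that yields NovelItem and ChapterItem objects for a novel site. The domain, start URL, CSS selectors, and item fields are illustrative assumptions, not the author's actual code from the post.

import uuid
from datetime import datetime

import scrapy

from novel.items import NovelItem, ChapterItem


class A17kSpider(scrapy.Spider):
    name = '17k'
    allowed_domains = ['www.17k.com']         # assumed domain
    start_urls = ['https://www.17k.com/all']  # assumed listing page

    def parse(self, response):
        # Follow every novel link found on the listing page (assumed selector).
        for href in response.css('div.booklist a::attr(href)').getall():
            yield response.follow(href, callback=self.parse_novel)

    def parse_novel(self, response):
        # One NovelItem per book, plus a request for each chapter.
        item = NovelItem()
        item['id'] = str(uuid.uuid4())
        item['name'] = response.css('h1::text').get()   # assumed selector
        item['crawled_at'] = datetime.now()
        yield item
        for href in response.css('dl.chapter a::attr(href)').getall():  # assumed selector
            yield response.follow(href, callback=self.parse_chapter,
                                  cb_kwargs={'novel_id': item['id']})

    def parse_chapter(self, response, novel_id):
        chapter = ChapterItem()
        chapter['id'] = str(uuid.uuid4())
        chapter['novel_id'] = novel_id
        chapter['title'] = response.css('h1::text').get()  # assumed selector
        chapter['content'] = ''.join(
            response.css('div.content p::text').getall())  # assumed selector
        yield chapter

Saving to the database, mentioned in the title, would normally be handled by an item pipeline that writes NovelItem and ChapterItem rows in process_item; that part of the post is not shown in the preview either.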
Scrapy: adding random headers and a proxy in the downloader middleware
Excerpt shown in the post preview:

from fake_useragent import UserAgent

class RandomUserAgentMiddlerware(object):
    def __init__(self, crawler):
        super(RandomUserAgentMiddlerware, self).__init__()
        self.ua = UserAgent()
        ...  # preview truncated here in the original listing

Original post: 2019-05-05
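The preview stops inside __init__, so below is a sketch of how such a downloader middleware is commonly completed: a random User-Agent from fake_useragent on every request, plus an optional proxy. The class name keeps the spelling used in the excerpt; the RANDOM_UA_TYPE and PROXY_URL setting names are assumptions for illustration.

from fake_useragent import UserAgent


class RandomUserAgentMiddlerware(object):
    # Class name kept as spelled in the post's excerpt.

    def __init__(self, crawler):
        super(RandomUserAgentMiddlerware, self).__init__()
        self.ua = UserAgent()
        # Which UA family to pick ('random', 'chrome', ...); assumed setting name.
        self.ua_type = crawler.settings.get('RANDOM_UA_TYPE', 'random')
        # Optional static proxy URL; assumed setting name.
        self.proxy_url = crawler.settings.get('PROXY_URL')

    @classmethod
    def from_crawler(cls, crawler):
        # Scrapy calls this to build the middleware with access to settings.
        return cls(crawler)

    def process_request(self, request, spider):
        # Attach a random User-Agent header to every outgoing request.
        request.headers['User-Agent'] = getattr(self.ua, self.ua_type)
        # Route the request through the proxy if one is configured.
        if self.proxy_url:
            request.meta['proxy'] = self.proxy_url

To take effect it has to be enabled in settings.py, typically while disabling Scrapy's built-in UserAgentMiddleware (the project path here is a placeholder):

DOWNLOADER_MIDDLEWARES = {
    'myproject.middlewares.RandomUserAgentMiddlerware': 543,
    'scrapy.downloadermiddlewares.useragent.UserAgentMiddleware': None,
}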
Building your own proxy pool
Excerpt shown in the post preview:

MAX_SCORE = 100
MIN_SCORE = 0
INITIAL_SCORE = 10
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379
REDIS_PASSWORD = None
REDIS_KEY = "proxies"

import redis
from random import choice
import time
import ...  # preview truncated here in the original listing

Original post: 2019-05-05
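Only the configuration constants and the first imports are visible above. A common design that matches these constants keeps proxies in a Redis sorted set under REDIS_KEY, using the score as a reliability measure: new proxies start at INITIAL_SCORE, proxies that work are promoted to MAX_SCORE, failures are decremented, and anything at or below MIN_SCORE is dropped. The class and method names below are illustrative assumptions (redis-py >= 3.0 is assumed for the zadd/zincrby signatures), not necessarily the author's exact implementation.

import redis
from random import choice

MAX_SCORE = 100
MIN_SCORE = 0
INITIAL_SCORE = 10
REDIS_HOST = "127.0.0.1"
REDIS_PORT = 6379
REDIS_PASSWORD = None
REDIS_KEY = "proxies"


class RedisClient(object):
    """Proxy pool backed by a Redis sorted set; the score tracks reliability."""

    def __init__(self, host=REDIS_HOST, port=REDIS_PORT, password=REDIS_PASSWORD):
        self.db = redis.StrictRedis(host=host, port=port, password=password,
                                    decode_responses=True)

    def add(self, proxy, score=INITIAL_SCORE):
        # Add a new proxy with the initial score, skipping known ones.
        if self.db.zscore(REDIS_KEY, proxy) is None:
            return self.db.zadd(REDIS_KEY, {proxy: score})

    def random(self):
        # Prefer proxies that already proved reliable (score == MAX_SCORE),
        # otherwise fall back to any proxy currently in the pool.
        best = self.db.zrangebyscore(REDIS_KEY, MAX_SCORE, MAX_SCORE)
        if best:
            return choice(best)
        rest = self.db.zrangebyscore(REDIS_KEY, MIN_SCORE, MAX_SCORE)
        if rest:
            return choice(rest)
        raise RuntimeError('proxy pool is empty')

    def max(self, proxy):
        # A proxy that just worked is promoted straight to MAX_SCORE.
        return self.db.zadd(REDIS_KEY, {proxy: MAX_SCORE})

    def decrease(self, proxy):
        # A failed proxy loses one point; remove it once it reaches MIN_SCORE.
        score = self.db.zincrby(REDIS_KEY, -1, proxy)
        if score <= MIN_SCORE:
            self.db.zrem(REDIS_KEY, proxy)
        return score

    def count(self):
        return self.db.zcard(REDIS_KEY)

In this kind of setup a separate fetcher collects free proxies and calls add(), a tester periodically checks each proxy and calls max() or decrease(), and the proxy middleware from the previous post can set request.meta['proxy'] from random().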