1. Create the Scrapy project
scrapy startproject maoyanspider
2. Write items.py
# -*- coding: utf-8 -*-
# Define here the models for your scraped items
#
# See documentation in:
# https://docs.scrapy.org/en/latest/topics/items.html
import scrapy


class MaoyanspiderItem(scrapy.Item):
    # define the fields for your item here like:
    movie_name = scrapy.Field()
    actor = scrapy.Field()
    time = scrapy.Field()
    score = scrapy.Field()
    img = scrapy.Field()
3. Configure settings.py
In settings.py, uncomment DEFAULT_REQUEST_HEADERS and add a User-Agent entry:
'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36',
Then uncomment ITEM_PIPELINES as well.
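After both changes, the relevant part of settings.py should look roughly like the sketch below. The `Accept` and `Accept-Language` entries and the pipeline path `maoyanspider.pipelines.MaoyanspiderPipeline` are assumed from what `scrapy startproject` usually generates for a project with this name; check them against your own file.

```python
# settings.py (excerpt), a sketch of the two uncommented blocks

# Headers sent with every request; the User-Agent makes the
# crawler look like an ordinary desktop browser.
DEFAULT_REQUEST_HEADERS = {
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.100 Safari/537.36',
}

# Enables the project's item pipeline. The number (0-1000) sets
# the order when several pipelines are enabled; lower runs first.
ITEM_PIPELINES = {
    'maoyanspider.pipelines.MaoyanspiderPipeline': 300,  # assumed default path
}
```

Without the ITEM_PIPELINES entry, items yielded by the spider never reach the code in pipelines.py.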
4. Create the spider file
scrapy genspider Myspider 'maoyan.com'
创建成