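Preface:
Ever since I started learning Python, it has felt like there is almost nothing it cannot do! Tang poetry is part of China's ancient heritage, but fewer and fewer people read it these days. So today we will use Python to build a classical poem generator and help keep this tradition alive.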
Preparation:
- A Python 3.6 environment
- Install Google's word2vec module
- PyCharm is recommended as the IDE
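Before moving on, it can help to confirm that the environment matches the list above. The following is only a minimal sketch: the helper name check_environment is hypothetical, and the import names "scrapy" and "word2vec" are assumptions about how the packages are exposed on your machine, so adjust them to whatever you actually installed.

```python
import sys
from importlib.util import find_spec


def check_environment(modules, min_version=(3, 6)):
    """Return a dict mapping each requirement to True (met) or False (missing)."""
    # Compare the running interpreter against the minimum version.
    report = {"python>=%d.%d" % min_version: sys.version_info >= min_version}
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = find_spec(name) is not None
    return report


if __name__ == "__main__":
    # "scrapy" and "word2vec" are assumed import names, not confirmed by the article.
    for requirement, ok in check_environment(["scrapy", "word2vec"]).items():
        print(requirement, "OK" if ok else "MISSING")
```

Anything reported as MISSING needs to be installed before the spider will run.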
Steps:
Use scrapy to crawl the classical poetry site so.gushiwen.org, collecting 5,000 Tang poems in total.
The code is as follows:
# -*- coding: utf-8 -*-
import scrapy
from quotesbot.items import ShiciItem
import re
class TextSpider(scrapy.Spider):
    name = 'XYX'
    start_urls = ['https://so.gushiwen.org/search.aspx?page=1',
                  'https://so.gushiwen.org/search.aspx?page=2',
                  'https://so.gushiwen.org/search.aspx?page=3',
                  'https://so.gushiwen.org/search.aspx?page=4',
                  'https://so.gushiwen.org/search.aspx?page=5',
                  'https://so.gushiwen.org/search.aspx?page=6',