AI novel promo copy (scraping the books on a novel ranking list + generating promo copy through an AI API) (Python crawler + LLM API calls)

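If you want to produce AI-generated novel promo copy, this program removes the tedious step of writing copy from a novel's content by hand: it batch-generates promo copy for the books on the official ranking list.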

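When learning Python, a crawler makes a good hobby project or first practical program: it is easy to get started with and gives quick feedback. This post shows how to scrape and save the content of the books on a novel site's hot ranking list.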


1. Scraping the hot novels on the ranking list

First, install the following libraries (os is part of the Python standard library and needs no installation):

pip install requests
pip install parsel
pip install openai

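Next, find the ranking list you want to scrape.

Copy its URL from the novel ranking page of the official Fanqie Novel (番茄小说) site.

Then work through the page source step by step and extract the key information.

Here we use the CSS selectors of the parsel library to locate the information we want. Anyone who has learned CSS knows that an element can be identified by its id, class name, and so on.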

url = 'https://fanqienovel.com/rank/1_2_261'  # ranking page to scrape
hotlist = requests.get(url=url, headers=headers).text  # fetch the page (headers is defined in the complete code further down)
selector = parsel.Selector(hotlist)  # parse the page with parsel
link = selector.css('.book-item-text .title a::attr(href)').getall()  # locate the book links with a CSS selector

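Because of Fanqie's anti-scraping measures, only the top 10 books can be retrieved this way (the list has 30 in total). As a next step, you can inspect the list request in the page's network traffic to find where the remaining book links are; this will probably require parsing the JSON response.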
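As a rough sketch of the JSON step mentioned above: if the remaining entries arrive as a JSON payload, the book links can be rebuilt from it. The payload shape used here (data.book_list, bookId) and the /page/<id> URL pattern are assumptions made for illustration, not the site's documented format; check the actual response in the browser's network panel before relying on any of these names.

```python
import json

# Hypothetical payload: the field names are illustrative, not the site's real schema.
sample_payload = '''
{"data": {"book_list": [
    {"bookId": "1001", "bookName": "Book A"},
    {"bookId": "1002", "bookName": "Book B"}
]}}
'''

def extract_book_links(payload_text, base="https://fanqienovel.com/page/"):
    """Return one URL per book entry found in the JSON payload."""
    payload = json.loads(payload_text)
    books = payload.get("data", {}).get("book_list", [])
    # Skip entries that lack an id instead of raising KeyError.
    return [base + str(book["bookId"]) for book in books if "bookId" in book]

links = extract_book_links(sample_payload)
print(links)
```

Using .get() with defaults means an unexpected or empty payload yields an empty list rather than an exception, which is easier to handle inside a crawl loop.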

Next, iterate over each book's link and save the content chapter by chapter.

# Join the links and fetch each book's chapter list
for i in link:
    url_content = 'https://fanqienovel.com' + i
    content = requests.get(url=url_content, headers=headers).text
    con_selector = parsel.Selector(content)
    book_title = con_selector.css('.info-name h1::text').get()  # book title
    content_list = con_selector.css('.chapter-item-title::attr(href)').getall()  # chapter links

    # Create a directory named after the book
    path = r'F:\recommend_novel' + '\\' + book_title
    os.makedirs(path, exist_ok=True)

    # Join the links and fetch the chapter content:
    # skip the first entry, then take at most 10 chapters
    # (slicing avoids an IndexError when a book has fewer chapters)
    for chapter_link in content_list[1:11]:
        url_book = 'https://fanqienovel.com' + chapter_link
        context = requests.get(url=url_book, headers=headers).text
        text_selector = parsel.Selector(context)
        page_title = text_selector.css('.muye-reader-title::text').get()  # chapter title
        novel_text = text_selector.css('.muye-reader-content p::text').getall()  # chapter paragraphs
        file_name = path + '\\' + page_title + '.txt'
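The paths above are built by concatenating scraped titles directly, which fails when a title is missing or contains characters that Windows forbids in file names. A minimal sketch of one way to guard against that is shown here; safe_name and chapter_path are hypothetical helpers introduced for illustration, and the /reader/123 path is a made-up example.

```python
import os
import re
from urllib.parse import urljoin

BASE_URL = "https://fanqienovel.com"

def safe_name(title, fallback="untitled"):
    """Replace characters Windows forbids in file names; handle a missing title."""
    if not title:
        return fallback
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", title).strip()
    return cleaned or fallback

def chapter_path(root, book_title, page_title):
    """Build <root>/<book>/<chapter>.txt with the platform's path separator."""
    return os.path.join(root, safe_name(book_title), safe_name(page_title) + ".txt")

# urljoin handles the slash between the site root and a relative link.
print(urljoin(BASE_URL, "/reader/123"))
print(safe_name("Chapter 1: What?"))
print(chapter_path("recommend_novel", "Some Book", None))
```

With helpers like these, a selector that returns None produces a file called untitled.txt instead of a TypeError during string concatenation.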

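Also because of Fanqie's anti-scraping measures, the text is rendered with an obfuscated font, so some characters in the scraped content come out garbled. The fix used here follows the blog post "Python爬虫--爬取文字加密的番茄小说" by RChow on 博客园 (cnblogs).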
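To make the decoding idea concrete before the full listing: the obfuscated characters are private-use code points, and each one maps to a real character. The sketch below uses four entries taken from the full dictionary and applies them with str.translate, which is an alternative to the character-by-character loop in the complete code; build_table and decode are names introduced here for illustration.

```python
# A small subset of the decoding dictionary, keyed by code point (as strings).
sample_map = {
    '58670': '0',
    '58611': '的',
    '58657': '我',
    '58398': '是',
}

def build_table(mapping):
    """Convert {'58611': '的', ...} into a table usable by str.translate."""
    return {int(code): char for code, char in mapping.items()}

def decode(text, table):
    """Replace mapped code points; leave every other character untouched."""
    return text.translate(table)

table = build_table(sample_map)
# Simulate scraped text: three obfuscated characters around one normal character.
obfuscated = chr(58657) + chr(58398) + "谁" + chr(58611)
print(decode(obfuscated, table))
```

Characters that are not in the table pass through unchanged, which matches the KeyError fallback in the loop-based version.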

Complete crawler code

import requests
import parsel
import os

# Decoding dictionary: obfuscated code point -> real character
dict_data = {
    '58670': '0',
    '58413': '1',
    '58678': '2',
    '58371': '3',
    '58353': '4',
    '58480': '5',
    '58359': '6',
    '58449': '7',
    '58540': '8',
    '58692': '9',
    '58712': 'a',
    '58542': 'b',
    '58575': 'c',
    '58626': 'd',
    '58691': 'e',
    '58561': 'f',
    '58362': 'g',
    '58619': 'h',
    '58430': 'i',
    '58531': 'j',
    '58588': 'k',
    '58440': 'l',
    '58681': 'm',
    '58631': 'n',
    '58376': 'o',
    '58429': 'p',
    '58555': 'q',
    '58498': 'r',
    '58518': 's',
    '58453': 't',
    '58397': 'u',
    '58356': 'v',
    '58435': 'w',
    '58514': 'x',
    '58482': 'y',
    '58529': 'z',
    '58515': 'A',
    '58688': 'B',
    '58709': 'C',
    '58344': 'D',
    '58656': 'E',
    '58381': 'F',
    '58576': 'G',
    '58516': 'H',
    '58463': 'I',
    '58649': 'J',
    '58571': 'K',
    '58558': 'L',
    '58433': 'M',
    '58517': 'N',
    '58387': 'O',
    '58687': 'P',
    '58537': 'Q',
    '58541': 'R',
    '58458': 'S',
    '58390': 'T',
    '58466': 'U',
    '58386': 'V',
    '58697': 'W',
    '58519': 'X',
    '58511': 'Y',
    '58634': 'Z',
    '58611': '的',
    '58590': '一',
    '58398': '是',
    '58422': '了',
    '58657': '我',
    '58666': '不',
    '58562': '人',
    '58345': '在',
    '58510': '他',
    '58496': '有',
    '58654': '这',
    '58441': '个',
    '58493': '上',
    '58714': '们',
    '58618': '来',
    '58528': '到',
    '58620': '时',
    '58403': '大',
    '58461': '地',
    '58481': '为',
    '58700': '子',
    '58708': '中',
    '58503': '你',
    '58442': '说',
    '58639': '生',
    '58506': '国',
    '58663': '年',
    '58436': '着',
    '58563': '就',
    '58391': '那',
    '58357': '和',
    '58354': '要',
    '58695': '她',
    '58372': '出',
    '58696': '也',
    '58551': '得',
    '58445': '里',
    '58408': '后',
    '58599': '自',
    '58424': '以',
    '58394': '会',
    '58348': '家',
    '58426': '可',
    '58673': '下',
    '58417': '而',
    '58556': '过',
    '58603': '天',
    '58565': '去',
    '58604': '能',
    '58522': '对',
    '58632': '小',
    '58622': '多',
    '58350': '然',
    '58605': '于',
    '58617': '心',
    '58401': '学',
    '58637': '么',
    '58684': '之',
    '58382': '都',
    '58464': '好',
    '58487': '看',
    '58693': '起',
    '58608': '发',
    '58392': '当',
    '58474': '没',
    '58601': '成',
    '58355': '只',
    '58573': '如',
    '58499': '事',
    '58469': '把',
    '58361': '还',
    '58698': '用',
    '58489': '第',
    '58711': '样',
    '58457': '道',
    '58635': '想',
    '58492': '作',
    '58647': '种',
    '58623': '开',
    '58521': '美',
    '58609': '总',
    '58530': '从',
    '58665': '无',
    '58652': '情',
    '58676': '己',
    '58456': '面',
    '58581': '最',
    '58509': '女',
    '58488': '但',
    '58363': '现',
    '58685': '前',
    '58396': '些',
    '58523': '所',
    '58471': '同',
    '58485': '日',
    '58613': '手',
    '58533': '又',
    '58589': '行',
    '58527': '意',
    '58593': '动',
    '58699': '方',
    '58707': '期',
    '58414': '它',
    '58596': '头',
    '58570': '经',
    '58660': '长',
    '58364': '儿',
    '58526': '回',
    '58501': '位',
    '58638': '分',
    '58404': '爱',
    '58677': '老',
    '58535': '因',
    '58629': '很',
    '58577': '给',
    '58606': '名',
    '58497': '法',
    '58662': '间',
    '58479': '斯',
    '58532': '知',
    '58380': '世',
    '58385': '什',
    '58405': '两',
    '58644': '次',
    '58578': '使',
    '58505': '身',
    '58564': '者',
    '58412': '被',
    '58686': '高',
    '58624': '已',
    '58667': '亲',
    '58607': '其',
    '58616': '进',
    '58368': '此',
    '58427': '话',
    '58423': '常',
    '58633': '与',
    '58525': '活',
    '58543': '正',
    '58418': '感',
    '58597': '见',
    '58683': '明',
    '58507': '问',
    '58621': '力',
    '58703': '理',
    '58438': '尔',
    '58536': '点',
    '58384': '文',
    '58484': '几',
    '58539': '定',
    '58554': '本',
    '58421': '公',
    '58347': '特',
    '58569': '做',
    '58710': '外',
    '58574': '孩',
    '58375': '相',
    '58645': '西',
    '58592': '果',
    '58572': '走',
    '58388': '将',
    '58370': '月',
    '58399': '十',
    '58651': '实',
    '58546': '向',
    '58504': '声',
    '58419': '车',
    '58407': '全',
    '58672': '信',
    '58675': '重',
    '58538': '三',
    '58465': '机',
    '58374': '工',
    '58579': '物',
    '58402': '气',
    '58702': '每',
    '58553': '并',
    '58360': '别',
    '58389': '真',
    '58560': '打',
    '58690': '太',
    '58473': '新',
    '58512': '比',
    '58653': '才',
    '58704': '便',
    '58545': '夫',
    '58641': '再',
    '58475': '书',
    '58583': '部',
    '58472': '水',
    '58478': '像',
    '58664': '眼',
    '58586': '等',
    '58568': '体',
    '58674': '却',
    '58490': '加',
    '58476': '电',
    '58346': '主',
    '58630': '界',
    '58595': '门',
    '58502': '利',
    '58713': '海',
    '58587': '受',
    '58548': '听',
    '58351': '表',
    '58547': '德',
    '58443': '少',
    '58460': '克',
    '58636': '代',
    '58585': '员',
    '58625': '许',
    '58694': '稜',
    '58428': '先',
    '58640': '口',
    '58628': '由',
    '58612': '死',
    '58446': '安',
    '58468': '写',
    '58410': '性',
    '58508': '马',
    '58594': '光',
    '58483': '白',
    '58544': '或',
    '58495': '住',
    '58450': '难',
    '58643': '望',
    '58486': '教',
    '58406': '命',
    '58447': '花',
    '58669': '结',
    '58415': '乐',
    '58444': '色',
    '58549': '更',
    '58494': '拉',
    '58409': '东',
    '58658': '神',
    '58557': '记',
    '58602': '处',
    '58559': '让',
    '58610': '母',
    '58513': '父',
    '58500': '应',
    '58378': '直',
    '58680': '字',
    '58352': '场',
    '58383': '平',
    '58454': '报',
    '58671': '友',
    '58668': '关',
    '58452': '放',
    '58627': '至',
    '58400': '张',
    '58455': '认',
    '58416': '接',
    '58552': '告',
    '58614': '入',
    '58582': '笑',
    '58534': '内',
    '58701': '英',
    '58349': '军',
    '58491': '候',
    '58467': '民',
    '58365': '岁',
    '58598': '往',
    '58425': '何',
    '58462': '度',
    '58420': '山',
    '58661': '觉',
    '58615': '路',
    '58648': '带',
    '58470': '万',
    '58377': '男',
    '58520': '边',
    '58646': '风',
    '58600': '解',
    '58431': '叫',
    '58715': '任',
    '58524': '金',
    '58439': '快',
    '58566': '原',
    '58477': '吃',
    '58642': '妈',
    '58437': '变',
    '58411': '通',
    '58451': '师',
    '58395': '立',
    '58369': '象',
    '58706': '数',
    '58705': '四',
    '58379': '失',
    '58567': '满',
    '58373': '战',
    '58448': '远',
    '58659': '格',
    '58434': '士',
    '58679': '音',
    '58432': '轻',
    '58689': '目',
    '58591': '条',
    '58682': '呢'
}

# Ranking page: get the links of the ranked books
headers = {
    'Cookie':"x-web-secsdk-uid=b1bc4892-fdbb-4add-9155-bce7d899dbd6; Hm_lvt_2667d29c8e792e6fa9182c20a3013175=1756685113; HMACCOUNT=35DE050AC5E72E8F; s_v_web_id=verify_mf0cylb2_zVItNMNx_7oOY_4WE4_8cKF_kH7yhEzyE43B; csrf_session_id=1551ec55d182bf1036912fdf3500bef6; novel_web_id=7544905032774665778; Hm_lpvt_2667d29c8e792e6fa9182c20a3013175=1756686278; ttwid=1%7CJ7VBrD5LHXa6Ui1qXBIHi4fCf-CeGe-TQ5YIT-v8MfQ%7C1756686274%7C0bc91977c931352ecc751657e7a4ea5eb2d1936701435f13c9f2f8d29d803379",
    'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/139.0.0.0 Safari/537.36'
}

url = 'https://fanqienovel.com/rank/1_2_261'
hotlist = requests.get(url=url,headers=headers).text
selector = parsel.Selector(hotlist)
link = selector.css('.book-item-text .title a::attr(href)').getall()


# Join the links and fetch each book's chapter list
for i in link:
    url_content = 'https://fanqienovel.com' + i
    content = requests.get(url=url_content, headers=headers).text
    con_selector = parsel.Selector(content)
    book_title = con_selector.css('.info-name h1::text').get()  # book title
    content_list = con_selector.css('.chapter-item-title::attr(href)').getall()  # chapter links
    if not book_title:
        continue  # no title found: the layout changed or the request was blocked

    # Create a directory named after the book
    path = os.path.join(r'F:\recommend_novel', book_title)
    os.makedirs(path, exist_ok=True)

    # Join the links and fetch the chapter content:
    # skip the first entry, then take at most 10 chapters
    # (slicing avoids an IndexError when a book has fewer chapters)
    for chapter_link in content_list[1:11]:
        url_book = 'https://fanqienovel.com' + chapter_link
        context = requests.get(url=url_book, headers=headers).text
        text_selector = parsel.Selector(context)
        page_title = text_selector.css('.muye-reader-title::text').get()  # chapter title
        novel_text = text_selector.css('.muye-reader-content p::text').getall()  # chapter paragraphs
        if not page_title:
            continue  # skip chapters whose title could not be read
        file_name = os.path.join(path, page_title + '.txt')

        # Decode the obfuscated font and restore the text
        novel_content = ''
        for paragraph in novel_text:
            for char in paragraph:
                # Mapped code points are replaced; every other character is kept
                novel_content += dict_data.get(str(ord(char)), char)
            novel_content += '\n'
        with open(file_name, 'a', encoding='utf-8') as f:
            f.write(novel_content)
        print(file_name + ' done')




2. Calling the LLM API

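For model access, I recommend the OpenRouter website.

It offers a wide range of models, including many free ones and models from overseas providers, all of which can be called. It also provides API call examples, so it is easy to work with.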

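from openai import OpenAI
import os

# System prompt, kept in Chinese because the generated copy is meant to be Chinese.
# In English: "You are a content creator who makes novel promo videos. Rewrite this
# content into copy better suited to a promo video: open with a hook strong enough to
# grab viewers, keep the whole piece fluent, narrate the story in the protagonist's
# first person, make the twists and conflicts more pronounced, match the word count
# of a 3-5 minute video, and output a single block of copy only."
tip = '你是一名作小说推文的自媒体作者,我需要你把这些内容改成更适合推文视频的文案,开头要有爆点,足够吸引观众,整篇文案通顺,换成主角第一人称叙述故事,情节跌宕冲突明显一些,文案要求符合时长达到3-5分钟视频的字数,只需要一整段文案'
path = r"F:\recommend_novel"
OUTPUT_NAME = "推文文案.txt"  # output file name ("promo copy")

def get_model_response(tip, prompt):
    client = OpenAI(
        base_url="https://openrouter.ai/api/v1",
        api_key="****************",  # replace with your own OpenRouter API key
    )

    completion = client.chat.completions.create(
        extra_body={},
        model="google/gemini-2.5-flash-image-preview:free",
        messages=[
            {
                "role": "system",
                "content": tip
            },
            {
                "role": "user",
                "content": prompt
            }
        ]
    )
    return completion.choices[0].message.content

for dirpath, dirnames, filenames in os.walk(path):
    # Skip earlier output so that a rerun does not feed it back to the model
    chapters = [name for name in filenames if name != OUTPUT_NAME]
    if chapters:
        # Concatenate all chapter files of one book into a single prompt
        prompt = ''
        for name in chapters:
            with open(os.path.join(dirpath, name), 'r', encoding='utf-8') as file:
                prompt += file.read()
        essay = get_model_response(tip, prompt)
        essay_path = os.path.join(dirpath, OUTPUT_NAME)
        with open(essay_path, 'a', encoding='utf-8') as essay_file:
            essay_file.write(essay)
        print("Done: " + essay_path)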
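Two practical refinements are sketched below, both as suggestions rather than part of the original script: reading the API key from an environment variable instead of hard-coding it, and capping the prompt length so that ten chapters do not exceed what a given model accepts. The variable name OPENROUTER_API_KEY and the 20000-character budget are assumptions chosen for illustration; the right limit depends on the model you pick.

```python
import os

def load_api_key(var_name="OPENROUTER_API_KEY"):
    """Read the key from the environment instead of hard-coding it in the script."""
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key

def build_prompt(chapters, max_chars=20000):
    """Concatenate chapter texts in order, stopping before the budget is exceeded."""
    parts, used = [], 0
    for text in chapters:
        remaining = max_chars - used
        if len(text) > remaining:
            if not parts:
                # Keep at least a truncated first chapter rather than an empty prompt.
                parts.append(text[:remaining])
            break
        parts.append(text)
        used += len(text)
    return "\n".join(parts)

print(build_prompt(["aaa", "bbb", "ccc"], max_chars=6))
```

The result of build_prompt can be passed as the prompt argument of get_model_response, and load_api_key() can replace the literal api_key value.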
