First, the code:
import time
from random import random

import requests
from bs4 import BeautifulSoup

# Listing pages are paginated by offset, e.g. http://steamworkshop.download/latest/rowstart/0/

def main():
    count = 1
    keywords = ["Genshin"]  # replace with the keywords you want to filter by, e.g. r18, xray... okay, I'll stop there
    key = "Download: "
    for i in range(40):
        url = "http://steamworkshop.download/latest/rowstart/" + str(50 * i) + "/"
        response = requests.get(url)
        soup = BeautifulSoup(response.text, "html.parser")
        # Save the data with open(); the txt path is given below.
        # Mode 'a' appends to the file instead of overwriting it.
        with open('color.txt', 'a', encoding='utf-8') as f:
            for link in soup.find_all("a"):
                if key in link.text:
                    for keyword in keywords:
                        if keyword in link.text:
                            print(link.get("href"), link.text)
                            # Write the href and the title with the "Download: " prefix removed;
                            # the trailing \n keeps one entry per line for readability.
                            f.write(link.get("href") + "\t" + link.text[len(key):] + '\n')
                            break
        print("Pause #" + str(count))
        count = count + 1
        time.sleep(random() * 2)  # random short nap between pages to be polite to the server

if __name__ == '__main__':
    s = time.time()
    main()
    e = time.time()
    print('Total time:', e - s)
The idea is simple: scrape every hyperlink on each listing page, then filter the links by keyword and save the matches to a file.
The code can be run as-is.
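To see the filtering step in isolation, here is a minimal sketch that runs against a hardcoded HTML fragment instead of the live site; the snippet, URLs, and link titles are made up for illustration, but the parsing logic is the same as above:

from bs4 import BeautifulSoup

# A made-up fragment mimicking the listing page's link format.
html = """
<a href="http://example.com/file/111">Download: Genshin Wallpaper</a>
<a href="http://example.com/file/222">Download: Space Scene</a>
<a href="http://example.com/about">About</a>
"""

key = "Download: "
keywords = ["Genshin"]

soup = BeautifulSoup(html, "html.parser")
for link in soup.find_all("a"):
    text = link.text
    if key in text and any(kw in text for kw in keywords):
        # keep the href plus the title with the "Download: " prefix removed
        print(link.get("href"), text[len(key):])

Running this prints only the matching entry, http://example.com/file/111 with the title "Genshin Wallpaper". Note that slicing with text[len(key):] removes exactly the prefix; str.lstrip(key) would not work here, since lstrip treats its argument as a set of characters to strip rather than a literal prefix.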