Python: Checking a Website for Updates

Article link: https://blog.youkuaiyun.com/tiantiantdx/article/details/79407755

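This article presents a simple Python script that checks whether the Association News (协会要闻) section of the China Banking Association website has new content. The script fetches the page, parses the HTML to extract the most recent update date and title, and, if an update is detected, writes the new record to a local file.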


Taking the Association News section of the China Banking Association as an example: http://www.china-cba.net/list.php?fid=42

from urllib.request import urlopen
import re
import codecs
import os
web='http://www.china-cba.net/list.php?fid=42.html'
# Fetch the page once, then try UTF-8 first and fall back to GBK
raw = urlopen(web).read()
try:
    file = raw.decode('utf-8')
    print('---encoding: utf-8---')
except UnicodeDecodeError:
    file = raw.decode('gbk')
    print('---encoding: gbk---')
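# Use regular expressions to find the latest update date and title
# (this relies on the first <tr>...</tr> row holding the newest entry)
start = re.search(r'<tr>', file).start()
end = re.search(r'</tr>', file).end()
match = file[start:end]
# The date is the parenthesized text; [^)]+ stops at the first ")"
updatedate = re.search(r'\([^)]+\)', match).group()
# The title is the run of word/CJK characters right before </a>
updatecontent = re.search(r'([\w\u4e00-\u9fa5]+)</a>', match).group(1)
im = [updatedate, '\n', updatecontent]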
# Write the record: line 1 is the update date, line 2 is the title
# The file name means "Banking Association, Association News update record"
record = '银行业协会_协会要闻更新记录'
last = None
if os.path.exists(record):
    with codecs.open(record, 'r', 'utf-8') as f:
        # readline() keeps the trailing newline, so strip it before comparing
        last = f.readline().rstrip('\r\n')
if last == updatedate:
    print('No update')
else:
    print('Updated')
    with codecs.open(record, 'w', 'utf-8') as f:
        for s in im:
            f.write(s)
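
The extract-and-compare logic of the script can also be tried offline. The following is only a minimal sketch under stated assumptions: SAMPLE_HTML is made-up markup (the real page's HTML, dates, and titles may differ), and the names extract_latest, check_update, and record.txt are hypothetical helpers introduced here for illustration, not part of the original script.

```python
import os
import re
import tempfile

# Hypothetical markup for illustration only; the real page may look different.
SAMPLE_HTML = (
    '<table>'
    '<tr><td><a href="#">示例标题一</a></td><td>(2018-02-28)</td></tr>'
    '<tr><td><a href="#">示例标题二</a></td><td>(2018-02-27)</td></tr>'
    '</table>'
)


def extract_latest(html):
    """Return (date, title) from the first <tr>...</tr> row, or None if not found."""
    row = re.search(r'<tr>.*?</tr>', html, re.S)
    if row is None:
        return None
    date = re.search(r'\([^)]+\)', row.group())
    title = re.search(r'([\w\u4e00-\u9fa5]+)</a>', row.group())
    if date is None or title is None:
        return None
    return date.group(), title.group(1)


def check_update(record_path, date, title):
    """Compare date with the first line of the record file; rewrite the file on change."""
    last = None
    if os.path.exists(record_path):
        with open(record_path, encoding='utf-8') as f:
            # readline() keeps the trailing newline, so strip it before comparing
            last = f.readline().rstrip('\r\n')
    if last == date:
        return False
    with open(record_path, 'w', encoding='utf-8') as f:
        f.write(date + '\n' + title)
    return True


if __name__ == '__main__':
    date, title = extract_latest(SAMPLE_HTML)
    record = os.path.join(tempfile.mkdtemp(), 'record.txt')
    print(check_update(record, date, title))  # first run: the record file is created
    print(check_update(record, date, title))  # second run: same date, nothing changes
```

Returning None instead of raising when the pattern is missing makes it easier to tell "the page layout changed" apart from "no update", which the plain re.search(...).group() calls cannot do.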

PS: For monitoring website updates there is also a dedicated package, urlwatch (install: pip3 install urlwatch), and the browser extension Distill Web Monitor works as well.