Checking the Status of a Batch of URLs with Python and Saving the Working URLs to a File

Reposted under the CC 4.0 BY-SA license.

Original link: http://blog.51cto.com/linuxpython/2105821

Article tags: #python

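This article presents a Python script that reads a list of URLs from a file and checks the HTTP status code of each one in turn. URLs that return status 200 are recorded in url_ok.txt; URLs that return an error status code or raise any other exception are recorded in url_notok.txt.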


#!/usr/bin/python
# -*- coding: UTF-8 -*-
# author: huangyishan
# Python 2 script (uses urllib2 and the print statement)
import sys
import urllib2

urls = sys.argv[1]  # path of the URL list file; sys.argv[0] is the script itself
result = list()

def check_url_status():
    with open(urls, 'r') as f:  # open the URL list for reading
        lines = f.readlines()
    for line in lines:  # process each line in turn
        line = line.strip()  # strip leading and trailing whitespace
        if len(line) != 0:
            # prepend a scheme when the line does not already start with one
            if not (line.startswith('http://') or line.startswith('https://')):
                line = 'http://' + line
            print line
            try:
                status = urllib2.urlopen(line, timeout=4).code
                print status
                result.append(line)
                # rewrite the result file with every URL that has succeeded so far
                with open('url_ok.txt', 'w') as out:
                    out.write('%s' % '\n'.join(result))
            except urllib2.HTTPError, e:
                print e.code
                # append, so that earlier failures are not overwritten
                with open('url_notok.txt', 'a') as out:
                    out.write(line + ' : ' + str(e.code) + '\n')
            except Exception:
                print "error"
                with open('url_notok.txt', 'a') as out:
                    out.write(line + ' : ' + 'error' + '\n')

if __name__ == '__main__':
    check_url_status()
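
The script above targets Python 2 (`urllib2`, the `print` statement). For readers on Python 3, the same idea can be sketched roughly as follows. This is only an illustrative port, not the original author's code: the function names `normalize_url`, `check_url`, and `check_file`, and the `opener` parameter (added so the check can be exercised without network access), are assumptions introduced here. The 4-second timeout and the output file names are taken from the script above.

```python
import urllib.error
import urllib.request


def normalize_url(line):
    """Strip whitespace and prepend http:// when no scheme is present.

    Returns None for blank lines.
    """
    line = line.strip()
    if not line:
        return None
    if line.startswith(("http://", "https://")):
        return line
    return "http://" + line


def check_url(url, timeout=4, opener=urllib.request.urlopen):
    """Return (ok, detail); detail is the status code or the string 'error'."""
    try:
        with opener(url, timeout=timeout) as response:
            return True, response.status
    except urllib.error.HTTPError as e:
        # The server answered, but with an error status such as 404 or 500.
        return False, e.code
    except Exception:
        # DNS failure, refused connection, timeout, malformed URL, and so on.
        return False, "error"


def check_file(path, ok_path="url_ok.txt", notok_path="url_notok.txt"):
    """Check every URL listed in `path` and write the two result files."""
    with open(path) as f:
        urls = [u for u in map(normalize_url, f) if u]
    # Both result files are opened once and truncated at the start of a run.
    with open(ok_path, "w") as ok, open(notok_path, "w") as notok:
        for url in urls:
            success, detail = check_url(url)
            if success:
                ok.write(url + "\n")
            else:
                notok.write("%s : %s\n" % (url, detail))
```

Calling `check_file("urls.txt")`, where `urls.txt` is a hypothetical input file with one URL per line, would produce `url_ok.txt` and `url_notok.txt` in the current directory. Opening each result file once per run avoids reopening a file for every URL and keeps results from an earlier run from mixing with the new ones.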