Example script: collecting the URLs of images on a web page that have no alt attribute.
Source code:
Install the Python libraries:
pip install requests beautifulsoup4
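
As an optional check after installation, the sketch below reports whether both libraries can be imported from the current environment. Note that the beautifulsoup4 package is imported under the name bs4. The helper name missing_modules is hypothetical and only used for illustration.

```python
import importlib.util

# Import names of the two packages; beautifulsoup4 is imported as "bs4".
REQUIRED_MODULES = ["requests", "bs4"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are installed.")
```

If a module is reported as missing, rerunning the pip command in the same Python environment that runs the script is usually the first thing to try.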
Python code:
import requests
from bs4 import BeautifulSoup


def fetch_imgs_without_alt_or_empty_alt(url):
    """Return the src values of <img> tags whose alt attribute is missing or empty."""
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"
    }
    try:
        # Send a GET request with headers.
        # verify=False disables TLS certificate verification; remove it unless the target site requires it.
        # timeout=10 stops the script from waiting indefinitely on an unresponsive server.
        response = requests.get(url, headers=headers, verify=False, timeout=10)
        response.raise_for_status()  # Raise an exception for HTTP errors
        # Parse the HTML content using BeautifulSoup
        soup = BeautifulSoup(response.text, 'html.parser')
        # Find all <img> tags without an 'alt' attribute or with an empty 'alt' value
        imgs = soup.find_all('img')
        imgs_without_alt_or_empty = [
            img for img in imgs if not img.has_attr('alt') or img.get('alt') == ''
        ]
        # Extract and print the 'src' of these <img> tags
        img_sources = [img.get('src') for img in imgs_without_alt_or_empty]
        print(f"Found {len(img_sources)} images with a missing or empty alt attribute:")
        for src in img_sources:
            print(src)
        return img_sources
    except requests.exceptions.RequestException as e:
        print(f"Error fetching the URL: {e}")
        return []


# Example usage
if __name__ == "__main__":
    url = input("Enter the page URL: ").strip()
    fetch_imgs_without_alt_or_empty_alt(url)
    # input() waits for the Enter key, which keeps the console window open
    input("\nPress Enter to exit...")
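
The src values printed by the script are taken verbatim from the HTML, so they may be relative paths, and img.get('src') returns None for an <img> tag that has no src attribute at all. As a minimal sketch, assuming absolute URLs are wanted, the values can be resolved against the page address with urljoin from the standard library. The function name to_absolute_urls and the example URLs are hypothetical.

```python
from urllib.parse import urljoin


def to_absolute_urls(page_url, sources):
    """Resolve each src value against the page URL, skipping missing or empty values."""
    return [urljoin(page_url, src) for src in sources if src]


if __name__ == "__main__":
    # Hypothetical page URL and src values, for illustration only
    page = "https://example.com/blog/post.html"
    srcs = ["/static/logo.png", "images/photo.jpg", "https://cdn.example.com/a.png", None]
    for url in to_absolute_urls(page, srcs):
        print(url)
```

In the script above, this could be applied to the list returned by fetch_imgs_without_alt_or_empty_alt, passing the same url that was entered at the prompt.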
Install the packaging library:
pip install pyinstaller
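Run the packaging command (replace your_script.py with the name of your script file):
pyinstaller --onefile your_script.py
When packaging finishes, the exe file appears in the dist folder.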