Maodou (毛豆) Car Data Scraper, Source Code Included

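There is no real tutorial here: it is just a simple scraper plus a few regular expressions. If you have questions, leave a message in the official account backend and I will help you sort them out.
WeChat official account: python网络小蜘蛛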

# -*- coding: utf-8 -*-
# @ModuleName: maodou (毛豆)
# @Function: scrape car listing data from maodou.com
# @Author : 苏穆冰白月晨
# @Time : 2021/4/7 14:22
import requests
from fake_useragent import UserAgent
import re
import csv

headers = {
    # The HTTP header name is "User-Agent"; servers do not recognize "UserAgent".
    'User-Agent': UserAgent().random,
}

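def response(page):
    """Fetch the HTML source of one listing page and parse its entries."""
    url = "https://www.maodou.com/car-list/all/pg" + str(page) + "?keyword="
    html = requests.get(url, headers=headers).text
    # Try up to 14 listings per page; stop at the first index with no match.
    for index in range(0, 14):
        try:
            response_re(html, index)
        except IndexError:
            break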

def response_re(html, i):
    """Extract the fields of the i-th listing with regular expressions."""
    # Down payment, in units of 10,000 yuan (万)
    guize_shoufu = """<p class="pre-price">首付&nbsp;<em class="hot">(.*?)</em>&nbsp;万</p>"""
    shoufu = re.findall(guize_shoufu, html)[i]

    # Listing title
    guize_zhuti = """<span class="info">(.*?)</span></h2>  <div class="car-price">"""
    zhuti = re.findall(guize_zhuti, html)[i]

    # Monthly payment text
    guize_yue = """<p class="for-month">(.*?)</p>"""
    yue = re.findall(guize_yue, html)[i]

    # Image URL: the original matched src first and then overwrote it with
    # data-original, so only the data-original pattern is kept here.
    guize_tupian = """<img class="lazy" src=".*?" data-original=(.*?)alt=".*?">"""
    # The capture includes the attribute's quotes and trailing space; strip them.
    tupian = re.findall(guize_tupian, html)[i].strip().strip('"')

    data = {
        "Theme" : zhuti,
        "Down payment" : shoufu,
        "Monthly payment" : yue,
        "Image" : tupian,
    }
    csv_writer.writerow([zhuti, shoufu, yue, tupian])
    print(data)


if __name__ == '__main__':
    # The with block closes the file on exit, so buffered rows are flushed to disk.
    with open('maodou.csv', 'w', encoding='utf-8', newline='') as f:
        csv_writer = csv.writer(f)
        csv_writer.writerow(["Theme", "Down payment", "Monthly payment", "Image"])
        for i in range(0, 999):
            response(i)
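
The script above re-runs every regular expression once per listing and picks out one index at a time. A rough alternative sketch is to run each pattern once per page and pair the results with zip. The sample markup, the make_card helper, and the function names parse_listings and rows_to_csv are all made up for illustration, modeled on the patterns in the script; the real page markup may differ, so treat this as a starting point rather than a drop-in replacement.

```python
import csv
import io
import re

# Patterns modeled on the ones in the script; the image pattern assumes the
# data-original value is quoted, which may not hold on the real page.
TITLE_RE = re.compile(r'<span class="info">(.*?)</span></h2>  <div class="car-price">')
DOWN_RE = re.compile(r'<p class="pre-price">首付&nbsp;<em class="hot">(.*?)</em>&nbsp;万</p>')
MONTH_RE = re.compile(r'<p class="for-month">(.*?)</p>')
IMAGE_RE = re.compile(r'<img class="lazy" src=".*?" data-original="(.*?)"\s*alt=".*?">')


def make_card(title, down, monthly, image):
    """Hypothetical helper: build one listing in the assumed markup."""
    return (
        f'<h2><span class="info">{title}</span></h2>  <div class="car-price">'
        f'<p class="pre-price">首付&nbsp;<em class="hot">{down}</em>&nbsp;万</p>'
        f'<p class="for-month">{monthly}</p></div>'
        f'<img class="lazy" src="blank.png" data-original="{image}" alt="{title}">'
    )


# Made-up sample data, not real listings.
SAMPLE_HTML = (
    make_card("Car A", "1.5", "2000 yuan/month", "https://example.com/a.jpg")
    + make_card("Car B", "2.0", "2500 yuan/month", "https://example.com/b.jpg")
)


def parse_listings(html):
    """Run each pattern once and pair the matches by position."""
    titles = TITLE_RE.findall(html)
    downs = DOWN_RE.findall(html)
    monthly = MONTH_RE.findall(html)
    images = IMAGE_RE.findall(html)
    # zip stops at the shortest list, so a listing missing a field is dropped
    # instead of raising IndexError.
    return list(zip(titles, downs, monthly, images))


def rows_to_csv(rows):
    """Write the header and rows to an in-memory CSV and return the text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["Theme", "Down payment", "Monthly payment", "Image"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(parse_listings(SAMPLE_HTML)))
```

One caveat with pairing by position: if a single listing lacks one field, every later row shifts by one. The per-index version in the script has the same weakness, so for anything serious an HTML parser that walks each listing block would be safer.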