Scraping dynamic AJAX JSON data with Firefox

Latest recommended article published on 2024-08-12 09:38:18

黑森林法则

Category: Web scraping. Tags: Firefox, ajax, json, web scraping

Article link: https://blog.youkuaiyun.com/baidu_39384178/article/details/113840302


import requests
import json
import time

while True:
    shijian = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())
    url2 = '&v=1.0&key=123'
    ur1 = shijian + url2
    ur3 = 'http://cex.xxxxxxx.cn/api.go?action=map&method=listInfo&format=json&timestamp='
    # url = 'http://cex.xxxxxx.cn/api.go?action=map&method=listInfo&format=json&timestamp=2021-02-17 05:25:57:10&v=1.0&key=123'
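    # URL to scrape. This site puts a timestamp in the URL, so the URL changes on every request.
    # If your site has no such parameter, skip the steps above and simply assign url = ... directly.
    url = ur3 + ur1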

    session = requests.Session()
    headers = {
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:85.0) Gecko/20100101 Firefox/85.0',
        # ---------- must be changed ----------
        'Cookie': 'JSESSIONID=xxxxxx769E3AB3CC09ED9xxxxxxxxx',  # replace the value after the equals sign (xxxxxx769E3AB3CC09ED9xxxxxxxxx) with your own cookie
        # ---------- must be changed ----------
        'Host': 'cex.xxxxx.cn',
        'Content-Type': 'application/json',
        'language': 'zh_cn',
        'X-Requested-With': 'XMLHttpRequest',
        'Referer': 'http://cex.xxxxxxxx.cn/map/index.jsp'
    }

    payload = {"language": "zh_cn", "data": {"type": "listInfo",
                                             "ids": ["xxx41895C8BF09A", "xxx95C8C0EE0", "xxxx95C8C3F7C",
                                                     "xxxx95C8C921A", "003741895C8CB4E3", "xxxxxxC747D01"],
                                             "flags": "0", "min": [105.100376, 29.942863],
                                             "max": [105.264616, 29.948664], "maptype": "1", "pageNum": 1,
                                             "pageSize": 1}}
    # ids lists the identifiers of the records to query

    response = session.post(url=url, headers=headers, data=json.dumps(payload))

    # response = requests.get(url=url, headers=headers)
    # http://cex.xxxxxxxxx.cn/api.go?action=map&method=listInfo&format=json&timestamp=2021-02-17%2012:36:08&v=1.0&key=123
    response.encoding = 'utf-8'

    res = json.loads(response.text)
    # print(res['language'])

    # for i in res['data']:
    #     print(i)
    shuaxin = 10  # refresh interval in seconds
    for i in res['data']:
        if int(i['speed']) >= 0:  # keep records whose speed is at or above this threshold
            # Fields to report; the response is JSON, so each value is read directly by its key
            print(i['car_name'], i['id'], i['speed'], i['ptime'])
            # -------------------- storage --------------------
            import csv
            a1 = i['car_name']
            a2 = i['speed']
            a3 = i['ptime']
            sj = time.strftime('%Y%m%d', time.localtime(time.time()))
            dd = [a1, a2, a3]

            # Append one row per record to a file named after today's date (comma-separated text)
            with open(sj + 'YZ' + '.xls', "a", newline='', encoding='gbk') as file:
                writer = csv.writer(file, delimiter=',')
                writer.writerow(dd)
            # -------------------- storage --------------------


    # time.sleep(shuaxin-1)
    # print( end='', flush=True)


    time.sleep(shuaxin)
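
The filtering and storage steps at the end of the script can also be pulled out into two small functions, which makes them easier to test without hitting the site. This is only a sketch under some assumptions: the function names `select_rows` and `append_rows` are made up for illustration, the records are assumed to carry the same `car_name`, `speed`, and `ptime` keys as above, and the file is written as UTF-8 with a BOM instead of the `gbk` used in the script.

```python
import csv


def select_rows(records, min_speed=0):
    """Return [car_name, speed, ptime] for every record at or above min_speed."""
    rows = []
    for rec in records:
        # speed is converted with int(), as in the script above
        if int(rec["speed"]) >= min_speed:
            rows.append([rec["car_name"], rec["speed"], rec["ptime"]])
    return rows


def append_rows(path, rows):
    """Append rows to a CSV file, creating the file if it does not exist."""
    # utf-8-sig is an assumption made here; the original script writes gbk
    with open(path, "a", newline="", encoding="utf-8-sig") as f:
        csv.writer(f).writerows(rows)
```

Inside the loop this would be called roughly as `append_rows(sj + 'YZ.csv', select_rows(res['data']))`, so the file is opened once per refresh rather than once per record.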

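How to find the AJAX URL: in Firefox, open the Network panel of the developer tools, filter by XHR, and click the POST request. The Response tab shows the returned content, which lets you confirm that it is the data you need; fields such as id and speed can be found there.
(Not shown in full.)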
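
Once the request URL has been identified in the Network panel, its query string can be rebuilt from named parameters instead of string concatenation. A minimal sketch using only the standard library; the host `cex.example.invalid` and the helper name `build_url` are placeholders, and the parameter names are the ones visible in the URL used in the script. The space in the timestamp is encoded as `%20` and the colons are left as they are, which matches the form of the example URL in the script's comments.

```python
import time
from urllib.parse import quote, urlencode

# Placeholder host; substitute the one shown in the Network panel
BASE_URL = "http://cex.example.invalid/api.go"


def build_url(now=None):
    """Build the request URL with the current (or given) local time as timestamp."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S",
                          now if now is not None else time.localtime())
    params = {
        "action": "map",
        "method": "listInfo",
        "format": "json",
        "timestamp": stamp,
        "v": "1.0",
        "key": "123",
    }
    # quote_via=quote encodes spaces as %20; safe=":" keeps colons unescaped
    return BASE_URL + "?" + urlencode(params, safe=":", quote_via=quote)
```

Calling `build_url()` on each pass of the loop gives a fresh timestamp, as the concatenation in the script does.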
Screenshot: where to get the cookie.
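
The Cookie value copied from the request headers in Firefox may hold more than one `name=value` pair. As a rough sketch, the standard library can split such a string into a dict; the helper name `parse_cookie_header` and the sample values are hypothetical. A dict like this can be handed to requests through its `cookies` argument instead of hard-coding the `Cookie` header.

```python
from http.cookies import SimpleCookie


def parse_cookie_header(raw):
    """Turn a raw Cookie header string into a {name: value} dict."""
    jar = SimpleCookie()
    jar.load(raw)
    return {name: morsel.value for name, morsel in jar.items()}


# Sample value only; paste the string copied from the browser here
cookies = parse_cookie_header("JSESSIONID=ABC123; language=zh_cn")
```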

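Where to build the payload: copy the request payload and paste it directly after the `=` in `payload =`, then adjust it to your case. The ids are the data submitted with the query, so they have to be collected from earlier information; an id is the code the system uses to look up a record.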
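
Since the ids are the part that changes, the pasted payload can be wrapped in a small function that takes them as an argument. This is a sketch, not the site's documented format: the field names are copied from the payload in the script, while the function name `build_payload` and the ids `ID_A` and `ID_B` are invented for illustration.

```python
import json


def build_payload(ids, min_corner, max_corner, page_num=1, page_size=1):
    """Assemble the POST body with the same fields as the copied payload."""
    return {
        "language": "zh_cn",
        "data": {
            "type": "listInfo",
            "ids": list(ids),           # identifiers of the records to query
            "flags": "0",
            "min": list(min_corner),    # first coordinate pair from the copied payload
            "max": list(max_corner),    # second coordinate pair from the copied payload
            "maptype": "1",
            "pageNum": page_num,
            "pageSize": page_size,
        },
    }


# Serialized body, ready to pass as data= in session.post(...)
body = json.dumps(build_payload(["ID_A", "ID_B"],
                                (105.100376, 29.942863),
                                (105.264616, 29.948664)))
```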