Python crawler JS code-extraction case study: an intelligent business analytics platform

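Disclaimer:
This article is for learning purposes only. Commercial or illegal use is strictly prohibited; violators bear the consequences themselves, and the author accepts no responsibility for any outcome.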

I. Find the parameter that is encrypted
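  1. Run atob('aHR0cHM6Ly93d3cucWltYWkuY24vcmFuaw==') in a JavaScript console to get the site URL. Press F12 to open the developer tools, click a ranking type to send a request, then right-click the request and choose Copy as cURL (cmd).
  2. Open https://spidertools.cn/#/curl2Request and convert the copied cURL command into Python code.
  3. Create qimai.py and paste the code into it. Running it as-is fails. Replace the analysis value in the code with the analysis value currently shown in the browser, change response.text to response.json(), and run again: the request succeeds and the data is printed.
  4. Comment out all of the headers and cookies and run the file again: the data is still returned. Then comment out analysis in params: the request fails. So the headers and cookies contain no encrypted parameter, and analysis in params is the encrypted one.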
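The elimination in step 4 can be sketched as a small helper that drops one key at a time and records which removals break the request. This is only an illustration: `find_required_keys` and `fake_send` are hypothetical names, and `fake_send` stands in for a real `requests.get` call so that the sketch runs offline.

```python
from typing import Callable, Dict, List


def find_required_keys(send: Callable[[Dict[str, str]], bool],
                       full: Dict[str, str]) -> List[str]:
    """Return the keys whose removal makes `send` report failure."""
    required = []
    for key in full:
        # Rebuild the dict without this one key and retry the request.
        trimmed = {k: v for k, v in full.items() if k != key}
        if not send(trimmed):
            required.append(key)
    return required


def fake_send(params: Dict[str, str]) -> bool:
    # Stand-in for the server (assumption): succeed only if analysis is present.
    return "analysis" in params


params = {"analysis": "<value copied from the browser>", "brand": "free",
          "device": "iphone", "country": "cn", "genre": "5000"}
print(find_required_keys(fake_send, params))
```

With a real sender, the same loop could be run over the headers and cookies dictionaries as well.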
II. Locate where the parameter is encrypted
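  1. First try searching for the keyword analysis; no assignment to it turns up.
  2. Switch to the Sources panel and add an XHR breakpoint on rank/index?analysis=
  3. Click a ranking type again to send a request. Keep stepping to the next function while watching the Scope pane, and stop once the parameters no longer contain analysis. Look at the o()[Kt][Ut][Ft] call: hovering over Kt and Ut shows that it is a request interceptor.
  4. Resume past the breakpoint, then click a ranking type again to send a request. Look at the expression p + B1 + zV1 and hover over each part: p is analysis, B1 is the equals sign, and zV1 is the value. Print e in the console.
  5. Resume so that the request completes, switch to the Network panel, and inspect the latest rank/index request. Its analysis value matches the e value printed in the console, which shows that analysis is generated in the interceptor from step 3.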
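The observable effect of the interceptor, a query string that starts with analysis=<value>, can be mimicked in a few lines. This is a sketch of the effect only, not of the site's obfuscated code; `add_analysis` is a hypothetical helper and "SIGNED" is a placeholder value.

```python
from urllib.parse import urlencode


def add_analysis(url: str, params: dict, value: str) -> str:
    """Mimic the interceptor's effect: put analysis=<value> first in the query string."""
    query = urlencode({"analysis": value, **params})
    return f"{url}?{query}"


# "SIGNED" is a placeholder, not a real analysis value.
print(add_analysis("https://api.qimai.cn/rank/index", {"brand": "free"}, "SIGNED"))
```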
III. Extract the parameter-encryption code
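  1. Create qimai.js to hold the extracted JS code.
  2. Click a ranking type to send a request. The variable e is produced by (0, i[jt])((0, i[qt])(a, d)), which the console shows can be simplified to i[jt](i[qt](a, d)). Print a and d in the console.
  3. Copy that code into qimai.js.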
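  4. Extract the i[qt] method
  • Hovering over qt shows that it names the method oz = f h(n,t) in i. Hover over i, expand oz, and click the blue link to jump to its definition. Copy h(n,t) into qimai.js and replace i[qt] with h. Because the t argument is always passed, delete the line t = t || u().
  • Analyze the body of h(n,t). n and t are parameters, so their uses inside the body can be ignored. Hovering over z shows that it is the window object. The names that need attention are _, $1, R, q1, H, o and I1. Printing each in the console shows that $1, R, I1 and q1 are built-in JS methods, _ is an empty string, and o is a function (H is the number 0, see var H = 0 in the final code). Start by substituting the values that are already certain into qimai.js.
  • Scroll up to find the o function. Hovering over z again shows the window object, and b2 prints as the JS String object. window.String has no method matching what printing e shows at this point, so set a breakpoint inside o to analyze it.
  • End this debugging session, click a ranking type to send a new request, and step into o. Printing e shows that z[b2][e] is window.String.fromCharCode. Replace o with window.String.fromCharCode; i[qt] is now fully extracted.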
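For readers who prefer to check the logic outside the browser, here is a minimal Python sketch of what the extracted h(n, t) does, based on the version in the final qimai.js: each character of n is XORed with a key character taken at offset (i + 10) % len(t). The key and input below are hypothetical, not the site's real d and a values, and the port assumes characters in the Basic Multilingual Plane, where Python's ord/chr match JavaScript's charCodeAt/fromCharCode.

```python
def h(n: str, t: str) -> str:
    """XOR each character of n with the key character at offset (i + 10) % len(t)."""
    return "".join(
        chr(ord(ch) ^ ord(t[(i + 10) % len(t)]))
        for i, ch in enumerate(n)
    )


key = "example-key"          # hypothetical key, not the site's real d value
plain = "hello/rank/index"   # hypothetical input
masked = h(plain, key)
print(h(masked, key) == plain)  # applying the same XOR twice restores the input
```

Because XOR is its own inverse, the same function both masks and unmasks a string, which is handy when comparing intermediate values against the browser.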
  5. End this debugging session and start extracting the i[jt] code
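  • Hovering over jt shows that it names the method cv = f v(t) in i. Hover over i, expand cv, and click the blue link to jump to its definition. Copy v(t) into qimai.js and replace i[jt] with v.
  • Analyze the body of v(t). t is the parameter, so its uses inside the body can be ignored. Hovering over z shows the window object. The names that need attention are V1, T, Y1, Q1, W1, K1, U1 and Z1. Printing each in the console shows that V1, T, Q1, W1 and U1 are built-in JS methods, while Y1 and Z1 are variables. The o function was analyzed earlier and is window.String.fromCharCode. Start by substituting the values that are already certain into qimai.js.
  • Press Ctrl+F and search the JS file for Y1. Its first occurrence shows that Y1 is a declared variable, and its assignment is Y1 = n(42, 47). Hovering over n shows undefined. Reading through the whole file shows that n is a parameter of the enclosing anonymous function, and the function(a) function is the argument that corresponds to n. Copy function(a) into qimai.js under the name get_var, then write the assignment to qimai.js as Y1 = get_var(42, 47).
  • Search the file for Z1 in the same way: Z1 = n(6, 7, 2, 5, 33, 36). Write it to qimai.js as Z1 = get_var(6, 7, 2, 5, 33, 36). i[jt] is now fully extracted.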
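The two pieces recovered in this step can be sketched in Python as follows. The table is only the first 48 entries of the lookup array in the final qimai.js, which is enough to resolve Y1 and Z1; the Python names mirror the JS ones, and the `safe` argument to `quote` is chosen here to match the characters that JavaScript's encodeURIComponent leaves unescaped.

```python
import base64
import re
from urllib.parse import quote

# First 48 entries of the lookup table from qimai.js; enough for Y1 and Z1.
TABLE = ["p", "u", "s", "h", "w", "e", "b", "a", "c", "k", "C", "n", "q", "i",
         "m", "l", "f", "d", "$", "S", "r", "v", "o", "t", "y", "g", "A", "F",
         "R", "z", "T", "M", "E", "6", "7", "2", "4", "3", "8", "1", "5", "%",
         "0", "9", ".", "j", "Z", "x"]


def get_var(*indexes: int) -> str:
    """Concatenate the table entries at the given indexes."""
    return "".join(TABLE[i] for i in indexes)


Y1 = get_var(42, 47)               # "0x"
Z1 = get_var(6, 7, 2, 5, 33, 36)   # "base64"


def v(t: str) -> str:
    """Percent-encode t, turn each %XX back into one character, then Base64-encode."""
    escaped = quote(t, safe="-_.!~*'()")
    raw = re.sub(r"%([0-9A-F]{2})", lambda m: chr(int(m.group(1), 16)), escaped)
    # Every character is now below 0x100, so latin-1 maps it to one byte.
    return base64.b64encode(raw.encode("latin-1")).decode("ascii")


print(Y1, Z1, v("5000cnfreeiphone"))
```

In other words, get_var is a plain string-obfuscation lookup, and v amounts to Base64 over the UTF-8 bytes of its input.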
IV. Verify the result
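  1. Print the variable e in the browser console.
  2. Adapt the qimai.js code and print e in the developer tools: the same a and d values produce the same e.
  3. Update qimai.py and run it: the data is fetched successfully. If a, d and e all match but the request still fails, check that params and url are correct, because a is built from params and url (in the final code, d is computed from a fixed string).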
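To see why params and url matter, here is a hedged Python sketch of how main() in the final qimai.js assembles the plaintext a before it is XORed and encoded. `build_a` is a hypothetical name, `sep` stands for the v1 string that main() computes (its real value is not derived here, so "@" below is only a placeholder), and the parameter values are assumed to be plain ASCII.

```python
import base64
import time


def build_a(params: dict, path: str, sep: str, now_ms: int) -> str:
    """Assemble the plaintext `a` the way main() in qimai.js does."""
    # Sort the parameter values (not the keys) and join them.
    joined = "".join(sorted(str(value) for value in params.values()))
    # For ASCII input, v() reduces to plain Base64 encoding.
    encoded = base64.b64encode(joined.encode("ascii")).decode("ascii")
    elapsed = now_ms - 1661224081041  # same reference offset as in qimai.js
    return f"{encoded}{sep}{path}{sep}{elapsed}{sep}3"


params = {"brand": "free", "device": "iphone", "country": "cn", "genre": "5000"}
# "@" is a placeholder separator (assumption), not the real v1 value.
print(build_a(params, "/rank/index", "@", int(time.time() * 1000)))
```

Since the elapsed time is part of a, two runs only produce the same e when the timestamp is pinned to the same value.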
V. Final code

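Most of the JS code has been extracted above. The a and d values are not analyzed or extracted step by step here; you can extract them yourself by following the same process. The complete code is as follows.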

  1. qimai.js
var get_var = (function (a) {
    return function () {
        for (var n = arguments, t = "", e = 0, r = n.length; e < r; e++)
            t += a[n[e]];
        return t
    }
}(["p", "u", "s", "h", "w", "e", "b", "a", "c", "k", "C", "n", "q", "i", "m", "l", "f", "d", "$", "S", "r", "v", "o", "t", "y", "g", "A", "F", "R", "z", "T", "M", "E", "6", "7", "2", "4", "3", "8", "1", "5", "%", "0", "9", ".", "j", "Z", "x", "#", "I", "O", "P", "L", "W", "H", "_", "V", "\u672a", "\u8986", "\u76d6", "\u6570", "\u636e", "(", "^", "|", " ", ")", "=", "[", ";", "]", "*", "D", "G", "/", "U", "B", "Q", "N", "-", "?", "\u767b", "\u5f55", "K", ",", "J", "\u963f", "\u5bcc", "\u6c57", "\u5b89", "\u54e5", "\u62c9", "\u5c14", "\u5df4", "\u5c3c", "\u4e9a", "\u8054", "\u914b", "\u6839", "\u5ef7", "\u7f8e", "\u6cd5", "\u5c5e", "\u5357", "\u534a", "\u7403", "\u548c", "\u6781", "\u9886", "\u5730", "\u6fb3", "\u5927", "\u5229", "\u5965", "\u585e", "\u62dc", "\u7586", "\u5e03", "\u9686", "\u8fea", "\u6bd4", "\u65f6", "\u8d1d", "\u5b81", "\u57fa", "\u7eb3", "\u7d22", "\u5b5f", "\u52a0", "\u56fd", "\u4fdd", "\u54c8", "\u9a6c", "\u6ce2", "\u65af", "\u9ed1", "\u7ef4", "\u90a3", "\u767d", "\u4fc4", "\u7f57", "\u4f2f", "\u5179", "\u767e", "\u6155", "\u73bb", "\u897f", "\u6587", "\u83b1", "\u4e0d", "\u4e39", "\u535a", "\u8328", "\u74e6", "\u4e2d", "\u975e", "\u5171", "\u62ff", "\u745e", "\u58eb", "\u667a", "\u53f0", "\u6e7e", "\u8c61", "\u7259", "\u6d77", "\u5cb8", "\u5580", "\u9ea6", "\u521a", "\u679c", "\u6c11", "\u4e3b", "\u4f26", "\u8fbe", "\u9ece", "\u53e4", "\u5317", "\u6d66", "\u8def", "\u6377", "\u514b", "\u5fb7", "\u5409", "\u63d0", "\u591a", "\u660e", "\u53ca", "\u5384", "\u74dc", "\u57c3", "\u7acb", "\u7279", "\u91cc", "\u73ed", "\u7231", "\u6c99", "\u82ac", "\u5170", "\u6590", "\u798f", "\u7fa4", "\u5c9b", "\u84ec", "\u82f1", "\u683c", "\u9c81", "\u51e0", "\u5185", "\u5188", "\u7ecd", "\u8d64", "\u9053", "\u5e0c", "\u814a", "\u9675", "\u5371", "\u572d", "\u6d2a", "\u90fd", "\u5308", "\u5370", "\u5ea6", "\u4f0a", "\u6717", "\u51b0", "\u4ee5", "\u8272", "\u5217", "\u610f", "\u4e70", "\u7ea6", "\u65e6", "\u65e5", "\u672c", "\u8428", "\u5766", "\u80af", "\u67ec", 
"\u57d4", "\u5be8", "\u97e9", "\u79d1", "\u6c83", "\u5a01", "\u8001", "\u631d", "\u5ae9", "\u5361", "\u6258", "\u9676", "\u5b9b", "\u5362", "\u68ee", "\u5821", "\u8131", "\u6469", "\u6d1b", "\u58a8", "\u5176", "\u987f", "\u4ee3", "\u592b", "\u7f05", "\u7538", "\u5c71", "\u8499", "\u83ab", "\u6851", "\u6bdb", "\u5854", "\u6765", "\u7c73", "\u65b0", "\u8377", "\u632a", "\u6cca", "\u66fc", "\u79d8", "\u83f2", "\u5f8b", "\u5bbe", "\u5404", "\u671d", "\u9c9c", "\u8461", "\u8404", "\u65fa", "\u6492", "\u82cf", "\u6240", "\u95e8", "\u6602", "\u4f10", "\u5178", "\u53d9", "\u4e4d", "\u5f97", "\u6cf0", "\u571f", "\u5e93", "\u4e1c", "\u5e1d", "\u6c76", "\u7a81", "\u8033", "\u5408", "\u4e4c", "\u5e72", "\u522b", "\u59d4", "\u8d8a", "\u52aa", "\u56fe", "\u4e5f", "\u8d5e", "\u6d25", "\u97e6", "\u4eca", "\u8fd1", "\u4e00", "\u5468", "\u4e2a", "\u6708", "\u4e09", "\u5e74", "\u81ea", "\u5b9a", "\u4e49", "Y", "\u6628", "\u5c0f", "\u5929", "\u91cf", "\u4ebf", "\u70b9", "\u8f7d", "\u5206", "<", ">", '"', ":", "\u5b58", "\u4e3a", "\u7247", "&", "\u7f51", "\u7edc", "\u9519", "\u8bef", "\u5b57", "\u6bb5", "\u662f", "\u7b26", "\u4e32", "\u7c7b", "\u578b", "@", "\u626b", "\u7801", "\u652f", "\u4ed8", "\u5fae", "\u4fe1", "\u5143", "\u70ed", "\u5bb6", "\u4ed6", "\u5353", "\u5e02", "\u573a", "\u7cbe", "\u54c1", "\u63a8", "\u8350", "\u5e94", "\u7528", "\u589e", "\u7684", "\u4e24", "\u4e0e", "\u6e38", "\u620f", "\u56db", "\u4e0a", "\u67b6", "\u76d1", "\u63a7", "\u4e0b", "\u6e05", "\u699c", "\u8bcd", "\u5355", "\u66f4", "\u641c", "\u6307", "\u5dde", "!", "\u5b9e", "\u6392", "\u540d", "\u884c", "\u5168", "\u89c8", "\u5347", "\u964d", "\u6700", "\u5feb", "\u82f9", "\u5173", "\u952e", "\u4f18", "\u5316", "\u9884", "\u8ba2", "\u53d7", "\u8bbf", "\u9875", "\u9762", "{", "}", "'", "+", "\n", "\u9996", "\u9738", "\u5f53", "\u524d", "\u5728", "\u7ebf", "\u4e07", "\uff0c", "\u8bf7", "\u7a0d", "\u540e", "\u91cd", "\u8bd5", "\u53d1", "\u9001", "\u6210", "\u529f", "\u4f20", "\u5165", "\u9700", "\u8981", 
"\u8282", "\u6d41", "\u51fd", "\u5305", "\u542b", "\u6027", "\u5bf9", "\u521d", "\u59cb", "X", "\u67e5", "\u770b"]))


var H = 0;
var Y1 = get_var(42, 47);               // "0x"
var Z1 = get_var(6, 7, 2, 5, 33, 36);   // "base64"
var F = null;
var O1 = get_var(41);                   // "%"
var J1 = get_var(42, 42);               // "00"
var Nt = get_var(50, 27, 22, 67);       // "OFo="
var B = 1;
var Rt = get_var(12, 13, 14, 7, 13, 357, 35, 42, 35, 35, 345, 30, 5, 8, 3, 11, 22, 15, 22, 25, 24);
var n3 = get_var(47, 24, 29);           // "xyz"
var t3 = get_var(5, 16, 25, 3);         // "efgh"

function h(n, t) {
    for (var e = (n = n.split('')).length, r = t.length, a = 'charCodeAt', i = H; i < e; i++) {
        n[i] = String.fromCharCode(n[i][a](H) ^ t[(i + 10) % r][a](H));
    }
    return n.join('')
}

function v(t) {
    t = encodeURIComponent(t).replace(/%([0-9A-F]{2})/g, function (n, t) {
        return String.fromCharCode(Y1 + t)
    });
    try {
        return btoa(t)
    } catch (n) {
        return Buffer.from(t).toString(Z1)
    }
}

function p(n, t) {
    for (var e = (n = n.split('')).length, r = t.length, a = 'charCodeAt', i = H; i < e; i++)
        n[i] = String.fromCharCode(n[i][a](H) ^ t[i % r][a](H));
    return n.join('')
}

function l(n) {
    return decodeURIComponent(function (t) {
        try {
            return atob(t)
        } catch (n) {
            return Buffer.from(t, Z1).toString()
        }
    }(n).split('').map(function (n) {
        return O1 + (J1 + n.charCodeAt(H).toString(16)).slice(-2)
    }).join(''))
}

function y(n, t, e) {
    for (var r = void 0 === e ? 2166136261 : e, a = H, i = n.length; a < i; a++)
        r = (r ^= n.charCodeAt(a)) + ((r << B) + (r << 4) + (r << 7) + (r << 8) + (r << 24));
    return t ? (n3 + (r >>> H).toString(16) + t3).substr(-16) : r >>> H
}

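function main(t) {
    // d: the XOR key string, derived from Rt by the hash function y
    var d = y(Rt, B);
    // v1: the separator string placed between the parts of a
    var v1 = p(l(Nt), d);
    // r: milliseconds elapsed since a fixed reference timestamp
    var r = +new Date() - H - 1661224081041, a = [];
    // Collect the parameter values, then sort, join and encode them
    Object.keys(t.params).forEach(function (n) {
        t.params.hasOwnProperty(n) && a.push(t.params[n])
    });
    a = a.sort().join('');
    a = v(a);
    // Append the request path, the time offset and the constant 3, each preceded by v1
    a = (a += v1 + t.url.replace(t.baseURL, '')) + (v1 + r) + (v1 + 3);
    // XOR with d, then encode with v to get the analysis value
    var e = v(h(a, d))
    console.log(e)
    return e;
}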

// main({
//     baseURL: "https://api.qimai.cn",
//     url: "/rank/index",
//     params: {
//         brand: "free",
//         device: "iphone",
//         country: "cn",
//         genre: "5000"
//     }
// })
  2. qimai.py
import requests
import execjs
import furl


headers = {
    "authority": "api.qimai.cn",
    "accept": "application/json, text/plain, */*",
    "accept-language": "zh-CN,zh;q=0.9",
    "cache-control": "no-cache",
    "origin": "https://www.qimai.cn",
    "pragma": "no-cache",
    "sec-ch-ua": "^\\^Google",
    "sec-ch-ua-mobile": "?0",
    "sec-ch-ua-platform": "^\\^Windows^^",
    "sec-fetch-dest": "empty",
    "sec-fetch-mode": "cors",
    "sec-fetch-site": "same-site",
    "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/119.0.0.0 Safari/537.36"
}
cookies = {
    "qm_check": "A1sdRUIQChtxen8pI0dAMRcOUFseEHBeQF0JTjVBWCwycRd1QlhAXFEGFUdASAFKBQcCCXsEBRFFIg4aHRoOBnMDARlGR2dQOVdICAolAGgCHBl0B3xUV05KVFsZXVJRWxsKFghJVktYVElWBRVP",
    "gr_user_id": "c058d5d9-eebf-4efb-bdb4-191ec0912dfb",
    "PHPSESSID": "3qbifh80k5joj5lgj3dsvk61t9",
    "ada35577182650f1_gr_session_id": "7ffd8bf0-305c-4381-84e5-3ef51fffec19",
    "ada35577182650f1_gr_session_id_sent_vst": "7ffd8bf0-305c-4381-84e5-3ef51fffec19",
    "synct": "1699099063.693",
    "syncd": "-385"
}
url = "https://api.qimai.cn/rank/index"
params = {
    # "analysis": "eDEnECU/Nw9vX30PPjZVQQQhXh0iKEcIcRRMFgBXXUoPCQwdAToWAgBbU1QIBldUVVE4Wkk=",
    "brand": "free",
    "device": "iphone",
    "country": "cn",
    "genre": "5000"
}

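# Compile the extracted JavaScript; PyExecJS needs a JS runtime such as Node.js.
with open('qimai.js', 'r', encoding='utf-8') as js_file:
    js = execjs.compile(js_file.read())

url_info = furl.furl(url)
generate_analysis_par = {
    # Python f-strings use {name}, not ${name}
    'baseURL': f'{url_info.scheme}://{url_info.host}',
    'url': str(url_info.path),
    'params': params
}
# Compute analysis from the other query parameters before adding it to params.
params['analysis'] = js.call('main', generate_analysis_par)
print(params)
response = requests.get(url, headers=headers, cookies=cookies, params=params)
print(response.json())
print(response)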