Capturing WeChat Official Account Data with Fiddler

 

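The main purpose of this post is to record the whole process of using the Fiddler packet-capture tool to capture an official account's request data and then parse what was captured.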

 

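Preparation:

Download: Fiddler_5.0.20173.49666_Setup.exe

Official site: https://www.telerik.com/download/fiddler

1. Install Fiddler_5.0.20173.49666_Setup.exe. Installation is straightforward; once opened, Fiddler looks like the figure below:

[Figure: Fiddler main window]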

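2. Generate the certificate file FiddlerRoot.cer

     In the menu bar, choose [Tools] -> [Options] -> [HTTPS] and tick the options shown in the figure below (to decrypt HTTPS traffic these normally include Capture HTTPS CONNECTs and Decrypt HTTPS traffic)

     [Figure: HTTPS options tab]

     Then click [Actions] and export the root certificate to the desktop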

3. Install the certificate manually

   The Fiddler directory contains makecert.exe. Create myTest.bat in that directory with the following content:

makecert.exe -r -ss my -n "CN=DO_NOT_TRUST_FiddlerRoot, O=DO_NOT_TRUST, OU=Created by http://www.fiddler2.com" -sky signature -eku 1.3.6.1.5.5.7.3.1 -h 1 -cy authority -a sha1 -m 120 -b 09/05/2012

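4. Capture the official account data I want

  a. Principle: Fiddler provides one hook that runs before each request is sent and another that runs before each response is returned to the client:

OnBeforeRequest(), OnBeforeResponse()

  b. Configure the capture rules

     Choose [Rules] -> [Customize Rules], then restart Fiddler to reach the interface shown in the figure

     [Figure: Customize Rules script editor]

     Add the following inside OnBeforeRequest()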

if (oSession.fullUrl.Contains("mp.weixin.qq.com"))
{
    var fso;
    var file;
    fso = new ActiveXObject("Scripting.FileSystemObject");
    // Output path; change as needed
    // 8 = append, true = create if missing, true = write as Unicode
    file = fso.OpenTextFile("c:\\Sessions.txt", 8, true, true);
    file.writeLine("Request url: " + oSession.url);
    file.writeLine("Request header:" + "\n" + oSession.oRequest.headers);
    file.writeLine("Request body: " + oSession.GetRequestBodyAsString());
    file.writeLine("\n");
    file.close();
}

     Add the following inside OnBeforeResponse()

if (oSession.fullUrl.Contains("weixin/searchShiFu.php"))
{
    // Decode the response first so the saved body is not garbled
    oSession.utilDecodeResponse();
    var fso;
    var file;
    fso = new ActiveXObject("Scripting.FileSystemObject");
    // Output path; change as needed
    file = fso.OpenTextFile("d:\\Response.txt", 8, true, true);
    //file.writeLine("Response code: " + oSession.responseCode);
    file.writeLine("Response body: " + oSession.GetResponseBodyAsString());
    file.writeLine("\n");
    file.close();
}

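   Save and exit, then restart Fiddler so the rules take effect.

5. Parse the captured content

   a. The captured responses are written to d:\Response.txt, whose content looks like this: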

Response body: {"nickname":"秦人","totalTimes":33,"todayTimes":2,"total":598,"thisNum":8,"yuanjin":1,"oneSF":[{"juli":"2.1 公里","name":"马帅军","phone":"15529016011","address":"陕西西安未央区建章路","longitude_S":"108.848384","latitude_S":"34.318548","jianjie":"工龄4年。     施工20~25元/卷","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"12527","weixin":"0","headimgurl":""},{"juli":"2.4 公里","name":"张帅","phone":"13571547952","address":"陕西西安施工范围。全。","longitude_S":"108.893215","latitude_S":"34.332735","jianjie":"无妨壁纸25元一卷。长纤,蚕丝等30元一卷。壁画20元一平。壁布10元一平。","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"14235","weixin":"1","headimgurl":"http://wx.qlogo.cn/mmopen/PiajxSqBRaEJkMDHthV4HGnCWtEk7TCTvDOQUId5uvHaOZkzxN8nRJv8C7YicFia8KibNhvyjW0NL6WiboPhw6X6VqA/64"},{"juli":"2.9 公里","name":"黄师傅","phone":"13289380958","address":"西安西安市","longitude_S":"108.894098","latitude_S":"34.305800","jianjie":"工龄:6年,本人从事墙纸粘贴行业已有6年,积累了丰富的墙纸施工方面的经验和技术能对各种高中","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"2248","weixin":"0","headimgurl":""},{"juli":"3 公里","name":"尚俊","phone":"18802920027","address":"陕西西安未央区汉城街办西查村","longitude_S":"108.900890","latitude_S":"34.331328","jianjie":"6年工龄,无纺纸20其他25。团队6人,随时准备24小时为您服务。","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"13814","weixin":"0","headimgurl":""},{"juli":"3.2 公里","name":"刘小虎","phone":"18710629117","address":"陕西咸阳武功县普集镇令新村","longitude_S":"108.904032","latitude_S":"34.316132","jianjie":"","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"11609","weixin":"0","headimgurl":""},{"juli":"3.6 公里","name":"张跃武","phone":"13772014639","address":"陕西西安莲湖区莲湖区邓家村小学","longitude_S":"108.880513","latitude_S":"34.292046","jianjie":"贴了6年壁纸,工费是根据纸的材料而定,合作搭档两人,","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"5794","weixin":"0","headimgurl":""},{"juli":"3.6 
公里","name":"小何","phone":"15291480050","address":"陕西西安凤城四路","longitude_S":"108.909152","latitude_S":"34.324399","jianjie":"","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"13726","weixin":"0","headimgurl":""},{"juli":"3.6 公里","name":"魏师傅","phone":"15291814440","address":"西安西安市","longitude_S":"108.904098","latitude_S":"34.306800","jianjie":"工龄:7年,专业","s1":"","s2":"","s3":"","sa":"","p1":"","p2":"","p3":"","pa":"","uid":"2241","weixin":"0","headimgurl":""}]}

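   My goal is to extract the name, phone and address fields from the response data above.

   Note: the generated Response.txt uses the UCS-2 little-endian character set by default; the corresponding charset name in Java is UTF-16LE.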
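
   Since the capture file is UTF-16LE and every record begins with the text "Response body: ", decoding and prefix handling can be looked at in isolation. The following is a minimal sketch that uses only the Java standard library. The class name CaptureDecoder, its method names and the sample string are assumptions made for illustration and are not part of the original tool. It also drops a leading byte-order mark, on the assumption that the file may start with one.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

/** Sketch (hypothetical helper): decode UTF-16LE capture data and strip the record prefix. */
final class CaptureDecoder {
    // Prefix written by the OnBeforeResponse rule in front of each JSON body
    static final String PREFIX = "Response body: ";

    /** Decodes raw bytes as UTF-16LE and drops a leading byte-order mark if present. */
    static String decode(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_16LE);
        // Java's UTF-16LE decoder keeps a BOM as the character U+FEFF
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    /** Returns the JSON payload of every line that starts with the prefix. */
    static List<String> payloads(String text) {
        List<String> result = new ArrayList<>();
        for (String line : text.split("\\R")) {
            if (line.startsWith(PREFIX)) {
                result.add(line.substring(PREFIX.length()));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the contents of Response.txt
        String sample = "\uFEFFResponse body: {\"total\":1}\r\n\r\n";
        byte[] raw = sample.getBytes(StandardCharsets.UTF_16LE);
        System.out.println(payloads(decode(raw)));
    }
}
```

   Matching on the prefix text rather than on a fixed character count keeps the sketch working whether or not a byte-order mark is present; the payload strings can then be handed to any JSON parser.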

 

   b. Parse Response.txt and write the name, phone and address fields to D:\Handle_Response.txt

     I implemented the parsing in Java; the code is as follows:

package com.wang.readText;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

import com.alibaba.fastjson.JSONArray;
import com.alibaba.fastjson.JSONObject;

// UserInfo is a plain bean with name, phone and address properties (not shown in this post).
public class ReadTextUtils {

    private static final List<UserInfo> resultList = new ArrayList<>();
    private static final String SRC_PATH = "D:/Response.txt";
    private static final String OUT_PATH = "D:/Handle_Response.txt";
    // Prefix that the OnBeforeResponse rule writes in front of each JSON body
    private static final String PREFIX = "Response body: ";

    public static void main(String[] args) {
        // Use the default paths when no command-line arguments are given
        String srcPath = args.length > 0 ? args[0] : SRC_PATH;
        String outPath = args.length > 1 ? args[1] : OUT_PATH;
        readTxtContent(srcPath);
        writeTxtContent(outPath);
    }

    /* Read the data */
    public static void readTxtContent(String srcPath) {
        try (BufferedReader br = new BufferedReader(
                new InputStreamReader(new FileInputStream(srcPath), StandardCharsets.UTF_16LE))) {
            String lineTxt;
            while ((lineTxt = br.readLine()) != null) {
                // Skip lines without the prefix; indexOf also tolerates a leading byte-order mark
                int start = lineTxt.indexOf(PREFIX);
                if (start < 0) {
                    continue;
                }
                JSONObject object = JSONObject.parseObject(lineTxt.substring(start + PREFIX.length()));
                Object oneSF = object.get("oneSF");
                // Only process oneSF when it is an array (the original check skipped the value 0)
                if (oneSF instanceof JSONArray) {
                    List<UserInfo> userList = JSONArray.parseArray(((JSONArray) oneSF).toJSONString(), UserInfo.class);
                    resultList.addAll(userList);
                    System.out.println("--read line data count---" + userList.size());
                }
            }
        } catch (Exception e) {
            System.err.println("read errors :" + e);
        }
    }

    /* Remove duplicates, keyed by phone number */
    public static List<UserInfo> quchongfu() {
        // A later record with the same phone replaces the earlier one; LinkedHashMap keeps a stable order
        Map<String, UserInfo> userMap = new LinkedHashMap<>();
        for (UserInfo userInfo : resultList) {
            userMap.put(userInfo.getPhone(), userInfo);
        }
        return new ArrayList<>(userMap.values());
    }

    /* Write the output */
    public static void writeTxtContent(String outPath) {
        try (BufferedWriter bw = new BufferedWriter(
                new OutputStreamWriter(new FileOutputStream(outPath), StandardCharsets.UTF_8))) {
            bw.write("name\t\t    phone\t\t\t    address");
            bw.newLine();
            for (UserInfo userInfo : quchongfu()) {
                bw.write(userInfo.getName() + "\t\t " + userInfo.getPhone() + "\t\t " + userInfo.getAddress());
                bw.newLine();
            }
        } catch (Exception e) {
            System.err.println("write errors :" + e);
        }
    }
}
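
   ReadTextUtils depends on a UserInfo class that this post does not show. The following is a minimal sketch of what such a bean could look like, together with a standalone version of the deduplication by phone number. The field set (name, phone, address) is taken from the JSON keys above; the class DedupSketch, the helper user(...) and the sample values are hypothetical, and the use of LinkedHashMap is a choice of this sketch. In a real project UserInfo would normally be a public class in its own file in the com.wang.readText package.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch (hypothetical): deduplicate records by phone number. */
final class DedupSketch {

    /** Convenience factory used only by this sketch. */
    static UserInfo user(String name, String phone, String address) {
        UserInfo u = new UserInfo();
        u.setName(name);
        u.setPhone(phone);
        u.setAddress(address);
        return u;
    }

    /** Keeps one entry per phone; a later entry replaces an earlier one in the same position. */
    static List<UserInfo> dedupByPhone(List<UserInfo> users) {
        Map<String, UserInfo> byPhone = new LinkedHashMap<>();
        for (UserInfo u : users) {
            byPhone.put(u.getPhone(), u);
        }
        return new ArrayList<>(byPhone.values());
    }

    public static void main(String[] args) {
        // Made-up records; the first and third share a phone number
        List<UserInfo> users = Arrays.asList(
                user("Worker A", "000-0001", "Address 1"),
                user("Worker B", "000-0002", "Address 2"),
                user("Worker A (updated)", "000-0001", "Address 3"));
        for (UserInfo u : dedupByPhone(users)) {
            System.out.println(u.getName() + "\t" + u.getPhone() + "\t" + u.getAddress());
        }
    }
}

/** Assumed shape of the bean: getters and setters matching the JSON keys name, phone, address. */
class UserInfo {
    private String name;
    private String phone;
    private String address;

    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
    public String getPhone() { return phone; }
    public void setPhone(String phone) { this.phone = phone; }
    public String getAddress() { return address; }
    public void setAddress(String address) { this.address = address; }
}
```

   Keying on the phone number means two records with the same phone but different names collapse into one, which matches the intent of quchongfu() above.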

   The output text looks like the figure below:

   [Figure: contents of Handle_Response.txt]

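6. One-click data processing:

     a. Package the ReadTextUtils class as an executable jar;

     b. Write a simple runReadTxt.bat with the following content:

@echo off
echo -----------read infos-----------
java -jar "%cd%\readTxt.jar" "%cd%\Response.txt" "%cd%\Handle_Response.txt"
echo ---------------finish!!!-----------------------------
PAUSE

     c. The one-click tool is now complete. Note: running the jar requires a Java runtime environment, and the script expects readTxt.jar and Response.txt to be in the directory it is run from.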

    

     I hope this helps; corrections are welcome!

     Reference article: http://blog.youkuaiyun.com/lhorse003/article/details/72473212

 

Recommended

I have set up a logoly environment of my own; everyone is welcome to try it.

http://www.mhtclub.com/logoDesign/

Friends are also welcome to visit my blog!

http://www.mhtclub.com
