Background: our advertiser provided a list of device IDs (IDFA) to target ads at a specific audience. However, none of the ads were shown, so we wanted to know how many device IDs in that list also appear in our traffic requests.
To find the common device IDs, we need all device IDs (IDFA / Google Ad ID) seen in one day of requests.
Method 1: run a MapReduce job on Azkaban. However, it failed.
Method 2: use Hive tables: insert the advertiser's device ID list into a table and join it against the device IDs from the requests.
Method 3: select all distinct device IDs from the request log and write them to a file. This gives about 200 million device IDs in a file of about 6 GB.
Then use a shell command like this:
grep -F -f a.txt b.txt > public_ids.txt
This gives us the common device IDs.
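
One caveat: `grep -F` matches fixed substrings, so an ID that happens to appear inside a longer line would also be counted. The sketch below is only an illustration under assumptions: it assumes each file holds one device ID per line, that a.txt is the advertiser's list and b.txt is the list extracted from the request log (the post does not say which is which), and the sample IDs are made up. It adds `-x` to require whole-line matches, and shows `sort` plus `comm` as an alternative that does not need to hold the pattern list in memory.

```shell
# Work in a scratch directory so the sample files do not clobber real data.
cd "$(mktemp -d)"

# Hypothetical sample data, one device ID per line.
printf 'AAA-1\nBBB-2\nCCC-3\n' > a.txt            # assumed: advertiser's target list
printf 'BBB-2\nXAAA-1X\nCCC-3\nDDD-4\n' > b.txt   # assumed: distinct IDs from the request log

# -F: treat patterns as fixed strings, -x: match whole lines only,
# -f a.txt: read the patterns from a.txt.
grep -F -x -f a.txt b.txt > public_ids.txt

# Alternative: sort both files (deduplicated) and keep lines present in both.
# LC_ALL=C gives a consistent byte-wise sort order for sort and comm.
LC_ALL=C sort -u a.txt > a.sorted
LC_ALL=C sort -u b.txt > b.sorted
LC_ALL=C comm -12 a.sorted b.sorted > public_ids_comm.txt

# Count the common IDs.
wc -l < public_ids.txt
```

With the sample data, the line `XAAA-1X` is not reported, because `-x` rejects partial matches. Without `-x` it would be, since it contains `AAA-1`.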
Reference: http://blog.youkuaiyun.com/autofei/article/details/6579320
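Summary: this post describes how to find the common device IDs by analyzing ad request traffic. It covers three approaches: MapReduce, Hive tables, and selecting all distinct device IDs from the request log. In the end, a shell command did the job.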