Cleaning event-tracking data with Spark SQL (Java version)

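This article shows how to use Spark SQL to clean event-tracking data, specifically data in JSON format. A Java program extracts the key metrics and stores them in a MySQL database. The author had not used Spark SQL for a long time, but found it still powerful and comprehensive.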


Data format (one JSON object per line):

{"actionTimes":"2018-11-25","actions":"搜索","bb":"v1.0","fromType":"Chrome/73.0.3683.75","fromURL":"https://www.nyist.com/s?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"120.50.10.233","requestMethod":"GET","sessionId":"0a1c7b51-4434-4ea0-ada9-d45ba788541c","title":"关键340洞察力","user_id":"6252"}
{"actionTimes":"2018-5-20","actions":"下单","bb":"v1.0","fromType":"Mozilla/5.0 (Windows NT 6.1; Win64; x64)","fromURL":"https://tv.qq.com/channel/child?listpage=1&channel=children&itype=3","ip":"181.94.33.139","requestMethod":"POST","sessionId":"79e33d3f-77e3-4120-ae75-c650e94e22f3","title":"关键273洞察力","user_id":"6106"}
{"actionTimes":"2018-7-21","actions":"下单","bb":"v1.0","fromType":"Mozilla/5.0 (Windows NT 6.1; Win64; x64)","fromURL":"https://www.mail.com/int?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"181.94.33.139","requestMethod":"GET","sessionId":"6bb2b015-36e7-4b7e-98c9-bc83365ea2cc","title":"关键504洞察力","user_id":"6918"}
{"actionTimes":"2018-9-27","actions":"登录","bb":"v1.0","fromType":"Chrome/73.0.3683.75","fromURL":"https://www.phone.com/int?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"115.44.31.64","requestMethod":"POST","sessionId":"8fff502c-fa29-4ce3-811b-e4eb944ef62e","title":"关键473洞察力","user_id":"3190"}
{"actionTimes":"2018-7-26","actions":"搜索","bb":"v1.0","fromType":"Chrome/73.0.3683.75","fromURL":"https://www.baidu.com/s?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"207.12.85.193","requestMethod":"POST","sessionId":"0a28e4aa-8d4a-4335-abad-9ff51f08f7fe","title":"关键371洞察力","user_id":"4569"}
{"actionTimes":"2018-5-5","actions":"浏览评论","bb":"v1.0","fromType":"IE/537.36 (KHTML, like Gecko) ","fromURL":"https://www.nyist.com/s?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"10.7.87.67","requestMethod":"POST","sessionId":"65af3572-d48b-4e49-aaf4-97f322d6ac10","title":"关键117洞察力","user_id":"212"}
{"actionTimes":"2018-0-19","actions":"搜索","bb":"v1.0","fromType":"Mozilla/5.0 (Windows NT 6.1; Win64; x64)","fromURL":"https://www.phone.com/int?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"181.94.33.139","requestMethod":"POST","sessionId":"d8c6e3fc-d557-4c26-855b-3053ead690d6","title":"关键34洞察力","user_id":"2367"}
{"actionTimes":"2018-5-27","actions":"下单","bb":"v1.0","fromType":"Chrome/73.0.3683.75","fromURL":"https://www.mail.com/int?wd=ip%E5%9C%B0%E5%9D%80&rsv_spt=1","ip":"241.21.27.237","requestMethod":"GET","sessionId":"8214cc26-d22f-4670-9d63-95ce87814c9a","title":"关键266洞察力","user_id":"5698"}
{"actionTimes":"2018-11-0",
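Several of the sample records carry `actionTimes` values that are not real calendar dates (for example `2018-0-19` and `2018-11-0`), and the valid ones are not zero-padded (`2018-5-5`). The source does not show how the author handles these, so the following is only a minimal sketch of one possible cleaning step in plain Java. The class name `ActionTimeCleaner`, the method `normalize`, and the decision to reject impossible dates rather than repair them are all assumptions made for illustration.

```java
import java.time.DateTimeException;
import java.time.LocalDate;
import java.util.Optional;

// Hypothetical helper, not taken from the original article.
public class ActionTimeCleaner {

    // Parses a non-padded "yyyy-M-d" string and returns it as ISO "yyyy-MM-dd".
    // Returns Optional.empty() when the value is not a real calendar date.
    public static Optional<String> normalize(String raw) {
        if (raw == null) {
            return Optional.empty();
        }
        // Limit -1 keeps trailing empty parts, so "2018-5-5-" is rejected.
        String[] parts = raw.trim().split("-", -1);
        if (parts.length != 3) {
            return Optional.empty();
        }
        try {
            int year = Integer.parseInt(parts[0]);
            int month = Integer.parseInt(parts[1]);
            int day = Integer.parseInt(parts[2]);
            // LocalDate.of throws DateTimeException for month 0, day 0, Feb 30, and so on.
            return Optional.of(LocalDate.of(year, month, day).toString());
        } catch (NumberFormatException | DateTimeException e) {
            // Assumption: impossible dates are dropped instead of being repaired.
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] samples = {"2018-11-25", "2018-5-5", "2018-0-19", "2018-11-0"};
        for (String s : samples) {
            System.out.println(s + " -> " + normalize(s).orElse("(invalid)"));
        }
    }
}
```

With the four sample values above, the first two normalize to `2018-11-25` and `2018-05-05`, and the last two are reported as invalid. In a Spark SQL job, logic like this could be registered as a UDF or replaced by a filter on the parsed date; which approach the original program uses is not shown in the source.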