Hive Hands-On Case 3: Cascading Sum (accumulate)
Requirement:
Given the following visitor access-count table, t_access_times
Visitor | Month | Visits |
A | 2015-01 | 5 |
A | 2015-01 | 15 |
B | 2015-01 | 5 |
A | 2015-01 | 8 |
B | 2015-01 | 25 |
A | 2015-01 | 5 |
A | 2015-02 | 4 |
A | 2015-02 | 6 |
B | 2015-02 | 10 |
B | 2015-02 | 5 |
…… | …… | …… |
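To try the example, the sample rows above can be loaded into a small table. The following is a minimal sketch: the column names username, month and salary are taken from the query used later in this article (salary holds the visit count), while the column types are assumptions, and INSERT ... VALUES requires Hive 0.14 or later.

```sql
-- Column names follow the query in this article; the types are assumed.
create table if not exists t_access_times (
  username string,  -- visitor
  month    string,  -- 'YYYY-MM', so string comparison follows calendar order
  salary   int      -- visit count (the name is kept from the original query)
);

-- The ten sample rows shown in the table above.
insert into table t_access_times values
  ('A', '2015-01', 5),
  ('A', '2015-01', 15),
  ('B', '2015-01', 5),
  ('A', '2015-01', 8),
  ('B', '2015-01', 25),
  ('A', '2015-01', 5),
  ('A', '2015-02', 4),
  ('A', '2015-02', 6),
  ('B', '2015-02', 10),
  ('B', '2015-02', 5);
```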
Required output report: t_access_times_accumulate
Visitor | Month | Monthly total | Cumulative total |
A | 2015-01 | 33 | 33 |
A | 2015-02 | 10 | 43 |
……. | ……. | ……. | ……. |
B | 2015-01 | 30 | 30 |
B | 2015-02 | 15 | 45 |
……. | ……. | ……. | ……. |
Implementation
A single HQL statement is enough:
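-- Note: the salary column holds the visit count.
select A.username,
       A.month,
       max(A.salary) as salary,      -- monthly total (the same value on every joined row)
       sum(B.salary) as accumulate   -- sum of this month and all earlier months
from (
    -- A: monthly total per visitor
    select username, month, sum(salary) as salary
    from t_access_times
    group by username, month
) A
inner join (
    -- B: the same monthly totals, joined back to A for the same visitor
    select username, month, sum(salary) as salary
    from t_access_times
    group by username, month
) B
on A.username = B.username
where B.month <= A.month             -- keep only B months up to and including A's month
group by A.username, A.month
order by A.username, A.month;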
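As an alternative to the self-join above, the same running total can be sketched with a window function. This is not the method used in this article, only a different technique for comparison; it assumes a Hive version with windowing support (0.11 or later) and the same username, month and salary columns.

```sql
-- Aggregate to one row per visitor and month, then take a running sum
-- over each visitor's months in calendar order.
select username,
       month,
       salary,
       sum(salary) over (
           partition by username
           order by month
           rows between unbounded preceding and current row
       ) as accumulate
from (
    select username, month, sum(salary) as salary
    from t_access_times
    group by username, month
) m
order by username, month;
```

On the ten sample rows this yields the same four report rows as the self-join (A: 33/33 and 10/43, B: 30/30 and 15/45). The window version reads each monthly row once, whereas the self-join pairs every month with all earlier months of the same visitor.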
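This article shows how to compute a cascading (cumulative) sum in Hive SQL, using a small example that produces each visitor's monthly visit total and running total. The query first aggregates the table by visitor and month, then self-joins that monthly result so that each month picks up all earlier months of the same visitor.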