在平时的工作中,经常有按照不同维度筛选和统计数据的需求。拿视频会员订单数据来说吧,运营人员要查看深圳市的成功下单数或则深圳市某一种产品的成功下单数或者某一种产品的所有成功下单数时,每天的订单数又很大,现查的话按照不同的维度去查询又很慢。此时本篇文章或许会帮助到你。
group by:主要用来对查询的结果进行分组,相同组合的分组条件在结果集中只显示一行记录。可以添加聚合函数。
grouping sets:对分组集中指定的组表达式的每个子集执行group by,group by A,B grouping sets(A,B)就等价于 group by A union group by B,其中A和B也可以是一个集合,比如group by A,B,C grouping sets((A,B),(A,C))。
rollup:在指定表达式的每个层次级别创建分组集。group by A,B,C with rollup首先会对(A、B、C)进行group by,然后对(A、B)进行group by,然后是(A)进行group by,最后对全表进行group by操作。
cube:为指定表达式集的每个可能组合创建分组集。首先会对(A、B、C)进行group by,然后依次是(A、B),(A、C),(A),(B、C),(B),( C),最后对全表进行group by操作。
数据库中会员订单的数据映射的对象如下:
<span style="color:#000000"><code class="language-java"><span style="color:#c678dd">case</span> <span style="color:#c678dd">class</span> MemberOrderInfo<span style="color:#999999">(</span>area<span style="color:#669900">:</span>String<span style="color:#999999">,</span>memberType<span style="color:#669900">:</span>String<span style="color:#999999">,</span>product<span style="color:#669900">:</span>String<span style="color:#999999">,</span>price<span style="color:#669900">:</span>Int<span style="color:#999999">)</span>
</code></span>
- 1
会员订单表中的数据如下:
<span style="color:#000000"><code class="language-java"> <span style="color:#c678dd">import</span> sqlContext<span style="color:#999999">.</span>implicits<span style="color:#999999">.</span>_
val orders<span style="color:#669900">=</span><span style="color:#61aeee">Seq</span><span style="color:#999999">(</span>
<span style="color:#61aeee">MemberOrderInfo</span><span style="color:#999999">(</span><span style="color:#669900">"深圳"</span><span style="color:#999999">,</span><span style="color:#669900">"钻石会员"</span><span style="color:#999999">,</span><span style="color:#669900">"钻石会员1个月"</span><span style="color:#999999">,</span><span style="color:#98c379">25</span><span style="color:#999999">)</span><span style="color:#999999">,</span>
<span style="color:#61aeee">MemberOrderInfo</span><span style="color:#999999">(</span><span style="color:#669900">"深圳"</span><span style="color:#999999">,</span><span style="color:#669900">"钻石会员"</span><span style="color:#999999">,</span><span style="color:#669900">"钻石会员1个月"</span><span style="color:#999999">,</span><span style="color:#98c379">25</span><span style="color:#999999">)</span><span style="color:#999999">,</span>
<span style="color:#61aeee">MemberOrderInfo</span><span style="color:#999999">(</span><span style="color:#669900">"深圳"</span><span style="color:#999999">,</span><span style="color:#669900">"钻石会员"</span><span style="color:#999999">,</span><span style="color:#669900">"钻石会员3个月"</span><span style="color:#999999">,</span><span style="color:#98c379">70</span><span style="color:#999999">)</span><span style="color:#999999">,</span>
<span style="color:#61aeee">MemberOrderInfo</span><span style="color:#999999">(</span><span style="color:#669900">"深圳"</span

本文介绍SQL中的groupby、groupingsets、rollup和cube的概念及使用方法,通过实例展示如何运用这些工具对视频会员订单数据进行多维度统计分析。
最低0.47元/天 解锁文章
1049

被折叠的 条评论
为什么被折叠?



