重要SQL记录

同环比
select '日环比' as datekey,
       if(b.total_poi_num=0,0,round(cast(a.total_poi_num as double)/b.total_poi_num-1,4)) as total_poi_num
  from (
            select datekey,
                   'all' as biz_site,
                   total_poi_num
       ) a
left join (
            select datekey,
                  'all' as  biz_site,
                   total_poi_num
          ) b on a.biz_site = b.biz_site
UNION ALL
select '周同比' as datekey,
       if(b.total_poi_num=0,0,round(cast(a.total_poi_num as double)/b.total_poi_num-1,4)) as total_poi_num,

  from (
            select datekey,
                   'all' as  biz_site,
                   total_poi_num as total_poi_num, 

       ) a
left join (
            select datekey,
                   'all' as  biz_site,
                   total_poi_num as total_poi_num, 

          ) b on a.biz_site = b.biz_site
UNION ALL
select datekey2date('$$yesterday_compact ') as datekey,
       sum(total_poi_num) as total_poi_num,

主要是’all’ as biz_site 这步很经典

行名称转为列名称
select sum(st.name1) name1, sum(st.name4) name4 
from(
select case when user_name = 'name1' then age end name1,
case when user_name = 'name4' then age end name4 
from student_info) as st;

主要是将原来的列里面的名称变成了行名称,其实也就是case when user_name = ‘name1’ then age这步

count(distinct)优化
SELECT day,

COUNT(DISTINCT id) AS uv

FROM user_table

GROUP BY day

`可以转换成:`

SELECT day,

COUNT(id) AS uv

FROM (SELECT day,id FROM user_table GROUP BY day,id) a

GROUP BY day;

数据量小的时候无所谓,数据量大的情况下,由于COUNT DISTINCT操作需要用一个Reduce Task来完成,这一个Reduce需要处理的数据量太大,就会导致整个Job很难完成,一般COUNT DISTINCT使用先GROUP BY再COUNT的方式替换:
虽然会多用一个Job来完成,但在数据量大的情况下,这个绝对是值得的。

多重COUNT(DISTINCT)优化
SELECT day,
       COUNT(DISTINCT id) AS id_count,
       COUNT(DISTINCT name) AS name_count
  FROM user_table
 GROUP BY day

`可以转换成:`
SELECT day,
       SUM(CASE WHEN type='id_count' THEN 1 ELSE 0 END) as id_count,
       SUM(CASE WHEN type='name_count' THEN 1 ELSE 0 END) as name_count
  FROM (
        SELECT day,
               id,
               'id_count' AS type
          FROM user_table
         GROUP BY day,
                  id
     UNION ALL SELECT day,
               name,
               'name_count' AS type
          FROM user_table
         GROUP BY day,
                  name
       ) a
 GROUP BY day;

type的取值个数和原语句中有几个count(distinct)是一致的,和id、name有多少种取值没关系。

— 未完待续

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值