**
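The principle is built up step by step, as in the original figure.

Step 1 (t1): sum carbon_g per user per day, keep only the days whose total is at least 100, and number each user's remaining days in date order.

    SELECT username, time, SUM(carbon_g) c2,
           ROW_NUMBER() OVER (PARTITION BY username ORDER BY time) rn
    FROM test191128.user_low_carbon
    GROUP BY username, time
    HAVING c2 >= 100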
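To check step t1 outside the database, the aggregation and numbering can be sketched in plain Python. This is only an illustration under assumptions: the function name `qualifying_days` and its `threshold` parameter are hypothetical, and days are passed as `datetime.date` objects so that sorting is chronological. The input rows are u_002's first three days from the sample data given with the problem statement.

```python
from collections import defaultdict
from datetime import date


def qualifying_days(rows, threshold=100):
    """Sketch of step t1: GROUP BY (username, day), HAVING total >= threshold,
    then ROW_NUMBER() per user ordered by day."""
    totals = defaultdict(int)
    for username, day, carbon_g in rows:
        totals[(username, day)] += carbon_g

    result = []
    row_number = defaultdict(int)
    # Sorting the (username, day) keys visits each user's days in date order.
    for username, day in sorted(totals):
        if totals[(username, day)] >= threshold:
            row_number[username] += 1
            result.append((username, day, totals[(username, day)], row_number[username]))
    return result


# u_002's first three days from the sample data.
sample = [
    ("u_002", date(2017, 1, 1), 10),
    ("u_002", date(2017, 1, 2), 150),
    ("u_002", date(2017, 1, 2), 70),
    ("u_002", date(2017, 1, 3), 30),
    ("u_002", date(2017, 1, 3), 80),
]
t1 = qualifying_days(sample)
```

Here 2017/1/1 totals only 10 and is dropped, so 2017/1/2 (total 220) receives rn 1 and 2017/1/3 (total 110) receives rn 2.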
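Step 2 (t2): rewrite time with dashes instead of slashes and subtract the row number rn from it. Within a run of consecutive days both the date and rn grow by one from row to row, so the difference ds stays the same for the whole run.

    SELECT username, time, c2,
           date_sub(regexp_replace(time, '/', '-'), rn) ds
    FROM t1

With t1 written out as a subquery:

    SELECT username, time, c2,
           date_sub(regexp_replace(time, '/', '-'), rn) ds
    FROM (
      SELECT username, time, SUM(carbon_g) c2,
             ROW_NUMBER() OVER (PARTITION BY username ORDER BY time) rn
      FROM test191128.user_low_carbon
      GROUP BY username, time
      HAVING c2 >= 100
    ) t1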
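The date-minus-row-number trick in step t2 is easy to verify by hand. The following Python sketch (the helper name `with_group_key` is hypothetical) applies it to the days on which each sample user reaches a total of at least 100 g, according to the sample data given with the problem statement.

```python
from datetime import date, timedelta


def with_group_key(days):
    """Pair each day with its row number rn and with ds = day - rn days."""
    keyed = []
    for rn, day in enumerate(sorted(days), start=1):
        keyed.append((day, rn, day - timedelta(days=rn)))
    return keyed


# Days on which each sample user totals at least 100 g.
u_001_days = [date(2017, 1, 2), date(2017, 1, 6)]
u_002_days = [date(2017, 1, 2), date(2017, 1, 3), date(2017, 1, 4), date(2017, 1, 5)]

u_001_keys = [ds for _, _, ds in with_group_key(u_001_days)]
u_002_keys = [ds for _, _, ds in with_group_key(u_002_days)]
```

All four of u_002's days map to 2017-01-01, which marks them as one run, while u_001's two days map to 2017-01-01 and 2017-01-04 and therefore belong to different runs.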
Step 3 (t4): for every row, count the rows that share its username and ds. The frame spans UNBOUNDED PRECEDING to UNBOUNDED FOLLOWING, so each row receives the size cn of its whole group, which is the length of the run of consecutive days it belongs to.

    SELECT username, time,
           COUNT(ds) OVER (PARTITION BY username, ds ORDER BY time
                           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) cn
    FROM t2

With t2 written out as a subquery:

    SELECT username, time,
           COUNT(ds) OVER (PARTITION BY username, ds ORDER BY time
                           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) cn
    FROM (
      SELECT username, time, c2,
             date_sub(regexp_replace(time, '/', '-'), rn) ds
      FROM (
        SELECT username, time, SUM(carbon_g) c2,
               ROW_NUMBER() OVER (PARTITION BY username ORDER BY time) rn
        FROM test191128.user_low_carbon
        GROUP BY username, time
        HAVING c2 >= 100
      ) t1
    ) t2
Step 4 (t5): keep only the rows whose run is at least three days long.

    SELECT * FROM t4 WHERE cn >= 3

With t4 written out as a subquery:

    SELECT * FROM (
      SELECT username, time,
             COUNT(ds) OVER (PARTITION BY username, ds ORDER BY time
                             ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) cn
      FROM (
        SELECT username, time, c2,
               date_sub(regexp_replace(time, '/', '-'), rn) ds
        FROM (
          SELECT username, time, SUM(carbon_g) c2,
                 ROW_NUMBER() OVER (PARTITION BY username ORDER BY time) rn
          FROM test191128.user_low_carbon
          GROUP BY username, time
          HAVING c2 >= 100
        ) t1
      ) t2
    ) t4
    WHERE cn >= 3
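The counting and filtering in steps t4 and t5 amount to measuring the size of each (username, ds) group and discarding the small ones. A minimal Python sketch follows; the function name `keep_long_runs` and the parameter `min_days` are hypothetical, and the input rows are the (username, day, ds) values that step t2 yields for the sample data.

```python
from collections import Counter
from datetime import date


def keep_long_runs(rows, min_days=3):
    """rows are (username, day, ds) tuples as produced by step t2.

    Returns (username, day, cn) for rows whose (username, ds) group
    contains at least min_days rows."""
    sizes = Counter((username, ds) for username, _, ds in rows)
    return [
        (username, day, sizes[(username, ds)])
        for username, day, ds in rows
        if sizes[(username, ds)] >= min_days
    ]


t2 = [
    ("u_001", date(2017, 1, 2), date(2017, 1, 1)),
    ("u_001", date(2017, 1, 6), date(2017, 1, 4)),
    ("u_002", date(2017, 1, 2), date(2017, 1, 1)),
    ("u_002", date(2017, 1, 3), date(2017, 1, 1)),
    ("u_002", date(2017, 1, 4), date(2017, 1, 1)),
    ("u_002", date(2017, 1, 5), date(2017, 1, 1)),
]
t5 = keep_long_runs(t2)
```

Note that u_001's first row has the same ds as u_002's rows. Grouping by username as well as ds, as the PARTITION BY clause does, keeps the two users' runs apart, so only u_002's four days survive with cn equal to 4.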
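Step 5: join t5 back to the original table on username and time, so that the raw records for the qualifying days are returned.

    SELECT t6.*
    FROM test191128.user_low_carbon t6
    JOIN t5 ON t6.username = t5.username AND t6.time = t5.time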
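Final SQL, with every step inlined:

    SELECT t6.*
    FROM test191128.user_low_carbon t6
    JOIN (
      SELECT * FROM (
        SELECT username, time,
               COUNT(ds) OVER (PARTITION BY username, ds ORDER BY time
                               ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) cn
        FROM (
          SELECT username, time, c2,
                 date_sub(regexp_replace(time, '/', '-'), rn) ds
          FROM (
            SELECT username, time, SUM(carbon_g) c2,
                   ROW_NUMBER() OVER (PARTITION BY username ORDER BY time) rn
            FROM test191128.user_low_carbon
            GROUP BY username, time
            HAVING c2 >= 100
          ) t1
        ) t2
      ) t4
      WHERE cn >= 3
    ) t5
    ON t6.username = t5.username AND t6.time = t5.time

Ant Forest low-carbon user ranking analysis

Problem: query the daily records in the user_low_carbon table that satisfy the following condition: in 2017, the user reduced carbon emissions (low_carbon) by more than 100 g on each day of a run of three or more consecutive days. The query must return the matching records from user_low_carbon.

For example, user u_002 qualifies, because on each of the four consecutive days from 2017/1/2 to 2017/1/5 the day's total is at least 100 g.

Data provided in user_low_carbon (columns named as in the queries: username, time, carbon_g):

| username | time | carbon_g |
|---|---|---|
| u_001 | 2017/1/1 | 10 |
| u_001 | 2017/1/2 | 150 |
| u_001 | 2017/1/2 | 110 |
| u_001 | 2017/1/2 | 10 |
| u_001 | 2017/1/4 | 50 |
| u_001 | 2017/1/4 | 10 |
| u_001 | 2017/1/6 | 45 |
| u_001 | 2017/1/6 | 90 |
| u_002 | 2017/1/1 | 10 |
| u_002 | 2017/1/2 | 150 |
| u_002 | 2017/1/2 | 70 |
| u_002 | 2017/1/3 | 30 |
| u_002 | 2017/1/3 | 80 |
| u_002 | 2017/1/4 | 150 |
| u_002 | 2017/1/5 | 101 |
| u_002 | 2017/1/6 | 68 |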
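As a cross-check of the whole pipeline on the sample data, the same logic can be re-expressed in plain Python. This is a sketch for verification only, not the SQL itself: the names `parse_day` and `low_carbon_streak_records` and the parameters `threshold` and `min_days` are hypothetical, and the `time` strings are parsed into real dates before sorting.

```python
from collections import Counter, defaultdict
from datetime import date, timedelta

ROWS = [
    ("u_001", "2017/1/1", 10), ("u_001", "2017/1/2", 150),
    ("u_001", "2017/1/2", 110), ("u_001", "2017/1/2", 10),
    ("u_001", "2017/1/4", 50), ("u_001", "2017/1/4", 10),
    ("u_001", "2017/1/6", 45), ("u_001", "2017/1/6", 90),
    ("u_002", "2017/1/1", 10), ("u_002", "2017/1/2", 150),
    ("u_002", "2017/1/2", 70), ("u_002", "2017/1/3", 30),
    ("u_002", "2017/1/3", 80), ("u_002", "2017/1/4", 150),
    ("u_002", "2017/1/5", 101), ("u_002", "2017/1/6", 68),
]


def parse_day(text):
    """Turn a 'YYYY/M/D' string into a date."""
    year, month, day = (int(part) for part in text.split("/"))
    return date(year, month, day)


def low_carbon_streak_records(rows, threshold=100, min_days=3):
    # t1: daily totals per user, keeping days at or above the threshold.
    totals = defaultdict(int)
    for username, time, carbon_g in rows:
        totals[(username, parse_day(time))] += carbon_g
    kept = sorted(key for key, total in totals.items() if total >= threshold)

    # t2: ds = day minus the per-user row number.
    rn = defaultdict(int)
    ds_of = {}
    for username, day in kept:
        rn[username] += 1
        ds_of[(username, day)] = day - timedelta(days=rn[username])

    # t4 and t5: size of each (username, ds) group, keep runs of min_days or more.
    sizes = Counter((username, ds) for (username, _), ds in ds_of.items())
    good = {key for key, ds in ds_of.items() if sizes[(key[0], ds)] >= min_days}

    # Final join: return the raw records that fall on qualifying days.
    return [row for row in rows if (row[0], parse_day(row[1])) in good]


result = low_carbon_streak_records(ROWS)
```

On this data the result is the six u_002 records dated 2017/1/2 through 2017/1/5. u_001 reaches 100 g only on 2017/1/2 and 2017/1/6, which are not consecutive, so none of its records are returned.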
**
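This article explains in detail how to use SQL to find the Ant Forest users who reduced carbon emissions by more than 100 g per day on three or more consecutive days, walking through the whole process from filtering the data to producing the final result.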




