计算连续登录天数
1.首先查询所有用户登录天数,首先去重
Select user_id, date(login_date) as days
from tmp_login group by user_id, date(login_date);
2.row_number 计算登录时间排序
select user_id,days,row_number () over (partition by user_id order by days) as rn from(
Select user_id, date(login_date) as days
from tmp_login group by user_id, date(login_date)) t1;
3.用登录时间作差,如果得到的日期相同,那么认为是连续登录日期
select *,date_sub(days,interval rn days) as results
from
(
select user_id,days,row_number () over (partition by user_id order by days) as rn from(
Select user_id, date(login_date) as days
from tmp_login group by user_id, date(login_date))
) t2;
4.按作差结果日期分,看相同日期的个数就是连续的
select user_id, count(**) as num_days
from
(
select *, date_sub(days, interval rn day) as results
from(
select user_id, days, row_number() over(partition by user_id order by days) as rn
from (
select distinct user_id, date(login_date) as days from tmp_login) t1
) t2) t3
group by user_id,results;
2.如果要查找连续登录超过n天的用户,n为确定值,那么可以排序,看lead3 天的日期与它是否相等
select user_id,datediff(lead(days,3) over(partition by user order by days) as result from (select distinct user_id,date(login_date) as dyas from tmp_login) t1
3.找出为3天
Result=3
计算留存率
首先建立一张包含每个用户最早登入日期的表
用min()函数与 gropu by
接下来,对用户和日期去重
之后将两张表按照user_id拼接,并且计算日期差
day_diff=1/date_diff=0的为次日
三日 day_diff/date_diff=0
select first_date as dt,
concat(round(100*count(case when day_diff=1 then user_id else 0end)/count(case when day_diff=0 then user_id else 0 end),2)‘%’ ) )
(select user_name,min(date) as first_Day from userbehavior group by user_id) as t1
left join
(select user_id,date from userbehavior group by user_id, date) as t2
on t1.user_id=t2.user_id) as t3
group by first_day
order by first_Day