【SQL】日期函数、窗口函数的统计运用

题目

牛客网SQL160——国庆期间每类视频点赞量和转发量

题目描述

用户-视频互动表tb_user_video_log

iduidvideo_idstart_timeend_timeif_followif_likeif_retweetcomment_id
110120012021-09-24 10:00:002021-09-24 10:00:20110NULL
210520022021-09-25 11:00:002021-09-25 11:00:30001NULL
310220022021-09-25 11:00:002021-09-25 11:00:30111NULL
410120022021-09-26 11:00:002021-09-26 11:00:30101NULL
510120022021-09-27 11:00:002021-09-27 11:00:30110NULL
610220022021-09-28 11:00:002021-09-28 11:00:30101NULL
710320022021-09-29 11:00:002021-10-02 11:00:30101NULL
810220022021-09-30 11:00:002021-09-30 11:00:30111NULL
910120012021-10-01 10:00:002021-10-01 10:00:20110NULL
1010220012021-10-01 10:00:002021-10-01 10:00:15001NULL
1110320012021-10-01 11:00:502021-10-01 11:01:151101732526
1210620022021-10-02 10:59:052021-10-02 11:00:05201NULL
1310720022021-10-02 10:59:052021-10-02 11:00:05101NULL
1410820022021-10-02 10:59:052021-10-02 11:00:05111NULL
1510920022021-10-03 10:59:052021-10-03 11:00:05010NULL

(uid-用户ID, video_id-视频ID, start_time-开始观看时间, end_time-结束观看时间, if_follow-是否关注, if_like-是否点赞, if_retweet-是否转发, comment_id-评论ID)

短视频信息表tb_video_info

(video_id-视频ID, author-创作者ID, tag-类别标签, duration-视频时长, release_time-发布时间)

问题:统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量,结果按视频类别降序、日期升序排序。假设数据库中数据足够多,至少每个类别下国庆头3天及之前一周的每天都有播放记录。

输出示例:

tagdtsum_like_cnt_7dmax_retweet_cnt_7d
旅游2021-10-0152
旅游2021-10-0253
旅游2021-10-0363

求解思路

题目需要解出国庆节前三天的7天内,我们不能先筛选国庆节前三天这个条件,不然就会丢失7天内的数据。因此正确的思路就是先截取出2021-10-03之前9天(7+3-1=9)时间范围内的数据。此时需要用到 datediff 函数 配合 date_format 函数。

datediff('2021-10-03', date_format(start_time, '%Y-%m-%d'))<9

之后在规定日期期限内计算每日点赞量和转发量

SELECT  
    tag,
    DATE_FORMAT(start_time,'%Y-%m-%d') as dt,
    SUM(if_like) as like_cnt, 
    SUM(if_retweet) as ret_cnt
FROM tb_user_video_log JOIN tb_video_info USING(video_id)
WHERE DATEDIFF('2021-10-03', DATE_FORMAT(start_time,'%Y-%m-%d'))<9
GROUP BY dt, tag;

到此我们会得到2021-09-25至2021-10-03每个类别每个日期的点赞量和转发量的数据。之后我们使用窗口函数来限定7日窗口以进行累计点赞量和最大转发量。注意,这里是6行,即当前行+6=7行。

select
    tag,
    dt,
    sum(like_cnt) over(partition by tag order by dt rows 6 preceding) as sum_like_cnt_7d,
    max(ret_cnt) over(partition by tag order by dt rows 6 preceding) as max_retweet_cnt_7d
from
(
    SELECT  
        tag,
        DATE_FORMAT(start_time,'%Y-%m-%d') as dt,
        SUM(if_like) as like_cnt, 
        SUM(if_retweet) as ret_cnt
    FROM tb_user_video_log JOIN tb_video_info USING(video_id)
    WHERE DATEDIFF('2021-10-03', DATE_FORMAT(start_time,'%Y-%m-%d'))<9
    GROUP BY dt, tag
) t1

之后在补充日期限定、类别日期排序等细节

select
    tag,
    dt,
    sum(like_cnt) over(partition by tag order by dt rows 6 preceding) as sum_like_cnt_7d,
    max(ret_cnt) over(partition by tag order by dt rows 6 preceding) as max_retweet_cnt_7d
from
(
    SELECT  
        tag,
        DATE_FORMAT(start_time,'%Y-%m-%d') as dt,
        SUM(if_like) as like_cnt, 
        SUM(if_retweet) as ret_cnt
    FROM tb_user_video_log JOIN tb_video_info USING(video_id)
    WHERE DATEDIFF('2021-10-03', DATE_FORMAT(start_time,'%Y-%m-%d'))<9
    GROUP BY dt, tag
) t1
where dt between '2021-10-01' and '2021-10-03'
order by tag desc, dt asc

done

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值