某音SQL实战

问题:计算2021年里有播放记录的每个视频的完播率(结果保留三位小数),并按完播率降序排序

各个视频的平均完播率_牛客题霸_牛客网

SELECT
	a.video_id,
	round(sum(if(a.end_time-a.start_time>=b.duration,1,0))/count(a.video_id),3) as wbl
FROM
	tb_user_video_log a
JOIN tb_video_info b ON a.video_id = b.video_id
WHERE	YEAR(a.start_time) = 2021
GROUP BY
	a.video_id
ORDER BY
	wbl DESC;

问题:计算各类视频的平均播放进度,将进度大于60%的类别输出。

平均播放进度大于60%的视频类别_牛客题霸_牛客网

  • 关联用户-视频互动记录和短视频信息表:JOIN tb_video_info USING(video_id);
  • 按视频类别分组:GROUP BY tag
  • 计算每个类别的平均播放进度:
    • 播放进度=播放时长÷视频时长*100%
    • 播放时长=TIMESTAMPDIFF(SECOND, start_time, end_time);特殊情况:播放时长大于视频时长时,播放进度为100%(加个IF判断)
    • 平均进度=AVG(每个进度)
    • 结果保留2位小数:ROUND(x, 2)
    • 百分比格式化:CONCAT(x, '%')
    • 计算公式:

复制代码

1

2

3

4

ROUND(AVG(

    IF(TIMESTAMPDIFF(SECOND, start_time, end_time) > duration, 1,

       TIMESTAMPDIFF(SECOND, start_time, end_time) / duration)

) * 1002) as avg_play_progress

  • 筛选播放进度>60%的视频类别:HAVING avg_play_progress > 60

注意坑:

1.一定要先*100再保留两位小数
2.时间相减直接用减号【a.end_time-a.start_time】有效性差,容易出错,最好用函数,如timestampdif(天(DAY)、小时(HOUR),分钟(MINUTE)和秒(SECOND),早的时间点,晚的时间点)
3.要加一个if 
    if(
        timestampdiff(second, a.start_time, a.end_time) > duration,
        1,
        timestampdiff(second, a.start_time, a.end_time) / duration
    )
    不能直接用 timestampdiff(second, a.start_time, a.end_time) / duration,容易失真

SELECT
  c.tag,
  CONCAT(c.avg_dura, "%")
FROM
  (
    SELECT
      b.tag,
      round(
        avg(
          if(
            timestampdiff(second, a.start_time, a.end_time) > duration,
            1,
            timestampdiff(second, a.start_time, a.end_time) / duration
          )
        ) * 100,
        2
      ) as avg_dura
    FROM
      tb_user_video_log a
      join tb_video_info b on a.video_id = b.video_id
    group by
      b.tag
    HAVING
      avg_dura > 60
    order by
      avg_dura desc
  ) c;

问题:SQL158 每类视频近一个月的转发量/率

每类视频近一个月的转发量/率_牛客题霸_牛客网

问题:统计在有用户互动的最近一个月(按包含当天在内的近30天算,比如10月31日的近30天为10.2~10.31之间的数据)中,每类视频的转发量和转发率(保留3位小数)。

:转发率=转发量÷播放量。结果按转发率降序排序。

分析:

统计在有用户互动的最近一个月:

where DATEDIFF( DATE( ( SELECT MAX(start_time)  FROM  tb_user_video_log)), DATE(a.start_time)) <= 29

每类视频:group by tag 

转发量:SUM(if_retweet)  as retweet_cut

转发率:SUM(if_retweet) /count(*)  as retweet_rate

保留3位小数:round( retweet_rate,3) as retweet_rate_round

结果按转发率降序排序: order by retweet_rate_round desc

SELECT
  b.tag,
  sum(if_retweet) AS retweet_cut,
  round(sum(if_retweet) / count(*), 3) AS retweet_rate
FROM
  tb_user_video_log a
  LEFT JOIN tb_video_info b ON a.video_id = b.video_id
WHERE
  DATEDIFF(
    DATE(
      (
        SELECT
          MAX(start_time)
        FROM
          tb_user_video_log
      )
    ),
    DATE(a.start_time)
  ) <= 29
GROUP BY
  b.tag
ORDER BY
  retweet_rate DESC

问题:SQL159 每个创作者每月的涨粉率及截止当前的总粉丝量

SELECT
    author,
    date_format(start_time,'%Y-%m') month,
    round(sum(case when if_follow=1 then 1
             when if_follow=2 then -1
             else 0 end)/count(author),3) fans_growth_rate,
     sum(sum(case when if_follow=1 then 1
             when if_follow=2 then -1
             else 0 end)) over(partition by author order by date_format(start_time,'%Y-%m')) total_fans
FROM tb_user_video_log log
left join tb_video_info info on log.video_id=info.video_id
where year(start_time)=2021
group by author,month
order by author,total_fans 

SQL 数据分组累加sum() over (partition by ... order by ...)

问题:SQL160 国庆期间每类视频点赞量和转发量

问题:统计2021年国庆头3天每类视频每天的近一周总点赞量和一周内最大单天转发量,结果按视频类别降序、日期升序排序。假设数据库中数据足够多,至少每个类别下国庆头3天及之前一周的每天都有播放记录。

国庆期间每类视频点赞量和转发量_牛客题霸_牛客网

题解:

1.2021年国庆头3天:

where DATE_FORMAT(start_time, "%Y-%m-%d")  BETWEEN '2021-10-01' and '2021-10-03'

 2.每类视频每天的近一周总点赞量:

SUM(sum(if_like)) over (PARTITION BY tag ORDER BY DATE_FORMAT(start_time, "%Y-%m-%d") rows 6 preceding) as sum_like_7day 

3.每类视频每天的近一周内最大转发量:

 MAX(sum(if_retweet)) over (PARTITION BY tag ORDER BY DATE_FORMAT(start_time, "%Y-%m-%d") rows 6 preceding) as max_retweet_7day

  4.每类视频每天

GROUP BY tag, dt

WITH t1 AS (
SELECT
	tag,
	DATE_FORMAT(start_time, '%Y-%m-%d') dt,
	SUM(SUM(if_like))OVER(PARTITION BY tag ORDER BY DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding) AS sum_like_cnt_7d	,
	MAX(SUM(if_retweet))OVER(PARTITION BY tag ORDER BY DATE_FORMAT(start_time,'%Y-%m-%d') rows 6 preceding) AS 	max_retweet_cnt_7d
FROM tb_user_video_log JOIN tb_video_info USING(video_id)
WHERE DATEDIFF('2021-10-03', DATE_FORMAT(start_time, '%Y-%m-%d'))<9
GROUP BY dt, tag
)
SELECT * FROM t1 WHERE dt BETWEEN '2021-10-01' AND '2021-10-03' ORDER BY tag DESC,dt ASC;

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值