题目
牛客网——SQL179:各城市最大同时等车人数
题目描述
问题:统计各个城市在2021年10月期间,单日中最大的同时等车人数。
数据:
用户打车记录表tb_get_car_record
id | uid | city | event_time | end_time | order_id |
1 | 108 | 北京 | 2021-10-20 08:00:00 | 2021-10-20 08:00:40 | 9008 |
2 | 118 | 北京 | 2021-10-20 08:00:10 | 2021-10-20 08:00:45 | 9018 |
3 | 102 | 北京 | 2021-10-20 08:00:30 | 2021-10-20 08:00:50 | 9002 |
4 | 106 | 北京 | 2021-10-20 08:05:41 | 2021-10-20 08:06:00 | 9006 |
5 | 103 | 北京 | 2021-10-20 08:05:50 | 2021-10-20 08:07:10 | 9003 |
6 | 104 | 北京 | 2021-10-20 08:01:01 | 2021-10-20 08:01:20 | 9004 |
7 | 105 | 北京 | 2021-10-20 08:01:15 | 2021-10-20 08:01:30 | 9019 |
8 | 101 | 北京 | 2021-10-20 08:28:10 | 2021-10-20 08:30:00 | 9011 |
打车订单表tb_get_car_order
id | order_id | uid | driver_id | order_time | start_time | finish_time | mileage | fare | grade |
1 | 9008 | 108 | 204 | 2021-10-20 08:00:40 | 2021-10-20 08:03:00 | 2021-10-20 08:31:00 | 13.2 | 38 | 4 |
2 | 9018 | 108 | 214 | 2021-10-20 08:00:45 | 2021-10-20 08:04:50 | 2021-10-20 08:21:00 | 14 | 38 | 5 |
3 | 9002 | 102 | 202 | 2021-10-20 08:00:50 | 2021-10-20 08:06:00 | 2021-10-20 08:31:00 | 10 | 41.5 | 5 |
4 | 9006 | 106 | 206 | 2021-10-20 08:06:00 | 2021-10-20 08:09:00 | 2021-10-20 08:31:00 | 8 | 25.5 | 4 |
5 | 9003 | 103 | 203 | 2021-10-20 08:07:10 | 2021-10-20 08:15:00 | 2021-10-20 08:31:00 | 11 | 41.5 | 4 |
6 | 9004 | 104 | 204 | 2021-10-20 08:01:20 | 2021-10-20 08:13:00 | 2021-10-20 08:31:00 | 7.5 | 22 | 4 |
7 | 9019 | 105 | 205 | 2021-10-20 08:01:30 | 2021-10-20 08:11:00 | 2021-10-20 08:51:00 | 10 | 39 | 4 |
8 | 9011 | 101 | 211 | 2021-10-20 08:30:00 | 2021-10-20 08:31:00 | 2021-10-20 08:54:00 | 10 | 35 | 5 |
求解思路
这是一道典型的求解同时在线用户数量的题目,与“统计直播间在线用户数”的逻辑一致。
- 瞬时UV的计算方法:进入事件记作UV=1,离开事件记作UV=-1,使用窗口函数SUM(uv)OVER(ORDER BY 事件,uv DESC)进行累加
首先,需要对用户进入和离开的形式进行定义。
进入时间:event_time,开始打车的时间即为等车开始。
离开时间:有3种情况
- 状态1:司机接单前取消,则没有生成order_id,这种情况 order_id IS NULL 记录end_time
- 状态2:司机接单后取消,则没有上车时间,start_time IS NULL 记录 finish_time
- 状态3:正常上车,记录start_time,start_time IS NOT NULL
- 状态2和3可以直接使用IFNULL()合并,IFNULL(start_time,finish_time) 如果start_time空则返回finish_time,不空则start_time
定义完用户进入等车和离开等车这两种事件之后,关联所有表格,使用窗口函数排序累加即可。
现求出这三种情况的数据如下:
- uv为1,用户进入打车状态
SELECT city,event_time uv_time,1 AS uv FROM tb_get_car_record
- uv为-1,状态1:司机接单前取消,则没有生成order_id,这种情况 order_id IS NULL 记录end_time
SELECT city,end_time uv_time,-1 AS uv FROM tb_get_car_record WHERE order_id IS NULL #接单前取消
- uv为-1,状态2:接单后取消或者用户正常上车
SELECT city,IFNULL(start_time,finish_time) uv_time,-1 AS uv FROM tb_get_car_order LEFT JOIN tb_get_car_record USING(order_id) #接单后取消或上车
- 建立子表:使用union all 全关联
SELECT city,event_time uv_time,1 AS uv FROM tb_get_car_record
UNION ALL
SELECT city,end_time uv_time,-1 AS uv FROM tb_get_car_record WHERE order_id IS NULL #接单前取消
UNION ALL
SELECT city,IFNULL(start_time,finish_time) uv_time,-1 AS uv FROM tb_get_car_order LEFT JOIN tb_get_car_record USING(order_id)#接单后取消或上车
这里可以得到每个城市的uv_time和uv数值如下:
之后统计2021年10月,每个城市的瞬时UV情况,并取出每个城市最大的UV,排序先按照uv升序,uv一样按照城市升序
SELECT
city,
MAX(uv_cnt) max_wait_uv
FROM
(
SELECT
city,
SUM(uv)OVER(PARTITION BY city ORDER BY uv_time,uv DESC) AS uv_cnt
FROM
(
SELECT city, event_time uv_time,1 AS uv FROM tb_get_car_record
UNION ALL
SELECT city, end_time uv_time,-1 AS uv FROM tb_get_car_record WHERE order_id IS NULL
UNION ALL
SELECT city, IFNULL(start_time,finish_time) uv_time,-1 AS uv FROM tb_get_car_order LEFT JOIN tb_get_car_record USING(order_id)
) t1
WHERE DATE_FORMAT(uv_time,'%Y%m')='202110'
) t2
GROUP BY citY
ORDER BY max_wait_uv, citY
done
参考牛客网盐咸咸的题解