- 学习:知识的初次邂逅
- 复习:知识的温故知新
- 练习:知识的实践应用
目录
一,原题力扣链接
MySQL截图
Pandas截图
二,题干
表:
Drivers
+-------------+---------+ | Column Name | Type | +-------------+---------+ | driver_id | int | | join_date | date | +-------------+---------+ driver_id 是该表具有唯一值的列。 该表的每一行均包含驾驶员的 ID 以及他们加入 Hopper 公司的日期。
表:
Rides
+--------------+---------+ | Column Name | Type | +--------------+---------+ | ride_id | int | | user_id | int | | requested_at | date | +--------------+---------+ ride_id 是该表具有唯一值的列。 该表的每一行均包含行程 ID(ride_id),用户 ID(user_id) 以及该行程的日期(requested_at)。 该表中可能有一些不被接受的乘车请求。
表:
AcceptedRides
+---------------+---------+ | Column Name | Type | +---------------+---------+ | ride_id | int | | driver_id | int | | ride_distance | int | | ride_duration | int | +---------------+---------+ ride_id 是该表具有唯一值的列。 该表的每一行都包含已接受的行程信息。 表中的行程信息都在 "
Rides
" 表中存在。
编写一个解决方案,计算出从 2020 年 1 月至 3 月 至 2020 年 10 月至 12 月 的每三个月窗口的
average_ride_distance
和average_ride_duration
。并将average_ride_distance
和average_ride_duration
四舍五入至 小数点后两位 。
通过将三个月的总ride_distance
相加并除以3
来计算average_ride_distance
。average_ride_duration
的计算方法与此类似。
返回按month
升序排列的结果表,其中month
是起始月份的编号(一月为 1,二月为 2 ...)。查询结果格式如下示例所示。
示例 1:
输入: Drivers table: +-----------+------------+ | driver_id | join_date | +-----------+------------+ | 10 | 2019-12-10 | | 8 | 2020-1-13 | | 5 | 2020-2-16 | | 7 | 2020-3-8 | | 4 | 2020-5-17 | | 1 | 2020-10-24 | | 6 | 2021-1-5 | +-----------+------------+ Rides table: +---------+---------+--------------+ | ride_id | user_id | requested_at | +---------+---------+--------------+ | 6 | 75 | 2019-12-9 | | 1 | 54 | 2020-2-9 | | 10 | 63 | 2020-3-4 | | 19 | 39 | 2020-4-6 | | 3 | 41 | 2020-6-3 | | 13 | 52 | 2020-6-22 | | 7 | 69 | 2020-7-16 | | 17 | 70 | 2020-8-25 | | 20 | 81 | 2020-11-2 | | 5 | 57 | 2020-11-9 | | 2 | 42 | 2020-12-9 | | 11 | 68 | 2021-1-11 | | 15 | 32 | 2021-1-17 | | 12 | 11 | 2021-1-19 | | 14 | 18 | 2021-1-27 | +---------+---------+--------------+ AcceptedRides table: +---------+-----------+---------------+---------------+ | ride_id | driver_id | ride_distance | ride_duration | +---------+-----------+---------------+---------------+ | 10 | 10 | 63 | 38 | | 13 | 10 | 73 | 96 | | 7 | 8 | 100 | 28 | | 17 | 7 | 119 | 68 | | 20 | 1 | 121 | 92 | | 5 | 7 | 42 | 101 | | 2 | 4 | 6 | 38 | | 11 | 8 | 37 | 43 | | 15 | 8 | 108 | 82 | | 12 | 8 | 38 | 34 | | 14 | 1 | 90 | 74 | +---------+-----------+---------------+---------------+ 输出: +-------+-----------------------+-----------------------+ | month | average_ride_distance | average_ride_duration | +-------+-----------------------+-----------------------+ | 1 | 21.00 | 12.67 | | 2 | 21.00 | 12.67 | | 3 | 21.00 | 12.67 | | 4 | 24.33 | 32.00 | | 5 | 57.67 | 41.33 | | 6 | 97.33 | 64.00 | | 7 | 73.00 | 32.00 | | 8 | 39.67 | 22.67 | | 9 | 54.33 | 64.33 | | 10 | 56.33 | 77.00 | +-------+-----------------------+-----------------------+ 解释: 到1月底-->平均骑行距离=(0+0+63)/3=21,平均骑行持续时间=(0+0+38)/3=12.67 到2月底-->平均骑行距离=(0+63+0)/3=21,平均骑行持续时间=(0+38+0)/3=12.67 到3月底-->平均骑行距离=(63+0+0)/3=21,平均骑行持续时间=(38+0+0)/3=12.67 到4月底-->平均骑行距离=(0+0+73)/3=24.33,平均骑行持续时间=(0+0+96)/3=32.00 到5月底-->平均骑行距离=(0+73+100)/3=57.67,平均骑行持续时间=(0+96+28)/3=41.33 到6月底-->平均骑行距离=(73+100+119)/3=97.33,平均骑行持续时间=(96+28+68)/3=64.00 到7月底-->平均骑行距离=(100+119+0)/3=73.00,平均骑行持续时间=(28+68+0)/3=32.00 到8月底-->平均骑行距离=(119+0+0)/3=39.67,平均骑行持续时间=(68+0+0)/3=22.67 9月底-->平均骑行距离=(0+0+163)/3=54.33,平均骑行持续时间=(0+0+193)/3=64.33 到10月底-->平均骑行距离=(0+163+6)/3=56.33,平均骑行持续时间=(0+193+38)/3=77.00
三,建表语句
Create table If Not Exists Drivers (driver_id int, join_date date);
Create table If Not Exists Rides (ride_id int, user_id int, requested_at date);
Create table If Not Exists AcceptedRides (ride_id int, driver_id int, ride_distance int, ride_duration int);
Truncate table Drivers
insert into Drivers (driver_id, join_date) values ('10', '2019-12-10')
insert into Drivers (driver_id, join_date) values ('8', '2020-1-13')
insert into Drivers (driver_id, join_date) values ('5', '2020-2-16')
insert into Drivers (driver_id, join_date) values ('7', '2020-3-8')
insert into Drivers (driver_id, join_date) values ('4', '2020-5-17')
insert into Drivers (driver_id, join_date) values ('1', '2020-10-24')
insert into Drivers (driver_id, join_date) values ('6', '2021-1-5');
Truncate table Rides;
insert into Rides (ride_id, user_id, requested_at) values ('6', '75', '2019-12-9')
insert into Rides (ride_id, user_id, requested_at) values ('1', '54', '2020-2-9')
insert into Rides (ride_id, user_id, requested_at) values ('10', '63', '2020-3-4')
insert into Rides (ride_id, user_id, requested_at) values ('19', '39', '2020-4-6')
insert into Rides (ride_id, user_id, requested_at) values ('3', '41', '2020-6-3')
insert into Rides (ride_id, user_id, requested_at) values ('13', '52', '2020-6-22')
insert into Rides (ride_id, user_id, requested_at) values ('7', '69', '2020-7-16')
insert into Rides (ride_id, user_id, requested_at) values ('17', '70', '2020-8-25')
insert into Rides (ride_id, user_id, requested_at) values ('20', '81', '2020-11-2')
insert into Rides (ride_id, user_id, requested_at) values ('5', '57', '2020-11-9')
insert into Rides (ride_id, user_id, requested_at) values ('2', '42', '2020-12-9')
insert into Rides (ride_id, user_id, requested_at) values ('11', '68', '2021-1-11')
insert into Rides (ride_id, user_id, requested_at) values ('15', '32', '2021-1-17')
insert into Rides (ride_id, user_id, requested_at) values ('12', '11', '2021-1-19')
insert into Rides (ride_id, user_id, requested_at) values ('14', '18', '2021-1-27');
Truncate table AcceptedRides;
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('10', '10', '63', '38')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('13', '10', '73', '96')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('7', '8', '100', '28')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('17', '7', '119', '68')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('20', '1', '121', '92')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('5', '7', '42', '101')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('2', '4', '6', '38')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('11', '8', '37', '43')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('15', '8', '108', '82')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('12', '8', '38', '34')
insert into AcceptedRides (ride_id, driver_id, ride_distance, ride_duration) values ('14', '1', '90', '74');
四,分析
- 第一步,递归求出12个月份
- 第二步:拿到2020年的数据
- 第三步:内连接,2020年的订单数据Rides和AcceptedRides
- 第四步:用第一步的递归表和第三步的表左连接;null更改为0;
- 第五步:用开窗range 求范围累加和
- 详细表格逻辑如下:
代码实现过程:
第一步,递归求出12个月份
第二步:拿到2020年的数据
第三步:内连接,2020年的订单数据Rides和AcceptedRides
第四步:用第一步的递归表和第三步的表左连接;null更改为0;
第五步:用开窗range 求范围累加和
五,SQL解答
with recursive t1 as (
#递归拿到 1- 12月份
select 1 as month
union
select month+1 from t1 where month <12
)
# select * from t1;
,t2 as (
#拿到2020年的
select * from rides where year(requested_at) = '2020'
)
# select * from t2;
,t3 as (
select
requested_at,ride_distance,ride_duration,month(requested_at) as month
from t2 join AcceptedRides a on t2.ride_id=a.ride_id
)
# select * from t3;
,t4 as (
select
t1.month,
ifnull(sum(ride_distance),0) as sum_d1
,ifnull(sum(ride_duration),0) as sum_d2
from t1 left join t3 on t1.month=t3.month group by t1.month order by t1.month
)
# select * from t4;
,t5 as (
select
month,
round( (sum(sum_d1)over(order by month rows between current row and 2 following)) /3,2) as average_ride_distance,
round( (sum(sum_d2)over(order by month rows between current row and 2 following)) /3,2) as average_ride_duration
from t4
)
select month, average_ride_distance, average_ride_duration from t5 where month<=10;
六,验证
七,知识点总结
- SQL递归训练
- 条件过滤的运用
- 内连接的运用
- 时间函数的运用
- 左连接运用
- 分组聚合的运用
- sum开窗 range范围求累加和的运用 关键字 选择窗口范围 当前到往下2行;
- 四舍五入的运用
- 学习:知识的初次邂逅
- 复习:知识的温故知新
- 练习:知识的实践应用