SQL笔试题目-----------常考易错题(01)
题:求连续登陆天数最大的用户id和其天数
1.建表语句:
-- Table structure for dau
-- ----------------------------
DROP TABLE IF EXISTS `dau`;
CREATE TABLE `dau` (
`dt` date NULL DEFAULT NULL,
`uid` int(0) NULL DEFAULT NULL,
`module` varchar(255) CHARACTER SET utf8mb4 COLLATE utf8mb4_0900_ai_ci NULL DEFAULT NULL,
`act_time` int(0) NULL DEFAULT NULL
) ENGINE = InnoDB CHARACTER SET = utf8 COLLATE = utf8_general_ci ROW_FORMAT = Dynamic;
-- ----------------------------
-- Records of dau
-- ----------------------------
INSERT INTO `dau` VALUES ('2021-02-01', 1, 'feeds', 234);
INSERT INTO `dau` VALUES ('2021-02-01', 1, 'im', 332);
INSERT INTO `dau` VALUES ('2021-02-03', 1, 'jobs', 380);
INSERT INTO `dau` VALUES ('2021-04-05', 1, 'jobs', 102);
INSERT INTO `dau` VALUES ('2021-04-06', 1, 'jobs', 340);
INSERT INTO `dau` VALUES ('2021-05-12', 2, 'feeds', 234);
INSERT INTO `dau` VALUES ('2021-05-13', 2, 'feeds', 45);
INSERT INTO `dau` VALUES ('2021-05-19', 2, 'im', 450);
INSERT INTO `dau` VALUES ('2021-07-20', 2, 'jobs', 468);
INSERT INTO `dau` VALUES ('2021-03-04', 3, 'feeds', 665);
INSERT INTO `dau` VALUES ('2021-03-05', 3, 'feeds', 800);
INSERT INTO `dau` VALUES ('2021-03-06', 3, 'jobs', 451);
INSERT INTO `dau` VALUES ('2021-02-01', 4, 'im', 50);
结构如图:
2.题:得到连续登陆天数最大的用户id和其天数
setp1:首先得到每个用户的登陆天数和其各自顺序
SELECT DISTINCT uid, dt,DENSE_RANK()OVER (PARTITION BY uid ORDER BY dt ASC) AS "d_rank"
FROM dau
-- WHERE dt BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 365 DAY) AND DATE_SUB(CURRENT_DATE,INTERVAL 1 DAY)
有的题目若需要时间限制,可参考注释部分(过去一年截至昨日的数据)
setp2:用登陆日期减去step1得到的d_rank,命名为diff字段
SELECT a.uid,a.dt,a.d_rank,DATE_SUB(a.dt,INTERVAL a.d_rank DAY) AS "diff"
FROM (
SELECT DISTINCT uid, dt,DENSE_RANK()OVER (PARTITION BY uid ORDER BY dt ASC) AS "d_rank"
FROM dau
WHERE dt BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 365 DAY) AND DATE_SUB(CURRENT_DATE,INTERVAL 1 DAY)) a
这里因为是日期减去排名,得到的diff字段中,若登陆日期是连续日期,则它们的差,即diff字段是相同的(如图红框部分)
step3:得到连续登陆天数最大的用户
SELECT c.uid,MAX(c.c_day) AS "max_c_day"
FROM
(SELECT b.uid,b.dt,COUNT(0) AS "c_day"
FROM(
SELECT a.uid,a.dt,a.d_rank,DATE_SUB(a.dt,INTERVAL a.d_rank DAY) AS "diff"
FROM (
SELECT DISTINCT uid, dt,DENSE_RANK()OVER (PARTITION BY uid ORDER BY dt ASC) AS "d_rank"
FROM dau
WHERE dt BETWEEN DATE_SUB(CURRENT_DATE,INTERVAL 365 DAY) AND DATE_SUB(CURRENT_DATE,INTERVAL 1 DAY)) a
-- a
)b
GROUP BY b.uid,b.diff)c
最终结果:
3.总结
这里用的窗口函数,比较巧妙地使用rank函数和做差的方法判断出连续登陆的天数.所以此类题型的关键在于对窗口函数的使用
若想了解窗口函数rank或者row_number函数可参考此链接
窗口函数
PS:此题型为大厂笔试常考题型
难度:★★☆
频率:★★★