1.找出两组都查看的用户
有以下表t1, 取出group 1 ,2 都view 的用户id
user_id | group | if_view |
111 | 1 | 0 |
113 | 1 | 1 |
114 | 1 | 1 |
111 | 2 | 1 |
114 | 2 | 0 |
113 | 2 | 1 |
解法1:使用GROUP BY
和HAVING
子句筛选
SELECT user_id
FROM t1
WHERE user_id IN (
SELECT user_id
FROM t1
WHERE group IN (1, 2) AND if_view = 1
GROUP BY user_id
HAVING COUNT(DISTINCT group) = 2
)
-
首先在子查询中筛选出group为1或2且if_view为1的用户。
-
使用
GROUP BY
和HAVING
子句来筛选出同时在group 1和2都有view的用户。
解法2 :使用join实现表的自连接
SELECT DISTINCT t1.user_id
FROM t1
JOIN t1 as t2 ON t1.user_id = t2.user_id
WHERE t1.group = 1 AND t1.if_view = 1
AND t2.group = 2 AND t2.if_view = 1
ORDER BY t1.user_id, t1.group;
连接了两次同一张表,一次是group为1的记录,另一次是group为2的记录。然后使用DISTINCT
去重
解法3:使用exist子查询
SELECT user_id
FROM t1
WHERE t1.group = 1 AND t1.if_view = 1
AND EXISTS (
SELECT 1
FROM t1 as t2
WHERE t2.user_id = t1.user_id AND t2.group = 2 AND t2.if_view = 1
)
使用EXISTS
子查询来检查是否存在另一个满足条件的记录(group为2且if_view为1)与当前记录的user_id相同。如果存在,那么当前记录也会被选中。
2.最大连续登录天数
计算每个用户最大连续上线天数
id | date_time |
100001 | 2022/1/1 |
100001 | 2022/1/2 |
100001 | 2022/1/3 |
100001 | 2022/1/4 |
100001 | 2022/1/5 |
100001 | 2022/1/6 |
100002 | 2022/1/2 |
100002 | 2022/1/3 |
100002 | 2022/1/5 |
100002 | 2022/1/6 |
100002 | 2022/1/7 |
总结:连续问题都可以通过做差法处理-构造一个排序列
思路:首先distinct日期 之后再用row_number()排序,构造新列rn,
date与rn相减,记为group_diff。如果日期是连续的,那么两行group_diff应该是一样的,
SELECT
id,
MAX(consecutive_days) AS max_consecutive_days
FROM (
SELECT
id,
date_name,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY date_name ) as rn
date_name - ROW_NUMBER() OVER ( PARTITION BY id ORDER BY date_name) AS group_diff,
COUNT(*) AS consecutive_days
FROM (
SELECT DISTINCT
id,date_name
FROM t1
) AS t2
GROUP BY id, group_diff
) AS t3
GROUP BY id;
注:没有实际运行,不确定是否有编码错误