mysql8窗口函数学习笔记

  • 定义
    • 窗口函数(Window Functions)的命名源于其核心机制:​通过定义一个动态的“窗口”(Window)来确定函数计算的数据范围。这个“窗口”并非固定不变,而是为每一行数据单独划定一个相关的数据集,函数仅在该范围内执行计算。
    • 与聚合函数(如 GROUP BY)不同,窗口函数不会折叠数据,而是为每行返回结果,同时保留原始行。
  • 案例
    • 环境准备
      • 创建一张成绩表
        CREATE TABLE `academic_records` (
          `id` int NOT NULL AUTO_INCREMENT COMMENT '自增主键',
          `student_id` varchar(10) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '学生唯一标识符',
          `student_name` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '学生姓名',
          `subject` varchar(20) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci NOT NULL COMMENT '学科名称',
          `exam_date` datetime NOT NULL COMMENT '考试时间',
          `score` decimal(5,1) NOT NULL COMMENT '数值成绩(0-100分,允许一位小数)',
          PRIMARY KEY (`id`) USING BTREE
        ) ENGINE=InnoDB AUTO_INCREMENT=128 DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_general_ci ROW_FORMAT=DYNAMIC COMMENT='学生学业成绩记录表';
        
      • 填充数据
      INSERT INTO academic_records (student_id, student_name, subject, exam_date, score)
      VALUES
      -- 学生 S001 张三
      ('S001', '张三', '语文', '2023-10-01 09:00:00', 85.5),
      ('S001', '张三', '语文', '2023-11-05 09:00:00', 88.0),
      ('S001', '张三', '语文', '2023-12-10 09:00:00', 90.5),
      ('S001', '张三', '数学', '2023-10-01 14:00:00', 92.0),
      ('S001', '张三', '数学', '2023-11-05 14:00:00', 94.5),
      ('S001', '张三', '数学', '2023-12-10 14:00:00', 95.0),
      ('S001', '张三', '英语', '2023-10-01 10:00:00', 80.0),
      ('S001', '张三', '英语', '2023-11-05 10:00:00', 85.5),
      ('S001', '张三', '英语', '2023-12-10 10:00:00', 88.0),
      
      -- 学生 S002 李四
      ('S002', '李四', '语文', '2023-10-01 09:00:00', 78.5),
      ('S002', '李四', '语文', '2023-11-05 09:00:00', 82.0),
      ('S002', '李四', '语文', '2023-12-10 09:00:00', 85.5),
      ('S002', '李四', '数学', '2023-10-01 14:00:00', 88.0),
      ('S002', '李四', '数学', '2023-11-05 14:00:00', 91.5),
      ('S002', '李四', '数学', '2023-12-10 14:00:00', 93.0),
      ('S002', '李四', '英语', '2023-10-01 10:00:00', 92.5),
      ('S002', '李四', '英语', '2023-11-05 10:00:00', 89.0),
      ('S002', '李四', '英语', '2023-12-10 10:00:00', 95.5),
      
      -- 学生 S003 王五
      ('S003', '王五', '语文', '2023-10-01 09:00:00', 65.0),
      ('S003', '王五', '语文', '2023-11-05 09:00:00', 70.5),
      ('S003', '王五', '语文', '2023-12-10 09:00:00', 75.0),
      ('S003', '王五', '数学', '2023-10-01 14:00:00', 95.0),
      ('S003', '王五', '数学', '2023-11-05 14:00:00', 97.5),
      ('S003', '王五', '数学', '2023-12-10 14:00:00', 99.0),
      ('S003', '王五', '英语', '2023-10-01 10:00:00', 72.0),
      ('S003', '王五', '英语', '2023-11-05 10:00:00', 68.5),
      ('S003', '王五', '英语', '2023-12-10 10:00:00', 80.0),
      
      -- 学生 S004 赵六
      ('S004', '赵六', '语文', '2023-10-01 09:00:00', 92.5),
      ('S004', '赵六', '语文', '2023-11-05 09:00:00', 94.0),
      ('S004', '赵六', '语文', '2023-12-10 09:00:00', 96.5),
      ('S004', '赵六', '数学', '2023-10-01 14:00:00', 85.0),
      ('S004', '赵六', '数学', '2023-11-05 14:00:00', 88.5),
      ('S004', '赵六', '数学', '2023-12-10 14:00:00', 90.0),
      ('S004', '赵六', '英语', '2023-10-01 10:00:00', 89.5),
      ('S004', '赵六', '英语', '2023-11-05 10:00:00', 93.0),
      ('S004', '赵六', '英语', '2023-12-10 10:00:00', 91.5),
      
      -- 学生 S005 陈七
      ('S005', '陈七', '语文', '2023-10-01 09:00:00', 60.5),
      ('S005', '陈七', '语文', '2023-11-05 09:00:00', 65.0),
      ('S005', '陈七', '语文', '2023-12-10 09:00:00', 68.5),
      ('S005', '陈七', '数学', '2023-10-01 14:00:00', 72.0),
      ('S005', '陈七', '数学', '2023-11-05 14:00:00', 75.5),
      ('S005', '陈七', '数学', '2023-12-10 14:00:00', 79.0),
      ('S005', '陈七', '英语', '2023-10-01 10:00:00', 84.0),
      ('S005', '陈七', '英语', '2023-11-05 10:00:00', 88.5),
      ('S005', '陈七', '英语', '2023-12-10 10:00:00', 92.0);
      
    • 函数语法
      函数名([参数]) OVER (
      	[PARTITION BY 分组列]
      	[ORDER BY 排序列 [ASC|DESC]]
      	[ROWS|RANGE 框架范围]
      )
      
      • PARTITION BY: 按列分组,类似GROUP BY但结果不折叠
      • ORDER BY: 定义窗口内排序方式,影响函数计算逻辑
      • 框架范围(Frame Clause)​: 指定窗口的滑动范围,例如:
        • ROWS BETWEEN 3 PRECEDING AND CURRENT ROW: 包含当前行及前3行。
        • RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW: 默认范围(从第一行到当前行)
        • ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING : 覆盖整个分区
    • 函数应用
      • 排名函数

        • ROW_NUMBER(): 为每行分配唯一序号(如:1,2,3)。
        • RANK(): 相同值排名相同,后续序号跳过(如:1,1,3)。
        • DENSE_RANK(): 相同值排名相同,后续序号不跳过(如:1,1,2)
        • demo:
          SELECT 
          	student_id,
          	student_name,
          	subject,
          		exam_date,
          	score,
          	-- 唯一序号(无并列)
          	ROW_NUMBER() OVER (PARTITION BY subject ORDER BY score DESC) AS row_num_rank,
          	-- 允许并列且跳过后续序号
          	RANK() OVER (PARTITION BY subject ORDER BY score DESC) AS `rank`,  -- 用反引号包裹
          	-- 允许并列且不跳过序号
          	DENSE_RANK() OVER (PARTITION BY subject ORDER BY score DESC) AS `dense_rank`
          FROM academic_records WHERE subject = '数学';
          
          • 在这里插入图片描述
      • 聚合函数作为窗口函数

        • SUM(), AVG(), COUNT(), MIN(), MAX() 等可配合窗口使用,实现累积或分组统计
        • demo
          	SELECT 
          	    student_id,
          	    student_name,
          	    subject,
          	    exam_date,
          	    score,
          	    -- 按考试顺序(id)计算累积平均分
          	    AVG(score) OVER (
          	        PARTITION BY student_id, subject 
          	        ORDER BY id 
          	        RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
          	    ) AS cumulative_avg,
          	    -- 按考试顺序(id)计算近2次考试(包括本次)的平均分
          	    AVG(score) OVER (
          	        PARTITION BY student_id, subject 
          	        ORDER BY id 
          	        ROWS BETWEEN 1 PRECEDING AND CURRENT ROW
          	    ) AS cumulative_avg_last_2_months
          	FROM academic_records WHERE student_id = 'S001'; 
          
          • 在这里插入图片描述
      • 分布函数

        • CUME_DISTO :计算累积分布(<=当前值的行占比,范围[0,1])。
        • demo
          SELECT 
              student_id,
              student_name,
              score,
              CUME_DIST() OVER (ORDER BY score DESC) AS `cume_dist`
          FROM academic_records
          WHERE subject = '数学' and exam_date = '2023-10-01 14:00:00';
          
          • 在这里插入图片描述
      • 前后行访问

        • LAG(column,n):获取当前行前第 n行的值。
        • LEAD(column,n):获取当前行后第 n行的值。
        • demo
          SELECT 
              student_id,
              student_name,
              subject,
              exam_date,
              score,
              -- 获取上一次考试成绩
              LAG(score) OVER (
                  PARTITION BY student_id, subject 
                  ORDER BY exam_date
              ) AS previous_score,
              -- 计算本次与上次成绩差值
              score - LAG(score) OVER (
                  PARTITION BY student_id, subject 
                  ORDER BY exam_date
              ) AS score_change
          FROM academic_records where student_id = 'S001';
          
          • ![[1742177183145.png]]
      • 首尾值函数

        • FIRST_VALUE(column):返回窗口第一行的值。
        • LAST VALUE(column):返回窗口最后一行的值。
        • demo
          SELECT 
              student_id,
              student_name,
              subject,
              exam_date,
              score,
              -- 首次考试成绩
              FIRST_VALUE(score) OVER (
                  PARTITION BY student_id, subject
                  ORDER BY exam_date
                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING  -- 覆盖整个分区
              ) AS first_score,
              -- 最后一次考试成绩
              LAST_VALUE(score) OVER (
                  PARTITION BY student_id, subject
                  ORDER BY exam_date
                  ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING  -- 覆盖整个分区
              ) AS last_score
          FROM academic_records
          where student_id = 'S001'
          ORDER BY student_id, subject, exam_date;
          
          • ![[1742177688590.png
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值