【MySQL】子查询

在这里插入图片描述

个人主页:Guiat
归属专栏:MySQL

在这里插入图片描述

正文

1. 子查询概述

子查询是嵌套在另一个查询内部的SELECT语句,它可以提供主查询所需的数据或条件。子查询使SQL语句更加灵活和强大,能够处理复杂的数据检索和分析需求。

1.1 子查询的作用

  • 分解复杂查询逻辑
  • 执行多步骤数据筛选
  • 进行复杂计算和比较
  • 处理动态条件和关联数据
  • 实现高级数据分析功能

1.2 MySQL中的子查询类型

MySQL支持多种类型的子查询,可以根据需求灵活使用:

MySQL子查询
标量子查询
列子查询
行子查询
表子查询
相关子查询
非相关子查询

1.3 子查询的位置

子查询可以出现在SQL语句的多个位置:

子查询位置
SELECT子句
FROM子句
WHERE子句
HAVING子句
ORDER BY子句
INSERT语句
UPDATE语句
DELETE语句

2. 标量子查询

2.1 标量子查询的特点

  • 返回单个值(一行一列)
  • 可以用在表达式可以使用的任何地方
  • 常用于比较操作(=, <>, >, <, >=, <=)
  • 可以与聚合函数结合使用

2.2 WHERE子句中的标量子查询

-- 查找价格高于平均价格的产品
SELECT product_name, price
FROM products
WHERE price > (
    SELECT AVG(price)
    FROM products
);

-- 查找最新订单的详细信息
SELECT *
FROM orders
WHERE order_date = (
    SELECT MAX(order_date)
    FROM orders
);

2.3 SELECT子句中的标量子查询

-- 在结果中包含每种产品的平均价格
SELECT 
    category_name,
    (SELECT AVG(price) FROM products WHERE category_id = c.category_id) AS avg_price
FROM categories c;

-- 显示每位员工与公司平均工资的差额
SELECT 
    employee_id,
    first_name,
    last_name,
    salary,
    salary - (SELECT AVG(salary) FROM employees) AS diff_from_avg
FROM employees;

2.4 HAVING子句中的标量子查询

-- 查找销售额高于平均销售额的部门
SELECT 
    department_id,
    SUM(sales_amount) AS total_sales
FROM sales
GROUP BY department_id
HAVING SUM(sales_amount) > (
    SELECT AVG(dept_sales)
    FROM (
        SELECT department_id, SUM(sales_amount) AS dept_sales
        FROM sales
        GROUP BY department_id
    ) AS dept_totals
);

3. 列子查询

3.1 列子查询的特点

  • 返回单列多行数据
  • 通常与IN, NOT IN, ANY, ALL等操作符一起使用
  • 用于多值比较或集合操作
  • 可以替代多个OR条件

3.2 IN和NOT IN操作符

-- 查找在纽约或洛杉矶的客户
SELECT customer_id, customer_name
FROM customers
WHERE city IN (
    SELECT city
    FROM city_list
    WHERE state IN ('NY', 'CA')
);

-- 查找没有订单的客户
SELECT customer_id, customer_name
FROM customers
WHERE customer_id NOT IN (
    SELECT DISTINCT customer_id
    FROM orders
);

3.3 ANY和ALL操作符

列子查询比较
= ANY
> ANY
< ANY
= ALL
> ALL
< ALL
等于列表中的任何值
大于列表中的最小值
小于列表中的最大值
等于列表中的所有值
大于列表中的最大值
小于列表中的最小值
-- 查找价格高于任何一个电子产品的服装
SELECT product_id, product_name, price
FROM products
WHERE 
    category_id = 2 AND
    price > ANY (
        SELECT price
        FROM products
        WHERE category_id = 1
    );

-- 查找价格高于所有电子产品的服装
SELECT product_id, product_name, price
FROM products
WHERE 
    category_id = 2 AND
    price > ALL (
        SELECT price
        FROM products
        WHERE category_id = 1
    );

3.4 EXISTS和NOT EXISTS操作符

-- 查找至少有一个订单的客户
SELECT customer_id, customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

-- 查找没有订单的客户
SELECT customer_id, customer_name
FROM customers c
WHERE NOT EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.customer_id = c.customer_id
);

4. 行子查询

4.1 行子查询的特点

  • 返回单行多列或多行多列
  • 可以同时比较多个值
  • 简化复杂的条件逻辑
  • MySQL 5.7+完全支持

4.2 单行多列子查询

-- 查找与特定员工在同一部门和同一职位的其他员工
SELECT employee_id, first_name, last_name
FROM employees
WHERE (department_id, job_id) = (
    SELECT department_id, job_id
    FROM employees
    WHERE employee_id = 103
) AND employee_id != 103;

-- 查找与最高薪资员工相同部门的员工
SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE department_id = (
    SELECT department_id
    FROM employees
    WHERE salary = (SELECT MAX(salary) FROM employees)
);

4.3 多行多列子查询

-- 查找部门经理和部门
SELECT e.employee_id, e.first_name, e.last_name, d.department_name
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE (e.department_id, e.employee_id) IN (
    SELECT department_id, manager_id
    FROM departments
    WHERE manager_id IS NOT NULL
);

5. 表子查询

5.1 表子查询的特点

  • 返回完整的结果集(多行多列)
  • 通常在FROM子句中使用
  • 创建派生表或临时结果集
  • 可以简化复杂的查询逻辑

5.2 在FROM子句中使用表子查询

-- 查询每个部门的平均薪资排名
SELECT 
    d.department_name,
    dept_avg.avg_salary,
    RANK() OVER (ORDER BY dept_avg.avg_salary DESC) AS salary_rank
FROM departments d
JOIN (
    SELECT 
        department_id, 
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
) dept_avg ON d.department_id = dept_avg.department_id
ORDER BY salary_rank;

-- 查询每个季度的销售总额
SELECT 
    quarters.quarter_name,
    SUM(s.sales_amount) AS total_sales
FROM sales s
JOIN (
    SELECT 1 AS quarter_id, 'Q1' AS quarter_name, '2023-01-01' AS start_date, '2023-03-31' AS end_date
    UNION SELECT 2, 'Q2', '2023-04-01', '2023-06-30'
    UNION SELECT 3, 'Q3', '2023-07-01', '2023-09-30'
    UNION SELECT 4, 'Q4', '2023-10-01', '2023-12-31'
) quarters ON s.sale_date BETWEEN quarters.start_date AND quarters.end_date
GROUP BY quarters.quarter_id, quarters.quarter_name
ORDER BY quarters.quarter_id;

5.3 WITH子句(公用表表达式,CTE)

-- 使用WITH子句创建临时结果集
WITH dept_avg_salary AS (
    SELECT 
        department_id, 
        AVG(salary) AS avg_salary
    FROM employees
    GROUP BY department_id
)
SELECT 
    d.department_name,
    das.avg_salary
FROM departments d
JOIN dept_avg_salary das ON d.department_id = das.department_id
ORDER BY das.avg_salary DESC;

-- 使用递归CTE查找员工层级
WITH RECURSIVE emp_hierarchy AS (
    -- 锚成员: 顶级经理(假设顶级经理的manager_id为NULL)
    SELECT 
        employee_id, 
        first_name, 
        last_name, 
        manager_id, 
        1 AS level
    FROM employees
    WHERE manager_id IS NULL
    
    UNION ALL
    
    -- 递归成员: 下属员工
    SELECT 
        e.employee_id, 
        e.first_name, 
        e.last_name, 
        e.manager_id, 
        eh.level + 1 AS level
    FROM employees e
    JOIN emp_hierarchy eh ON e.manager_id = eh.employee_id
)
SELECT 
    employee_id,
    CONCAT(REPEAT('    ', level - 1), first_name, ' ', last_name) AS employee,
    level
FROM emp_hierarchy
ORDER BY level, first_name, last_name;

6. 相关子查询

6.1 相关子查询的特点

  • 引用外部查询的表或列
  • 每处理外部查询的一行,内部查询都会执行一次
  • 性能可能比非相关子查询慢
  • 能够解决特定类型的问题

6.2 相关子查询的执行流程

开始
获取外部查询的第一行
执行内部查询,使用当前外部行的值
根据内部查询结果处理外部当前行
外部查询还有更多行?
获取外部查询的下一行
结束

6.3 WHERE子句中的相关子查询

-- 查找每个部门薪资最高的员工
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    e.department_id,
    e.salary
FROM employees e
WHERE e.salary = (
    SELECT MAX(salary)
    FROM employees
    WHERE department_id = e.department_id
);

-- 查找高于部门平均薪资的员工
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    e.department_id,
    e.salary
FROM employees e
WHERE e.salary > (
    SELECT AVG(salary)
    FROM employees
    WHERE department_id = e.department_id
);

6.4 SELECT子句中的相关子查询

-- 查询每个员工的信息以及他们的经理名字
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    (SELECT CONCAT(m.first_name, ' ', m.last_name) 
     FROM employees m 
     WHERE m.employee_id = e.manager_id) AS manager_name
FROM employees e;

-- 显示每位员工与他所在部门平均薪资的差额
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name,
    e.salary,
    (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) AS dept_avg,
    e.salary - (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) AS diff
FROM employees e;

6.5 EXISTS与相关子查询

-- 查找已处理过特定类型订单的员工
SELECT DISTINCT
    e.employee_id,
    e.first_name,
    e.last_name
FROM employees e
WHERE EXISTS (
    SELECT 1
    FROM orders o
    WHERE o.employee_id = e.employee_id
    AND o.order_type = 'Premium'
);

-- 查找在所有部门都有员工的所有职位
SELECT job_id, job_title
FROM jobs j
WHERE NOT EXISTS (
    SELECT 1
    FROM departments d
    WHERE NOT EXISTS (
        SELECT 1
        FROM employees e
        WHERE e.department_id = d.department_id
        AND e.job_id = j.job_id
    )
);

7. 非相关子查询

7.1 非相关子查询的特点

  • 独立于外部查询执行
  • 只执行一次,结果用于外部查询
  • 通常比相关子查询执行更快
  • 适用于外部查询前需要确定的静态结果集

7.2 非相关子查询的执行流程

开始
执行内部查询得到结果集
将结果集传递给外部查询
执行外部查询
结束

7.3 常见的非相关子查询场景

-- 查找高于平均薪资的员工
SELECT employee_id, first_name, last_name, salary
FROM employees
WHERE salary > (
    SELECT AVG(salary)
    FROM employees
);

-- 查找在销售额最高的部门工作的员工
SELECT e.employee_id, e.first_name, e.last_name
FROM employees e
WHERE e.department_id = (
    SELECT department_id
    FROM sales
    GROUP BY department_id
    ORDER BY SUM(sales_amount) DESC
    LIMIT 1
);

8. 子查询优化技术

8.1 选择合适的子查询类型

子查询选择
是否只需要一个值?
使用标量子查询
是否需要一列值?
使用列子查询
是否需要临时表?
使用表子查询
是否引用外部查询?
使用相关子查询
使用非相关子查询

8.2 子查询与JOIN的对比

查询方法选择
子查询
JOIN
优点
可读性好
逻辑分离清晰
适合EXISTS操作
缺点
某些情况下性能较低
可能多次扫描同一表
优点
通常性能更好
一次读取所有必要数据
适合大数据集连接
缺点
复杂查询可读性差
可能产生笛卡尔积

8.3 子查询转JOIN优化示例

-- 使用子查询
SELECT 
    product_name,
    price
FROM products
WHERE category_id IN (
    SELECT category_id
    FROM categories
    WHERE category_name LIKE 'Electronics%'
);

-- 转换为JOIN(通常更高效)
SELECT 
    p.product_name,
    p.price
FROM products p
JOIN categories c ON p.category_id = c.category_id
WHERE c.category_name LIKE 'Electronics%';

8.4 常用的子查询优化技巧

  • 尽可能使用非相关子查询替代相关子查询
  • 考虑使用JOIN代替IN子查询
  • 使用EXISTS替代IN进行存在性检查
  • 避免在子查询中使用ORDER BY(除非有LIMIT)
  • 确保子查询引用的列有合适的索引
  • 对表子查询使用有意义的别名提高可读性
  • 考虑使用临时表存储中间结果
-- 使用EXISTS优化查询
-- 低效:
SELECT customer_id, customer_name
FROM customers
WHERE customer_id IN (
    SELECT customer_id FROM orders
);

-- 优化后:
SELECT customer_id, customer_name
FROM customers c
WHERE EXISTS (
    SELECT 1 FROM orders o WHERE o.customer_id = c.customer_id
);

9. 实际应用案例

9.1 销售数据分析

-- 查找每个类别中销售额最高的产品
SELECT 
    c.category_name,
    p.product_name,
    sales.total_sales
FROM products p
JOIN categories c ON p.category_id = c.category_id
JOIN (
    SELECT 
        oi.product_id,
        SUM(oi.quantity * oi.unit_price) AS total_sales
    FROM order_items oi
    JOIN orders o ON oi.order_id = o.order_id
    WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31'
    GROUP BY oi.product_id
) sales ON p.product_id = sales.product_id
WHERE (p.category_id, sales.total_sales) IN (
    SELECT 
        p2.category_id,
        MAX(s2.total_sales)
    FROM products p2
    JOIN (
        SELECT 
            oi.product_id,
            SUM(oi.quantity * oi.unit_price) AS total_sales
        FROM order_items oi
        JOIN orders o ON oi.order_id = o.order_id
        WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31'
        GROUP BY oi.product_id
    ) s2 ON p2.product_id = s2.product_id
    GROUP BY p2.category_id
)
ORDER BY sales.total_sales DESC;

9.2 人力资源管理

-- 查找部门薪资高于公司平均水平的部门及其员工数量
WITH dept_stats AS (
    SELECT 
        d.department_id,
        d.department_name,
        COUNT(e.employee_id) AS employee_count,
        AVG(e.salary) AS avg_salary
    FROM departments d
    LEFT JOIN employees e ON d.department_id = e.department_id
    GROUP BY d.department_id, d.department_name
)
SELECT 
    ds.department_id,
    ds.department_name,
    ds.employee_count,
    ds.avg_salary,
    (SELECT AVG(salary) FROM employees) AS company_avg,
    ds.avg_salary - (SELECT AVG(salary) FROM employees) AS salary_diff
FROM dept_stats ds
WHERE ds.avg_salary > (SELECT AVG(salary) FROM employees)
ORDER BY salary_diff DESC;

-- 查找每个部门中薪资最接近部门平均值的员工
SELECT 
    d.department_name,
    e.employee_id,
    e.first_name,
    e.last_name,
    e.salary,
    (SELECT AVG(salary) FROM employees WHERE department_id = e.department_id) AS dept_avg
FROM employees e
JOIN departments d ON e.department_id = d.department_id
WHERE e.employee_id IN (
    SELECT e1.employee_id
    FROM employees e1
    WHERE ABS(e1.salary - (
        SELECT AVG(salary) 
        FROM employees 
        WHERE department_id = e1.department_id
    )) <= ALL (
        SELECT ABS(e2.salary - (
            SELECT AVG(salary) 
            FROM employees 
            WHERE department_id = e1.department_id
        ))
        FROM employees e2
        WHERE e2.department_id = e1.department_id
    )
    AND e1.department_id = e.department_id
)
ORDER BY d.department_name;

9.3 电子商务分析

-- 查找购买了所有"热门"产品的客户
WITH popular_products AS (
    SELECT product_id
    FROM order_items
    GROUP BY product_id
    HAVING COUNT(DISTINCT order_id) > 100
)
SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    c.email
FROM customers c
WHERE NOT EXISTS (
    SELECT pp.product_id
    FROM popular_products pp
    WHERE NOT EXISTS (
        SELECT 1
        FROM orders o
        JOIN order_items oi ON o.order_id = oi.order_id
        WHERE o.customer_id = c.customer_id
        AND oi.product_id = pp.product_id
    )
)
ORDER BY c.customer_id;

-- 查找每个月购买金额高于当月平均值的客户
WITH monthly_customer_totals AS (
    SELECT 
        YEAR(o.order_date) AS order_year,
        MONTH(o.order_date) AS order_month,
        o.customer_id,
        SUM(oi.quantity * oi.unit_price) AS total_amount
    FROM orders o
    JOIN order_items oi ON o.order_id = oi.order_id
    WHERE o.order_date BETWEEN '2023-01-01' AND '2023-12-31'
    GROUP BY YEAR(o.order_date), MONTH(o.order_date), o.customer_id
),
monthly_averages AS (
    SELECT 
        order_year,
        order_month,
        AVG(total_amount) AS avg_amount
    FROM monthly_customer_totals
    GROUP BY order_year, order_month
)
SELECT 
    c.customer_id,
    c.first_name,
    c.last_name,
    mct.order_year,
    mct.order_month,
    mct.total_amount,
    ma.avg_amount
FROM monthly_customer_totals mct
JOIN monthly_averages ma ON mct.order_year = ma.order_year AND mct.order_month = ma.order_month
JOIN customers c ON mct.customer_id = c.customer_id
WHERE mct.total_amount > ma.avg_amount
ORDER BY mct.order_year, mct.order_month, mct.total_amount DESC;

10. 子查询最佳实践

10.1 可读性与性能平衡

子查询设计
可读性
性能
维护性
使用有意义的别名
结构化和缩进
添加注释
选择合适的子查询类型
考虑转换为JOIN
避免不必要的嵌套
模块化子查询
使用CTE提高可维护性
遵循一致风格

10.2 常见陷阱与避免方法

陷阱避免方法
过度嵌套子查询使用CTE或临时表分解复杂逻辑
使用列子查询返回多列使用行子查询或表子查询替代
忽略NULL值处理使用适当的NULL处理函数和条件
循环引用导致错误仔细规划查询结构避免循环依赖
低效的相关子查询尝试重写为非相关子查询或JOIN

10.3 子查询调试技巧

  • 从内向外逐层测试子查询
  • 使用EXPLAIN分析查询执行计划
  • 隔离复杂子查询为独立查询测试
  • 使用临时变量或会话变量存储中间结果
  • 检查子查询返回的行数和列数
-- 使用EXPLAIN分析子查询执行
EXPLAIN
SELECT 
    e.employee_id,
    e.first_name,
    e.last_name
FROM employees e
WHERE e.department_id IN (
    SELECT department_id
    FROM departments
    WHERE location_id = 1700
);

10.4 子查询与数据量的关系

数据量与子查询选择
小数据量?
子查询通常可接受
数据量巨大?
考虑使用临时表
是否多次重用结果?
使用CTE或派生表
简单存在性检查?
使用EXISTS优化
尝试转换为JOIN

10.5 高级子查询模式

  • 分层汇总报表
  • 间隙分析(查找数据中的"空洞")
  • 行转列和列转行转换
  • 连续范围检测
  • 自定义排名和分析

结语
感谢您的阅读!期待您的一键三连!欢迎指正!

在这里插入图片描述

评论
成就一亿技术人!
拼手气红包6.0元
还能输入1000个字符
 
红包 添加红包
表情包 插入表情
 条评论被折叠 查看
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

【Air】

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值