SQL是每个求职者必备技能,无论是互联网大厂的数据分析师,还是金融、零售行业的数据岗位,SQL都是面试的核心考点。据统计,超过85%的数据岗位要求应聘者具备扎实的SQL能力。
为帮助大家高效备战,本节整理30个高频SQL考点,涵盖基础查询、表连接、窗口函数、性能优化等考点,并提供详细解析和实战示例,希望通过本节内容可以让你精准突破SQL面试!
基础查询与筛选
1. 如何从表中选取特定列的所有数据?
使用SELECT语句指定需要查询的列名,FROM子句指定表名。
SELECT employee_id, first_name, last_name, hire_date
FROM employees;
2. 如何筛选满足特定条件的记录?
在WHERE子句中添加条件表达式来过滤数据。
SELECT product_name, price
FROM products
WHERE price > 100 AND category = 'Electronics';
3. 如何按某列的值对结果进行排序?
使用ORDER BY子句,可以指定升序(ASC)或降序(DESC)。
SELECT customer_name, total_purchases
FROM customers
ORDER BY total_purchases DESC
LIMIT 10;
聚合函数与分组
4. 常用的SQL聚合函数有哪些?如何使用?
主要聚合函数包括COUNT, SUM, AVG, MAX, MIN等,用于计算汇总数据。
SELECT
COUNT(*) AS total_orders,
SUM(amount) AS total_revenue,
AVG(amount) AS average_order_value,
MIN(order_date) AS first_order_date,
MAX(order_date) AS last_order_date
FROM orders;
5. 如何按某列分组并计算聚合值?
使用GROUP BY子句对数据进行分组,然后应用聚合函数。
SELECT
department_id,
COUNT(*) AS employee_count,
ROUND(AVG(salary), 2) AS avg_salary,
SUM(salary) AS total_salary_cost
FROM employees
GROUP BY department_id;
6. 如何筛选分组后的结果?
WHERE过滤原始行,HAVING过滤分组后的结果。
SELECT
product_category,
COUNT(*) AS product_count,
AVG(price) AS avg_price
FROM products
WHERE in_stock = TRUE
GROUP BY product_category
HAVING COUNT(*) > 5 AND AVG(price) < 500;
表连接
7. 请解释INNER JOIN并举例
INNER JOIN只返回两表中匹配的行。
SELECT o.order_id, o.order_date, c.customer_name, c.email
FROM orders o
INNER JOIN customers c ON o.customer_id = c.customer_id
WHERE o.order_date >= '2023-01-01';
8. LEFT JOIN和RIGHT JOIN有什么区别?
LEFT JOIN保留左表所有行,RIGHT JOIN保留右表所有行。
-- LEFT JOIN示例
SELECT e.employee_name, d.department_name
FROM employees e
LEFT JOIN departments d ON e.department_id = d.department_id;
-- RIGHT JOIN示例
SELECT e.employee_name, d.department_name
FROM employees e
RIGHT JOIN departments d ON e.department_id = d.department_id;
9. 如何实现多表连接?
可以链式连接多个表,确保每个连接有明确的关系条件。
SELECT
o.order_id,
c.customer_name,
p.product_name,
oi.quantity,
oi.unit_price
FROM orders o
JOIN customers c ON o.customer_id = c.customer_id
JOIN order_items oi ON o.order_id = oi.order_id
JOIN products p ON oi.product_id = p.product_id;
子查询
10. 什么是子查询?举例说明
子查询是嵌套在主查询中的查询,可以用在SELECT, FROM, WHERE等子句中。
SELECT employee_name, salary
FROM employees
WHERE salary > (SELECT AVG(salary) FROM employees);
11. 如何使用IN操作符与子查询?
IN操作符用于检查某值是否在子查询结果集中。
SELECT product_name, price
FROM products
WHERE category_id IN (
SELECT category_id
FROM categories
WHERE department = 'Electronics'
)
ORDER BY price DESC;
12. 如何使用EXISTS子查询?
EXISTS检查子查询是否返回任何行,常用于相关子查询。
SELECT s.supplier_name, s.contact_email
FROM suppliers s
WHERE EXISTS (
SELECT 1
FROM products p
WHERE p.supplier_id = s.supplier_id
AND p.price < 50
);
窗口函数
13. 什么是窗口函数?举例说明
窗口函数在不减少行数的情况下执行计算,常用于排名、移动平均等分析。
SELECT
employee_id,
employee_name,
department,
salary,
RANK() OVER (PARTITION BY department ORDER BY salary DESC) AS dept_salary_rank,
salary - LAG(salary) OVER (PARTITION BY department ORDER BY hire_date) AS salary_diff_from_prev_hire
FROM employees;
14. 如何使用ROW_NUMBER(), RANK()和DENSE_RANK()?
这三个函数都用于分配排名,但在处理并列排名时行为不同。
SELECT
product_id,
product_name,
sales_amount,
ROW_NUMBER() OVER (ORDER BY sales_amount DESC) AS row_num,
RANK() OVER (ORDER BY sales_amount DESC) AS rank,
DENSE_RANK() OVER (ORDER BY sales_amount DESC) AS dense_rank
FROM products;
15. 如何计算移动平均?
使用窗口函数定义框架来计算移动统计量。
SELECT
date,
daily_sales,
AVG(daily_sales) OVER (ORDER BY date ROWS BETWEEN 6 PRECEDING AND CURRENT ROW) AS weekly_moving_avg,
AVG(daily_sales) OVER (ORDER BY date ROWS BETWEEN 29 PRECEDING AND CURRENT ROW) AS monthly_moving_avg
FROM sales_data;
日期处理
16. 如何提取日期部分(年、月、日)?
使用日期函数提取特定的日期部分。
SELECT
order_id,
order_date,
EXTRACT(YEAR FROM order_date) AS order_year,
EXTRACT(MONTH FROM order_date) AS order_month,
DATE_TRUNC('month', order_date) AS month_start
FROM orders;
17. 如何计算两个日期之间的差异?
使用日期函数计算日期或时间间隔。
SELECT
employee_id,
hire_date,
CURRENT_DATE AS today,
CURRENT_DATE - hire_date AS days_employed,
DATE_PART('year', AGE(CURRENT_DATE, hire_date)) AS years_employed
FROM employees;
高级查询技巧
18. 如何使用CASE WHEN语句?
CASE WHEN实现条件逻辑,常用于数据分类和转换。
SELECT
product_name,
price,
CASE
WHEN price < 50 THEN 'Budget'
WHEN price BETWEEN 50 AND 200 THEN 'Mid-range'
WHEN price > 200 THEN 'Premium'
ELSE 'Unknown'
END AS price_segment,
CASE
WHEN stock_quantity < 10 THEN 'Low stock'
WHEN stock_quantity > 100 THEN 'Overstocked'
ELSE 'Adequate stock'
END AS stock_status
FROM products;
数据修改操作
19. 如何插入新记录到表中?
使用INSERT INTO语句添加新行,可以一次插入单行或多行数据。
-- 单行插入
INSERT INTO employees (first_name, last_name, department, salary)
VALUES ('John', 'Doe', 'Marketing', 65000);
-- 多行插入
INSERT INTO products (product_name, category, price, stock)
VALUES
('Laptop', 'Electronics', 999.99, 50),
('Desk Chair', 'Furniture', 149.99, 30),
('Notebook', 'Stationery', 4.99, 200);
20. 如何更新表中的现有记录?
使用UPDATE语句修改数据,WHERE子句指定要更新的行。
-- 更新特定行
UPDATE customers
SET email = 'new.email@example.com', membership_level = 'Gold'
WHERE customer_id = 12345;
-- 基于条件批量更新
UPDATE products
SET price = price * 1.1 -- 价格上涨10%
WHERE category = 'Electronics' AND last_price_update < '2023-01-01';
21. 如何删除表中的记录?
使用DELETE FROM语句删除行,WHERE子句指定要删除的行。
-- 删除特定行
DELETE FROM orders
WHERE order_id = 7890;
-- 基于条件批量删除
DELETE FROM inactive_users
WHERE last_login_date < CURRENT_DATE - INTERVAL '2 years';
表管理
22. 如何创建新表?
使用CREATE TABLE语句定义表结构,包括列名、数据类型和约束。
CREATE TABLE employees (
employee_id SERIAL PRIMARY KEY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(100) UNIQUE,
hire_date DATE DEFAULT CURRENT_DATE,
department VARCHAR(50),
salary DECIMAL(10,2) CHECK (salary > 0),
manager_id INTEGER REFERENCES employees(employee_id)
);
CREATE TABLE orders (
order_id BIGSERIAL PRIMARY KEY,
customer_id INTEGER REFERENCES customers(customer_id),
order_date TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
status VARCHAR(20) CHECK (status IN ('Pending', 'Shipped', 'Delivered', 'Cancelled')),
total_amount DECIMAL(12,2)
);
23. 如何修改表结构?
使用ALTER TABLE语句添加、修改或删除列和约束。
-- 添加新列
ALTER TABLE products
ADD COLUMN weight_kg DECIMAL(5,2);
-- 修改列数据类型
ALTER TABLE customers
ALTER COLUMN phone_number TYPE VARCHAR(20);
-- 添加约束
ALTER TABLE orders
ADD CONSTRAINT fk_customer FOREIGN KEY (customer_id) REFERENCES customers(customer_id);
-- 删除列
ALTER TABLE employees
DROP COLUMN home_phone;
高级分析查询
24. 如何计算累积总和?
使用窗口函数计算累积汇总值。
SELECT
transaction_date,
amount,
SUM(amount) OVER (ORDER BY transaction_date) AS running_total,
SUM(amount) OVER (PARTITION BY EXTRACT(MONTH FROM transaction_date)
ORDER BY transaction_date) AS monthly_running_total
FROM transactions
WHERE account_id = 12345;
25. 如何识别数据中的异常值?
使用统计方法和窗口函数识别异常值。
WITH product_stats AS (
SELECT
product_id,
product_name,
daily_sales,
AVG(daily_sales) OVER () AS overall_avg,
STDDEV(daily_sales) OVER () AS overall_stddev
FROM product_daily_sales
)
SELECT
product_id,
product_name,
daily_sales,
overall_avg,
overall_stddev,
(daily_sales - overall_avg) / overall_stddev AS z_score
FROM product_stats
WHERE ABS((daily_sales - overall_avg) / overall_stddev) > 3; -- 3标准差以外的异常值
26. 如何进行数据透视(行转列)?
使用CASE WHEN或PIVOT函数实现行转列转换。
-- 使用CASE WHEN实现数据透视
SELECT
product_id,
SUM(CASE WHEN region = 'North' THEN sales_amount ELSE 0 END) AS north_sales,
SUM(CASE WHEN region = 'South' THEN sales_amount ELSE 0 END) AS south_sales,
SUM(CASE WHEN region = 'East' THEN sales_amount ELSE 0 END) AS east_sales,
SUM(CASE WHEN region = 'West' THEN sales_amount ELSE 0 END) AS west_sales,
SUM(sales_amount) AS total_sales
FROM regional_sales
GROUP BY product_id
ORDER BY total_sales DESC;
性能优化
27. 如何优化慢查询?
通过索引、查询重写和适当的分区优化查询性能。
-- 创建适当的索引
CREATE INDEX idx_orders_customer_date ON orders(customer_id, order_date DESC);
-- 重写复杂查询为更高效的形式
-- 原低效查询:
SELECT * FROM orders WHERE EXTRACT(YEAR FROM order_date) = 2023;
-- 优化后查询:
SELECT * FROM orders
WHERE order_date >= '2023-01-01' AND order_date < '2024-01-01';
-- 使用EXPLAIN分析查询计划
EXPLAIN ANALYZE
SELECT c.customer_name, COUNT(o.order_id) AS order_count
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.customer_name
ORDER BY order_count DESC
LIMIT 10;
28. 什么是CTE?如何使用?
CTE(公用表表达式)使用WITH子句创建临时结果集,提高复杂查询的可读性。
-- 计算各部门薪资统计
WITH department_stats AS (
SELECT
department,
COUNT(*) AS employee_count,
AVG(salary) AS avg_salary,
MAX(salary) AS max_salary
FROM employees
GROUP BY department
),
company_stats AS (
SELECT
COUNT(*) AS total_employees,
AVG(salary) AS company_avg_salary
FROM employees
)
SELECT
d.department,
d.employee_count,
d.avg_salary,
d.max_salary,
ROUND((d.avg_salary - c.company_avg_salary) / c.company_avg_salary * 100, 2) AS percentage_diff_from_avg
FROM department_stats d
CROSS JOIN company_stats c
ORDER BY d.avg_salary DESC;
实战案例分析
29. 如何计算用户留存率?
通过比较用户首次活跃日期和后续活跃日期计算留存。
WITH user_first_activity AS (
SELECT
user_id,
MIN(activity_date) AS first_active_date
FROM user_activities
GROUP BY user_id
),
retention_data AS (
SELECT
ufa.first_active_date AS cohort,
ufa.user_id,
ua.activity_date,
EXTRACT(DAY FROM ua.activity_date - ufa.first_active_date) AS day_difference
FROM user_first_activity ufa
JOIN user_activities ua ON ufa.user_id = ua.user_id
)
SELECT
cohort,
COUNT(DISTINCT user_id) AS cohort_size,
COUNT(DISTINCT CASE WHEN day_difference = 1 THEN user_id END) AS day1_retained,
ROUND(COUNT(DISTINCT CASE WHEN day_difference = 1 THEN user_id END) * 100.0 /
COUNT(DISTINCT user_id), 2) AS day1_retention_rate,
COUNT(DISTINCT CASE WHEN day_difference = 7 THEN user_id END) AS day7_retained,
ROUND(COUNT(DISTINCT CASE WHEN day_difference = 7 THEN user_id END) * 100.0 /
COUNT(DISTINCT user_id), 2) AS day7_retention_rate,
COUNT(DISTINCT CASE WHEN day_difference = 30 THEN user_id END) AS day30_retained,
ROUND(COUNT(DISTINCT CASE WHEN day_difference = 30 THEN user_id END) * 100.0 /
COUNT(DISTINCT user_id), 2) AS day30_retention_rate
FROM retention_data
GROUP BY cohort
ORDER BY cohort;
30. 如何分析销售漏斗转化率?
跟踪用户在各阶段的转化情况,计算转化率。
WITH funnel_steps AS (
-- 定义漏斗各阶段
SELECT 'Visited Website' AS step, COUNT(DISTINCT user_id) AS user_count FROM website_visits WHERE visit_date = CURRENT_DATE
UNION ALL
SELECT 'Product Viewed' AS step, COUNT(DISTINCT user_id) AS user_count FROM product_views WHERE view_date = CURRENT_DATE
UNION ALL
SELECT 'Added to Cart' AS step, COUNT(DISTINCT user_id) AS user_count FROM cart_additions WHERE addition_date = CURRENT_DATE
UNION ALL
SELECT 'Started Checkout' AS step, COUNT(DISTINCT user_id) AS user_count FROM checkout_starts WHERE start_date = CURRENT_DATE
UNION ALL
SELECT 'Completed Purchase' AS step, COUNT(DISTINCT user_id) AS user_count FROM completed_purchases WHERE purchase_date = CURRENT_DATE
),
funnel_with_prev AS (
SELECT
step,
user_count,
LAG(user_count) OVER (ORDER BY CASE step
WHEN 'Visited Website' THEN 1
WHEN 'Product Viewed' THEN 2
WHEN 'Added to Cart' THEN 3
WHEN 'Started Checkout' THEN 4
WHEN 'Completed Purchase' THEN 5
END) AS prev_count
FROM funnel_steps
)
SELECT
step,
user_count,
prev_count,
ROUND(user_count * 100.0 /
FIRST_VALUE(user_count) OVER (ORDER BY CASE step
WHEN 'Visited Website' THEN 1
WHEN 'Product Viewed' THEN 2
WHEN 'Added to Cart' THEN 3
WHEN 'Started Checkout' THEN 4
WHEN 'Completed Purchase' THEN 5
END), 2) AS overall_conversion_rate,
ROUND(user_count * 100.0 / NULLIF(prev_count, 0), 2) AS step_conversion_rate
FROM funnel_with_prev
ORDER BY CASE step
WHEN 'Visited Website' THEN 1
WHEN 'Product Viewed' THEN 2
WHEN 'Added to Cart' THEN 3
WHEN 'Started Checkout' THEN 4
WHEN 'Completed Purchase' THEN 5
END;
掌握SQL不仅是为了通过面试,更是成为数据分析师的基础能力,本文整理的30个SQL面试题覆盖90%以上的企业考核点,建议你结合实际业务场景多加练习,这样才能更好地应用SQL。