学习SQL：SQL Server数据透视表

最新推荐文章于 2025-03-20 23:19:52 发布

culuo4781

最新推荐文章于 2025-03-20 23:19:52 发布

阅读量1.7k

点赞数

文章标签：数据库 python java 大数据编程语言

原文链接：https://www.sqlshack.com/learn-sql-sql-server-pivot-tables/

版权

本文深入探讨如何使用SQL Server PIVOT运算符创建报告，从简单查询开始，逐步过渡到动态PIVOT。通过示例展示了如何处理多个表格数据，将城市和呼叫结果组合，形成最终报告，强调了静态和动态PIVOT表的用法，特别是动态SQL在处理未知列数时的优势。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

In the previous few articles in this series, we’ve set the foundations on how to create a report. We’ll take one step further and see how to use the SQL Server PIVOT table operator. We’ll start from the simple query and slowly progress towards dynamic SQL and PIVOT. Let’s start.

在本系列的前几篇文章中，我们为如何创建报告奠定了基础。我们将进一步走一步，看看如何使用SQL Server PIVOT表运算符。我们将从简单的查询开始，然后逐步发展到动态SQL和PIVOT。开始吧。

数据模型和一般思路 (Data model and the general idea)

The data model we’ll use is the same one we’ve been using throughout this series.

我们将使用的数据模型与本系列中一直使用的数据模型相同。

SQL Server PIVOT TABLE - the data model we'll use

Our task today is to create a report (using SQL Server PIVOT operator) where each city will be in a separate row, and where we’ll count the number of all call outcomes related to each city. Therefore, all outcomes (all values from the dictionary) shall be columns in our report.

今天的任务是创建一个报告（使用SQL Server PIVOT运算符），其中每个城市将位于单独的行中，并在其中计算与每个城市相关的所有呼叫结果的数量。因此，所有结果（字典中的所有值）都应在我们的报告中。

Since “a picture is worth a thousand words”, we’ll describe our final goal with the picture too.

由于“一张图片值得一千个单词”，因此我们还将用图片描述我们的最终目标。

To achieve this, we’ll need to use data from 4 tables call, call_outcome, customer, and city. Therefore, it would be good that you take a closer look at the model below and analyze how these tables are related.

为此，我们需要使用来自4个表call ， call_outcome ， customer和city的数据 。因此，最好仔细看一下下面的模型并分析这些表之间的关系。

We’ll start by examining the data in the tables and, with each query, move one step closer to the desired result (query). The approach used here is the same as the one used in Learn SQL: Create a report manually using SQL queries article.

我们将从检查表中的数据开始，并在每次查询时，将其向期望的结果（查询）靠近一步。这里使用的方法与“ 学习SQL：使用SQL查询手动创建报告”文章中使用的方法相同。

报告类别和数据 (Report categories and data)

The first set of queries we’ll start with is the one examining data currently present in all 4 tables we’ll need to use to create the report. We’ll be doing it so we can confirm that the final report returned what it should. After that, we’ll create queries which return reporting categories and the report data, as well as SQL Server PIVOT table queries.

我们将开始的第一组查询是检查所有用于创建报告的所有4个表中当前存在的数据。我们将这样做，以便可以确认最终报告是否返回了应有的结果。之后，我们将创建查询，这些查询返回报告类别和报告数据以及SQL Server PIVOT表查询。

-- 1 -- select data we need
SELECT * FROM call;
SELECT * FROM call_outcome;
SELECT * FROM customer;
SELECT * FROM city;

These queries are as simple as they could be, so there is nothing special to comment regarding their syntax. Regarding the data returned, we should be aware of the following:

这些查询非常简单，因此就语法而言，没有什么特别的注释。关于返回的数据，我们应该注意以下几点：

The city table contains 6 cities, and therefore we should have 6 rows in the final report
城市表包含6个城市，因此最终报告中应有6行
call_outcome table, so we should have 3 columns for outcomes (4 columns total -> one is for city name) call_outcome表中可能有3个结果，因此我们应该有3列用于结果（总共4列->一列用于城市名称）
call has 10 rows (only 8 on the picture above), so in the final table, the sum of all outcomes should be 10 (each call can have exactly 1 outcome) 调用包含10行（上图中只有8行），因此在最终表中，所有结果的总和应为10（每个调用可以恰好具有1个结果）

The next thing to do is to prepare reporting categories. We want to have a combination of all cities and all possible call outcomes. To achieve that, we’ll use the CROSS JOIN (Cartesian product).

接下来要做的是准备报告类别。我们希望将所有城市和所有可能的通话结果结合在一起。为此，我们将使用CROSS JOIN（笛卡尔积）。

-- 2 -- report categories
SELECT 
  c.id AS city_id,
  c.city_name,
  co.id AS call_outcome_id,
  co.outcome_text
FROM call_outcome co
CROSS JOIN city c
ORDER BY
  c.id ASC,
  co.id ASC;

Having all the reporting categories shall guarantee that we’ll have a row in the report, no matter if that pair has data or not. And that is what we want – to see 0 in that reporting category, and not to miss that category entirely.

具有所有报告类别将确保我们将在报告中保留一行，无论该对是否有数据。这就是我们想要的-在该报告类别中看到0，而不是完全错过该类别。

The next thing to do is joining all 4 tables containing the data we need.

下一步是将包含我们所需数据的所有4个表联接在一起。

-- 3 -- report data
SELECT 
  ci.id AS city_id,
  co.id AS call_outcome_id,
  DATEDIFF(SECOND, c.start_time, c.end_time) AS call_duration
FROM call c
INNER JOIN call_outcome co ON c.call_outcome_id = co.id
INNER JOIN customer cu ON c.customer_id = cu.id
INNER JOIN city ci ON cu.city_id = ci.id;

The query itself is not complex. In the result, we should notice that we have 10 rows, the same number as the number of calls we have in the database. Also, I’ve used the DATEDIFF function to return the duration of each call in seconds. I could simply put call.id here, but I wanted to remind ourselves of the DATEDIFF function we’ve mentioned in this article. We could use this duration if we need a SUM/AVG of seconds spent per each reporting category (city & call outcome).

查询本身并不复杂。结果，我们应该注意到，我们有10行，与数据库中的调用数相同。另外，我使用了DATEDIFF函数以秒为单位返回每个调用的持续时间。我可以简单地将call.id放在此处，但是我想提醒自己我们在本文中提到的DATEDIFF函数。如果我们需要为每个报告类别（城市和通话结果）花费几秒钟的SUM / AVG，则可以使用此持续时间。

没有SQL Server PIVOT表的报表 (Report without SQL Server PIVOT TABLE)

Now we’re ready to join categories and data. We’ll use both previously mentioned queries as subqueries and join them using LEFT JOIN (in order to have all the categories present in the final output).

现在我们准备加入类别和数据。我们将使用前面提到的两个查询作为子查询，并使用LEFT JOIN将它们联接（以使所有类别都出现在最终输出中）。

-- 4 -- report categories & data (without pivot)
SELECT
  rc.city_id,
  rc.city_name,
  rc.call_outcome_id,
  rc.outcome_text,
  rd.call_duration
FROM
(
  SELECT 
    c.id AS city_id,
    c.city_name,
    co.id AS call_outcome_id,
    co.outcome_text
  FROM call_outcome co
  CROSS JOIN city c
) rc
 
LEFT JOIN
 
(
  SELECT 
    ci.id AS city_id,
    co.id AS call_outcome_id,
    DATEDIFF(SECOND, c.start_time, c.end_time) AS call_duration
  FROM call c
  INNER JOIN call_outcome co ON c.call_outcome_id = co.id
  INNER JOIN customer cu ON c.customer_id = cu.id
  INNER JOIN city ci ON cu.city_id = ci.id
) rd ON rc.city_id = rd.city_id AND rc.call_outcome_id = rd.call_outcome_id
 
ORDER BY
  rc.city_id,
  rc.call_outcome_id;

We have all 18 combinations (6 different cities * 3 different call outcomes), and we also have all 10 calls present here (rows with the call duration <> NULL).

我们有18个组合（6个不同的城市* 3个不同的通话结果），并且这里还存在所有10个通话（通话持续时间<> NULL的行）。

Let’s now create the SQL Server PIVOT query.

现在让我们创建SQL Server PIVOT查询。

SQL Server PIVOT表格（静态） (SQL Server PIVOT TABLE (static))

So far, we’ve managed to pull the data we need, and we have them as the list. We could export data to Excel and make transformations there. We’ll actually do this in the upcoming article. Still, today, we want to solve this using SQL Server PIVOT operator. In this section, we’ll cover the “static” SQL Server PIVOT. Let’s take a look at the query used and the result first.

到目前为止，我们已经成功提取了所需的数据，并将其作为列表。我们可以将数据导出到Excel并在那里进行转换。实际上，我们将在下一篇文章中进行此操作。仍然，今天，我们仍想使用SQL Server PIVOT运算符解决此问题。在本节中，我们将介绍“静态” SQL Server PIVOT。让我们先看看所使用的查询和结果。

-- 5 -- report (including static PIVOT)
SELECT * FROM   
(
  SELECT
    -- rc.city_id,
    rc.city_name,
    -- rc.call_outcome_id,
    rc.outcome_text,
    rd.call_duration
  FROM
  (
    SELECT 
      c.id AS city_id,
      c.city_name,
      co.id AS call_outcome_id,
      co.outcome_text
    FROM call_outcome co
    CROSS JOIN city c
  ) rc
 
  LEFT JOIN
 
  (
    SELECT 
      ci.id AS city_id,
      co.id AS call_outcome_id,
      DATEDIFF(SECOND, c.start_time, c.end_time) AS call_duration
    FROM call c
    INNER JOIN call_outcome co ON c.call_outcome_id = co.id
    INNER JOIN customer cu ON c.customer_id = cu.id
    INNER JOIN city ci ON cu.city_id = ci.id
  ) rd ON rc.city_id = rd.city_id AND rc.call_outcome_id = rd.call_outcome_id
) report_data 
PIVOT(
    COUNT(call_duration) 
    FOR outcome_text IN (
        [call started], 
        [finished - successfully], 
        [finished - unsuccessfully])
) AS pivot_table;

We can notice that the result is exactly what we wanted. We have each city in one row and all 3 categories for call outcomes in separate columns. If there is no data for a certain city-outcome pair, that cell shall contain the value 0.

我们可以注意到结果正是我们想要的。我们将每一座城市排成一排，并将所有3个类别的呼叫结果放在单独的列中。如果没有关于某个城市结果对的数据，则该小区应包含值0。

Let’s comment on the SQL Server PIVOT query now. These are a few things I find important to mention here:

现在让我们对SQL Server PIVOT查询进行评论。这些是我在这里需要提及的重要事项：

In the first part of the query – report_data, we’ve copy-pasted the query from the previous section. The only change was that we had no ORDER BY and GROUP BY in this part of the query because they can’t be applied here (they should go after the final query result is generated)
在查询的第一部分– report_data中，我们从上一节中复制粘贴了该查询。唯一的变化是我们在查询的这一部分中没有ORDER BY和GROUP BY，因为它们不能在此处应用（它们应在生成最终查询结果之后使用）
The PIVOT part of the query consists of 2 parts. In the first, we’ll define which aggregate function we want to apply. In our case, this is – COUNT(call_duration). In FOR part of the query, we’ll define columns. We literary list all the values we want to have as columns. This is hard-coding. If we add new value to the table, that won’t impact the query (and it should)
查询的PIVOT部分由2部分组成。首先，我们将定义要应用的聚合函数。在我们的例子中，这是– COUNT（call_duration）。在查询的FOR部分，我们将定义列。我们在文学中将要拥有的所有值列为列。这是硬编码。如果我们向表中添加新值，那将不会影响查询（它应该）

This query does its job, and it will work perfectly if we never change (add new, delete, update their names) call outcomes in the related table. That could prove to be the problem in cases we need to make changes in the dictionary. We don’t want to think if there is a query that doesn’t work as expected. To solve this, we’ll need to use a new concept – dynamic SQL.

该查询可以完成工作，并且如果我们从不更改（在相关表中）更改（添加，删除，更新其名称）调用结果，它将可以完美地工作。在我们需要对字典进行更改的情况下，这可能会成为问题。我们不想考虑是否存在无法按预期运行的查询。为了解决这个问题，我们需要使用一个新概念–动态SQL。

动态SQL Server PIVOT表 (Dynamic SQL Server PIVOT TABLE)

Before moving to the code, let’s shortly explain what the dynamic SQL really is (we’ll give a much more detailed review of it in upcoming articles). The simplest explanation is that in SQL, you can “build” your queries as strings, and then pass that string as a parameter for the SQL Server stored procedure. This procedure shall execute the SQL statement(s) stored in that string (of course, if the syntax is OK).

在转向代码之前，让我们简短地解释一下动态SQL的真正含义（我们将在以后的文章中对其进行更详细的介绍）。最简单的解释是，在SQL中，您可以将查询“构建”为字符串，然后将该字符串作为SQL Server存储过程的参数传递。该过程将执行存储在该字符串中SQL语句（当然，如果语法正确）。

While this might sound like something not so commonly used, it has quite a few places where it makes your life much easier. Creating a SQL Server PIVOT table query with an unknown number of columns is exactly one such case.

虽然这听起来好像不太常用，但它在很多地方使您的生活更加轻松。创建具有未知列数SQL Server PIVOT表查询正是这种情况。

-- 6 -- report (including dynamic PIVOT)
DECLARE 
    @columns  NVARCHAR(MAX) = '', 
    @query    NVARCHAR(MAX) = '';
 
-- get all column names for the table we need for pivot
SELECT 
  @columns += QUOTENAME(TRIM(co.outcome_text)) + ','
FROM 
  call_outcome co
ORDER BY 
  co.outcome_text;
 
-- remove "," from the end of the string
SET @columns = LEFT(@columns, LEN(@columns) - 1);
 
-- dynamic SQL query
SET @query ='
SELECT * FROM   
(
  SELECT
    -- rc.city_id,
    rc.city_name,
    -- rc.call_outcome_id,
    rc.outcome_text,
    rd.call_duration
  FROM
  (
    SELECT 
      c.id AS city_id,
      c.city_name,
      co.id AS call_outcome_id,
      co.outcome_text
    FROM call_outcome co
    CROSS JOIN city c
  ) rc
 
  LEFT JOIN
 
  (
    SELECT 
      ci.id AS city_id,
      co.id AS call_outcome_id,
      DATEDIFF(SECOND, c.start_time, c.end_time) AS call_duration
    FROM call c
    INNER JOIN call_outcome co ON c.call_outcome_id = co.id
    INNER JOIN customer cu ON c.customer_id = cu.id
    INNER JOIN city ci ON cu.city_id = ci.id
  ) rd ON rc.city_id = rd.city_id AND rc.call_outcome_id = rd.call_outcome_id
) report_data 
PIVOT(
    COUNT(call_duration) 
    FOR outcome_text IN ('+ @columns +')
) AS pivot_table;';
 
-- execute dynamic query
EXECUTE sp_executesql @query;

The idea of this approach is as follows:

这种方法的想法如下：

We’ll declare a variable where to store all column names (@columns), and the variable where to store the complete query (@query)
我们将声明一个用于存储所有列名称的变量（@columns），以及用于存储完整查询的变量（@query）
Using the SELECT query, we’ll find all values stored in the outcome_text column. The following code – @columns += QUOTENAME(TRIM(co.outcome_text)) + ‘,’, shall append column name to the list of all previous column names returned by the query. The result is that we have all column names stored in the variable @columns
使用SELECT查询，我们将找到存储在result_text列中的所有值。以下代码– @columns + = QUOTENAME（TRIM（co.outcome_text））+'，'将列名附加到查询返回的所有先前列名的列表中。结果是我们将所有列名称存储在变量@columns中
In the @query variable, we’ll store the complete query from the previous section, except for the part where columns are defined. We’ll get this part from the variable @columns
在@query变量中，我们将存储上一部分的完整查询，但定义列的部分除外。我们将从变量@columns中获得这一部分
The last thing we need to do is to pass the complete query as the parameter to the SQL Server system procedure sp_executesql
我们需要做的最后一件事是将完整的查询作为参数传递给SQL Server系统过程sp_executesql

SQL Server PIVOT TABLE - dynamic SQL query

The final result is the same as in the previous section, but this time we’re sure our query will work even if we make changes to the outcome.outcome_text values. Also, you can easily modify this query and calculate any other values, e.g., SUM/AVG call duration per city-outcome pair.

最终结果与上一节中的结果相同，但是这次我们确定，即使我们对result.outcome_text值进行了更改，我们的查询仍然可以使用。同样，您可以轻松修改此查询并计算任何其他值，例如，每个城市结果对的SUM / AVG通话时间。

结论 (Conclusion)

The SQL Server PIVOT operator gives you a completely new view of what you can achieve directly on the database layer. When combined with the dynamic SQL, this goes even further. I strongly encourage you to play with it – there is no better way of learning than to try it yourself. In the next article, we’ll show how we could use SQL query output and create tables and graphs in Excel (this shall work not only in SQL Server but generally).

SQL Server PIVOT运算符为您提供了可以在数据库层上直接实现的功能的全新视图。当与动态SQL结合使用时，这甚至更进一步。我强烈建议您尝试使用它-没有比尝试自己更好的学习方法了。在下一篇文章中，我们将展示如何使用SQL查询输出并在Excel中创建表和图形（这不仅适用于SQL Server，而且通常也适用）。