As a data engineer, a typical working day for me, apart from meetings, is full of SELECT, FROM and WHERE. But these basic statements are not enough, especially for the complex ad hoc analysis that is increasingly common nowadays.

SQL is a powerful language. It is a declarative language where we define what we want and the engine finds a way to achieve it. The language is evolving to adapt to the increasing variety of analysis needs. I wrote an article about an advanced SQL feature to deal with hierarchical data. And today, let’s explore another beyond-the-basic feature: PIVOT.

Problem Statement

Imagine you are working as a data engineer for a retail company. The company wants to analyze product sales data to identify trends and opportunities for growth. The data is stored in a table called Sales with the following structure:

ProductIDDateAmount
1012024-01-10300
1012024-12-15500
1012025-01-15700
1012025-02-011100
1022024-02-20800
1022024-11-03400
1022025-01-20900
1022025-02-22650
1032023-07-251200
1032024-08-151500
1032025-02-101250
1042023-12-05400
1042024-06-30800
1042025-01-30300
1042025-02-25500

This structure is not good for reports. The company wants this data served in a format where years are represented as columns for easier comparison across products.

GROUP BY - The Amateur Way

A very straightforward approach to this problem is to use the GROUP BY statement. We will group by ProductID, and we will get the sum column for each month. Below is a SQL Server example. Other SQL engines should have similar syntax.

select
    ProductID
    ,sum(iif(year(Date)=2023, Amount, null)) as [2023]
    ,sum(iif(year(Date)=2024, Amount, null)) as [2024]
    ,sum(iif(year(Date)=2025, Amount, null)) as [2025]
from Sales
group by ProductID;

GROUP BY query result
ProductID202320242025
101null8001800
102null12001550
103120015001250
104400800800

At first glance, it’s simple, and it works. Sometimes just working is enough.

SQL PIVOT - The Complex Way

PIVOT is a operator in SQL that allows you to transform rows into columns. This transformation is particularly useful when summarizing data and creating a more interpretable format for analysis. The below SQL Server query achieves similar results:

select 
    *
from (
    select 
        ProductID, 
        year(Date) as Year, 
        Amount 
    from Sales
) as ToPivotSales pivot (
    sum(Amount) for Year in ([2023], [2024], [2025])
) as PivotedSales

PIVOT query result
ProductID202320242025
101null8001800
102null12001550
103120015001250
104400800800

To do PIVOT, we first need a subquery to specify the columns we need. In this case they are ProductID, Year and Amount. If you don’t like subqueries, CTE (Common Table Expression) works as well.

In the PIVOT decalaration:

  • sum(Amount): specifies that the aggregation function to be applied is sum, which will sum the Amount values.
  • for Year in ([2023], [2024], [2025]): defines how the pivoting will occur:
    • for Year: indicates that the values in the Year column will be used to create new columns in the result set.
    • in ([2023], [2024], [2025]): specifies the specific years that will become the new columns in the result. Each of these years will have a corresponding column that contains the summed Amount for that year.

To be honest, I don’t like the syntax of PIVOT. It is terrible to me. It involves subqueries and CTEs, it uses more levels of indentation, and it’s harder to scan through. And if we look at the execution plan, it is not really faster than GROUP BY.

With that being said, PIVOT still has advantages over GROUP BY that it requires less boilerplate code. In the examples above, to add a new year to the query with GROUP BY, you have to copy, paste, and edit in 2 places. With PIVOT, all you have to do is add a new value to the list. This makes PIVOT shine in situations where we have to deal with a long list of values.

Dynamic PIVOT - The Hacker Way

While both GROUP BY and PIVOT are useful, they have the limitation that you must explicitly specify the list of values. When the data is small, this is fine. But it becomes a problem when dealing with evolving, large data where you do not know or do not want to manually list all possible values. Imagine you build a report of annual sales, you clearly don’t want to update the query every year.

Dynamic PIVOT is a technique that allows us to pivot data without hard-coding the pivot column values. It is not a standard SQL operation. Therefore, the syntax may vary between different SQL engines. In Snowflake SQL, you can achieve dynamic pivoting by something as simple as this.

...
pivot (
    sum(Amount) for Year in (any order by Year)
)

Most other SQL engines don’t offer this level of simplicity. They require a bit of complexity. Below is a SQL Server query example.

declare
    @cols as nvarchar(max),
    @query as nvarchar(max);


select @cols = string_agg(quotename(Year), ',') within group (order by Year)
from (select distinct year(Date) as Year from Sales) as YearList;


set @query = (
    'select *
    from (
        select ProductID, year(Date) as Year, Amount
        from Sales
    ) as ToPivotSales pivot (
        sum(Amount) for Year in (' + @cols + ')
    ) as PivotedSales;'
);


exec sp_executesql @query;

Dynamic PIVOT query result
ProductID202320242025
101null8001800
102null12001550
103120015001250
104400800800

Although the code is not pleasing to eyes, the concept is simple. We need an additional step to find all the unique values and store them in a variable. Then we construct the query with the above list of unique values. Execute it and we get the expected result.

Final Thought

Mastering SQL is not easy. SQL is more than just a query language; it’s a powerful tool that, when used effectively, can turn raw data into actionable intelligence. Advanced SQL techniques such as PIVOT are not very common in our day-to-day work. But we should know what we have in our toolbox. So that we can quickly select the right tool from the toolbox where the right job applies.

* You can find the execution of SQL query examples in this post at https://dbfiddle.uk/2jD1lHkL