SQL Master: PIVOT Multiple Columns

In one of my previous post I showed you how to UNPIVOT multiple columns. On similar lines I also wanted to write on “How to PIVOT multiple columns?”, so this post was due for some time. Actually I was looking for some efficient way of doing it. Limitation of PIVOT operator is, it supports pivoting only on a single column. But you can always have multiple PIVOT operators in the FROM clause. I was trying to create a PIVOT query with multiple columns with multiple PIVOT operators. But at the end of it I found that our old fashioned CASE expression is performing much better than a multiple PIVOT operator query.

Even though I’m writing this post on how to write a multiple PIVOT operator query, my suggestion is use CASE expressions instead for getting better performance. Though personally I like to avoid CASE also. Normally I like to do it in Reporting Services, by creating a Matrix report. Now a days almost all Reporting Tools provides you an option of creating Matrix report. And good thing about Matrix report is unlike PIVOT operator you don’t need to hard code any column value.

If you try to write a PIVOT query with 2 PIVOT operators, and use same column in FOR clause you will get an error : Invalid column name NameOfColumn.

Or if you use same column, but by declaring it again and using a different alias name, you still get an error : The column name ValueOfColumn specified in the PIVOT operator conflicts with the existing column name in the PIVOT argument.

So what’s the solution? Solution is, declare the same column again, change the values in the column by some constant(you can add some constant, or you can concat some identifier ) and assign a new alias name to column.

Lets see the following example, I have used the AdventureWorks database of SQL Server 2005.


USE AdventureWorks GO SET ANSI_WARNINGS OFF SELECT CustomerId, SUM([Q2001]) AS Qty2001, SUM([Q2002]) AS Qty2002, SUM([V2001]) AS Val2001, SUM([V2002]) AS Val2002 FROM ( SELECT H.CustomerId, SUM(D.OrderQty) AS TotalQty, SUM(D.LineTotal) AS TotalVal, 'Q'+CONVERT(VARCHAR(4),H.OrderDate,120) AS QYear, 'V'+CONVERT(VARCHAR(4),H.OrderDate,120) AS VYear FROM Sales.SalesOrderDetail AS D INNER JOIN Sales.SalesOrderHeader AS H ON D.SalesOrderId = H.SalesOrderId WHERE D.ProductId=771 AND H.OrderDate >='20010101' AND H.OrderDate <'20030101' GROUP BY H.CustomerId, CONVERT(VARCHAR(4),H.OrderDate,120) )Main PIVOT ( SUM(TotalQty) FOR QYear IN ([Q2001],[Q2002]) ) PQ PIVOT ( SUM(TotalVal) FOR VYear IN ([V2001],[V2002]) ) PV GROUP BY CustomerId ORDER BY CustomerId GO

USE AdventureWorks
GO
SET ANSI_WARNINGS OFF
SELECT CustomerId,
        SUM([Q2001]) AS Qty2001,
        SUM([Q2002]) AS Qty2002,
        SUM([V2001]) AS Val2001,
        SUM([V2002]) AS Val2002
FROM (
        SELECT     H.CustomerId,
                SUM(D.OrderQty) AS TotalQty,
                SUM(D.LineTotal) AS TotalVal,
                'Q'+CONVERT(VARCHAR(4),H.OrderDate,120) AS QYear,
                'V'+CONVERT(VARCHAR(4),H.OrderDate,120) AS VYear
        FROM Sales.SalesOrderDetail AS D INNER JOIN
        Sales.SalesOrderHeader AS H ON D.SalesOrderId = H.SalesOrderId
        WHERE D.ProductId=771
        AND H.OrderDate >='20010101'
        AND H.OrderDate <'20030101'
        GROUP BY H.CustomerId,
                CONVERT(VARCHAR(4),H.OrderDate,120)
    )Main
PIVOT
    (
        SUM(TotalQty)
        FOR QYear IN ([Q2001],[Q2002])
    ) PQ
PIVOT
    (
        SUM(TotalVal)
        FOR VYear IN ([V2001],[V2002])
    ) PV
GROUP BY CustomerId
ORDER BY CustomerId
GO

The query returns total quantity and line amount for year 2001 and 2002 for the product id 771 for all customers. If look at the query carefully in Main sub query, CONVERT(VARCHAR(4),H.OrderDate,120) this convert statement will take out the Year part from the OrderDate column. I have declared the same column twice, at first I concatenated Q to the Year, and at second time I concatenated the V. Just execute the Main sub query, so it will be easy to understand for you.


SELECT H.CustomerId, SUM(D.OrderQty) AS TotalQty, SUM(D.LineTotal) AS TotalVal, 'Q'+CONVERT(VARCHAR(4),H.OrderDate,120) AS QYear, 'V'+CONVERT(VARCHAR(4),H.OrderDate,120) AS VYear FROM Sales.SalesOrderDetail AS D INNER JOIN Sales.SalesOrderHeader AS H ON D.SalesOrderId = H.SalesOrderId WHERE D.ProductId=771 AND H.OrderDate >='20010101' AND H.OrderDate <'20030101' GROUP BY H.CustomerId, CONVERT(VARCHAR(4),H.OrderDate,120)

SELECT     H.CustomerId,
                SUM(D.OrderQty) AS TotalQty,
                SUM(D.LineTotal) AS TotalVal,
                'Q'+CONVERT(VARCHAR(4),H.OrderDate,120) AS QYear,
                'V'+CONVERT(VARCHAR(4),H.OrderDate,120) AS VYear
FROM Sales.SalesOrderDetail AS D INNER JOIN
Sales.SalesOrderHeader AS H ON D.SalesOrderId = H.SalesOrderId
WHERE D.ProductId=771
                AND H.OrderDate >='20010101'
                AND H.OrderDate <'20030101'
GROUP BY H.CustomerId,
                CONVERT(VARCHAR(4),H.OrderDate,120)

Now we have 2 columns, with different values, and we can use them in different PIVOT with same effect, and that’s what I have done in my 1st query.

Here is a CASE expression version of same query, which gives much better performance if you scale it for large amount data.


USE AdventureWorks GO SELECT H.CustomerId, SUM(CASE YEAR(H.OrderDate) WHEN 2001 THEN D.OrderQty END) AS Qty2001, SUM(CASE YEAR(H.OrderDate) WHEN 2002 THEN D.OrderQty END) AS Qty2002, SUM(CASE YEAR(H.OrderDate) WHEN 2001 THEN D.LineTotal END) AS Val2001, SUM(CASE YEAR(H.OrderDate) WHEN 2002 THEN D.LineTotal END) AS Val2002 FROM Sales.SalesOrderDetail AS D INNER JOIN Sales.SalesOrderHeader AS H ON D.SalesOrderId = H.SalesOrderId WHERE D.ProductId=771 AND H.OrderDate >='20010101' AND H.OrderDate <'20030101' GROUP BY H.CustomerId ORDER BY H.CustomerId GO

USE AdventureWorks
GO
SELECT     H.CustomerId,
               SUM(CASE YEAR(H.OrderDate)
                      WHEN 2001
                      THEN D.OrderQty
                      END) AS Qty2001,
               SUM(CASE YEAR(H.OrderDate)
                      WHEN 2002
                      THEN D.OrderQty
                      END) AS Qty2002,
              SUM(CASE YEAR(H.OrderDate)
                     WHEN 2001
                     THEN D.LineTotal
                     END) AS Val2001,
              SUM(CASE YEAR(H.OrderDate)
                     WHEN 2002
                     THEN D.LineTotal
                     END) AS Val2002
FROM Sales.SalesOrderDetail AS D INNER JOIN
Sales.SalesOrderHeader AS H ON D.SalesOrderId = H.SalesOrderId
WHERE D.ProductId=771
          AND H.OrderDate >='20010101'
          AND H.OrderDate <'20030101'
GROUP BY H.CustomerId
ORDER BY H.CustomerId
GO

You can test the performance of both queries. If you want to scale it for larger data you can remove the WHERE conditions added by me. Total execution time for CASE query is almost half to that of PIVOT query.

18 comments:

Brad Schulz6:57:00 PM
Hi Mangal...

Here's another approach using the PIVOT operator, though it is more costly than the example you posted (sorry if the formatting doesn't come through correctly):

with CustTotals as
(
select CustomerID
,str(year(H.OrderDate),4) as Yr
,sum(D.OrderQty) as TotalQty
,sum(D.LineTotal) as TotalVal
from Sales.SalesOrderDetail D
join Sales.SalesOrderHeader H
on D.SalesOrderID=H.SalesOrderID
where D.ProductID=771
and H.OrderDate>='20010101'
and H.OrderDate <'20030101'
group by H.CustomerID
,year(H.OrderDate)
)
select
P1.CustomerID,Qty2001,Qty2002,Val2001,Val2002
from
(select CustomerID, PivotKey='Qty'+Yr, TotalQty
from CustTotals) I
pivot (sum(TotalQty) for PivotKey in ([Qty2001],[Qty2002])) P1
join
(select CustomerID, PivotKey='Val'+Yr, TotalVal
from CustTotals) I
pivot (sum(TotalVal) for PivotKey in ([Val2001],[Val2002])) P2
on P1.CustomerID=P2.CustomerID;

--Brad
vipula7:11:00 PM
Hi Mangal,
I liked your second approch with case statement. It is efficient and flexible. I had to group a computaion and couldn't do it without the second approch.
thanks
Vipster
Sheev6:07:00 PM
This was very helpful, thanks
Sheev6:08:00 PM
This was very helpful, thanks
Naomi11:55:00 PM
You can also find my article on 'Dynamic Pivot on multiple columns'
http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/dynamic-pivot-on-multiple-columns
Naomi11:55:00 PM
You can also see my article on Dynamic Pivot on multiple columns
http://blogs.lessthandot.com/index.php/DataMgmt/DataDesign/dynamic-pivot-on-multiple-columns
Unknown10:01:00 PM
Thanks!, CASE example helped me :)
Unknown10:01:00 PM
Thanks! CASE example helped me :)
- b8:05:00 PM
I appreciate the example. I was hoping to use PIVOT, but I had to group on multiple columns and cases, and this logic sparked my mind to understand how to handle my particular situation. Thanks for posting this alternative that works very well.
- b8:05:00 PM
I appreciate the example. I was hoping to use PIVOT, but I had to group on multiple columns and cases, and this logic sparked my mind to understand how to handle my particular situation. Thanks for posting this alternative that works very well.
Unknown5:04:00 PM
Excellent example, thank you so much
Unknown10:54:00 AM
Saved the day!! Many Thanks!!!!
huwk12:46:00 PM
THANKS!
huwk12:50:00 PM
This comment has been removed by the author.
Unknown10:31:00 PM
Thanks Mangal, This helped a sql beginner get past a seeming impossible task.
Curling Technologies7:38:00 AM
How to make return Zero instead of Null
Curling Technologies7:39:00 AM
how make it to return Zero instead of Null
Unknown9:47:00 AM
You can simply use ISNULL.

SQL Master

Pages

Search This Blog

Pageviews

Ads

Friday, June 26, 2009

PIVOT Multiple Columns

18 comments:

Blog Archive

Labels

About Me

Ads

Followers

Useful