Introduction
In this article, the creation of dynamic pivots are explained. It starts with a simple pivot. The second step is to add lines for calculating totals. In this case, you can see that TransAct SQL has event handlers and in combination with the WITH CUBE
command, they come in very handy. Unfortunately the PIVOT
command in SQL Server (2005 and up) works with named column names. To make it dynamic, a little programming has to be done.
Background
One of my customers always loaded the delta of data in our system until he found out that it was possible to add each month all data into our system. We wanted to see how big the system growth became. The month he was starting to load full files instead of delta files was significantly showing in the result. When we created an Excel graph of the data, everyone was astonished. By adding totals, a simple management report was created which very simply told our customer of what our system is capable of.
Using the Code
The database used for this article is AdventureWorks
which can be found here. The queries attached to this project can be run in SQL Server Management Studio (SSMS) for SqlServer 2008 R2. The Adventureworks
database name is AdventureWorksDW2008R2
.
The Start - Where It Mostly Ends
Most developers know some SQL and when they have to create a query showing the relevance between two things, a query like this comes up:
SELECT var1, var2, count(var2) from table1 group by var1, var2
If we look at the AdventureWorksDW2008R2
database, the table FactSalesQuota
could lead to the next question. Based on the fields CalendarYear
, CalendarQuarter
and SalesAmountQuota
, it is possible to investigate which Amount
was sold per quarter per year. The query in start could be:
SELECT [CalendarYear],[CalendarQuarter],[SalesAmountQuota] _
FROM [AdventureWorksDW2008R2].[dbo].[FactSalesQuota]
The result is a long list that is being 'improved' with a SUM()
function and a GROUP BY
. But the 3 columns don't say a thing. The relevance cannot be seen.
How to Start Getting Dynamic
The 'improved' query is the start for dynamic pivoting. If we want to see an effective result between CalendarYear
and CalendarQuarter
, we have four steps to take:
- Store the result of the query in a temporary table.
- Find the unique values in
CalendarQuarter
columns and set them as columnnames in a varchar
. - Create the pivot command (based on the pivot command using the defined columnames) in a
varchar
. - Execute the created pivot command.
- I said four steps, but step 5 is often forgotten: DROP YOUR TEMPORARY TABLE.
And Now in Code
Start setting the queryresult in a temporary table:
SELECT [CalendarYear],[CalendarQuarter],SUM([SalesAmountQuota]) as SalesAmount
INTO #tempPivotTable
FROM [AdventureWorksDW2008R2].[dbo].[FactSalesQuota]
GROUP BY [CalendarYear],[CalendarQuarter]
Then, create the columnnames
based on the unique values in the CalendarQuarter
column. Each value should be cast as type varchar
and placed in brackets and your column is ready! To create the @Columns
mostly it is seen to do @columns = @columns + '[' + '.....' + ']'
. The danger in this code is that when '[' + '.....' + ']
' fails, you don't know what @columns
will be. Therefore, I prefer the use of the STUFF
command. It is one function and that result is placed in my variable.
DECLARE @columns VARCHAR(8000)
SELECT @columns = STUFF(( SELECT DISTINCT TOP 100 PERCENT
'],[' + cast([CalendarQuarter] as varchar)
FROM #tempPivotTable AS t2
ORDER BY '],[' + cast([CalendarQuarter] as varchar)
FOR XML PATH('')
), 1, 2, '') + ']'
Then create the query to collect the data for the pivot. Remember that you need an aggregate function to have the pivot work. In this case, we can use both MIN()
or MAX()
, since it is one amount per quarter per year.
DECLARE @query VARCHAR(8000)
SET @query = '
SELECT *
FROM #tempPivotTable
PIVOT
(
MAX(SalesAmount)
FOR [CalendarQuarter]
IN (' + @columns + ')
)
AS p '
Execute the created query:
EXECUTE(@query)
Drop your temporary table. Otherwise, you cannot run this query twice in a row.
DROP TABLE #tempPivotTable
In the project image, you can see the result. If you see the value NULL
in a field, you know the combination is not available.
Be a Dynamic-pivot-pro
Nice to have the pivot, but the next question will come from your manager. Do we sell more per year and do we sell more per quarter based over the years. It is time to do some counting over the rows while the query runs. We need the WITH CUBE
command in our basic query first.
SELECT [CalendarYear],[CalendarQuarter],SUM([SalesAmountQuota]) as SalesAmount
INTO #tempPivotTable
FROM [AdventureWorksDW2008R2].[dbo].[FactSalesQuota]
GROUP BY [CalendarYear],[CalendarQuarter]
WITH CUBE
If you run the query, a line is added at the top starting with NULL
and then the values. It represents the summed value per quarter over the years. But the first part of the question isn't answered: Do we sell more per year? At this point, we do need event handlers in transact SQL. Relax! I will help you through. The event handler needed is 'grouping'. When the group by
is executed in the SQL server, you want to show 'Total
' in the first column, otherwise the calenderyear
. So we also need the 'case when
' function to have this 'if
' executed. Your basic query changes:
SELECT CASE WHEN GROUPING(cast([CalendarYear] as varchar)) = 1
THEN 'Total'
ELSE cast([CalendarYear] as varchar)
END
as [CalendarYear],
[CalendarQuarter],SUM([SalesAmountQuota]) as SalesAmount
INTO #tempPivotTable
FROM [AdventureWorksDW2008R2].[dbo].[FactSalesQuota]
GROUP BY cast([CalendarYear] as varchar),[CalendarQuarter]
WITH CUBE
In this case, 'Total
' is a varchar
and CalendarYear
an integer
. That is why the cast to varchar
is added (also in the group by
). That is why we are a bit lucky! If calendaryear
had another value like 'year of the name_an_animal
' as starting point, the Total
row would START the result. A solution is given at the end of this article. If a Total
column needs to be added, it should be in the collection of CalendarQuarter
. So we also need the GROUPING
for CalendarQuarter
. The basic query expands again:
SELECT CASE WHEN GROUPING(cast([CalendarYear] as varchar)) = 1
THEN 'Total'
ELSE cast([CalendarYear] as varchar)
END
as [CalendarYear],
CASE WHEN GROUPING(cast([CalendarQuarter] as varchar)) = 1
THEN 'Total'
ELSE cast(CalendarQuarter as varchar)
END
as [CalendarQuarter],
SUM([SalesAmountQuota]) as SalesAmount
INTO #tempPivotTable
FROM [AdventureWorksDW2008R2].[dbo].[FactSalesQuota]
GROUP BY cast([CalendarYear] as varchar), cast([CalendarQuarter] as varchar)
WITH CUBE
If you would run the total query, you will see that:
- Quarter 3 is the best selling quarter
- We sell more each year, and the latest year isn't finished yet but gives a clue
- Your manager will be happy
- In one blink of an eye, you see the value of your data
Total query is:
SELECT CASE WHEN GROUPING(cast([CalendarYear] as varchar)) = 1
THEN 'Total'
ELSE cast([CalendarYear] as varchar)
END
as [CalendarYear],
CASE WHEN GROUPING(cast([CalendarQuarter] as varchar)) = 1
THEN 'Total'
ELSE cast(CalendarQuarter as varchar)
END
as [CalendarQuarter],
SUM([SalesAmountQuota]) as SalesAmount
INTO #tempPivotTable
FROM [AdventureWorksDW2008R2].[dbo].[FactSalesQuota]
GROUP BY cast([CalendarYear] as varchar), cast([CalendarQuarter] as varchar)
WITH CUBE
DECLARE @columns VARCHAR(8000)
SELECT @columns = STUFF(( SELECT DISTINCT TOP 100 PERCENT
'],[' + cast([CalendarQuarter] as varchar)
FROM #tempPivotTable AS t2
ORDER BY '],[' + cast([CalendarQuarter] as varchar)
FOR XML PATH('')
), 1, 2, '') + ']'
DECLARE @query VARCHAR(8000)
SET @query = '
SELECT *
FROM #tempPivotTable
PIVOT
(
MAX(SalesAmount)
FOR [CalendarQuarter]
IN (' + @columns + ')
)
AS p '
EXECUTE(@query)
DROP TABLE #tempPivotTable
Solution for Total Row at the Bottom
The point is that 'Total
' starts with a 't
' and that 'u,v,w,x,y,z
' can lead to a total row in the middle of your result set. In that case, you should look at your used collation in the database. In my case, it is: Latin1_General_CI_AS
. If you would look here, you could see that character 161 is after the 'z'. I know the word '¡Total
' now looks ugly but the row and/or column is clearly identifiable. The effect however is that the second column and the first row represent the total values. It looks more ugly, but if you start the 'Total
' with 'ZZ
', you can always end total rows and columns.
Final Thoughts
If instead of quarter
, the year
was used for columnnames
, you really would see the power of this query. If data for the year 2009 was added to the table, the year would immediately popup as a new column in the resultset. That makes these kind of queries pretty handy for:
- Management reports
- Quick scan of values in case of trouble
- Getting the real value of your data visible
History
- 24-5-2011: 1.0 Initial
- 24-5-2011: 1.1 Fixed typos because I started with a pivot on another table
I started in the eightees with Basic and QuickBasic. When the ninetees started dBase was my thing. Close to the millenium years I swapped to C++ and since 2003 it is only programming in c#.
There is one thing about me.
Not that kind of OO guy and definately not an ORM-fan. I know a database to well to give the importance of data out in the hands of a tool.