Click here to Skip to main content
Click here to Skip to main content

Calculate aggregates for dynamic columns using UNPIVOT

, 17 Jan 2013
Rate this:
Please Sign up or sign in to vote.
This tip describes how to calculate an aggregate for dynamically defined columns using UNPIVOT clause.

Introduction 

Sometimes an aggregate calculation such as SUM or AVG is needed on dynamically defined columns. For example, if you have several columns in a table, you may need to calculate a SUM for columns 1 and 2 or in another situation for columns 2 and 3.

This can be done in several ways, for example using a cursor, fetching all the values and aggregating the needed result. Another way could be to define a condition for each column if the column is included in the calculation or not. Based on that information, do the aggregation and so on.

This tip demonstrates how this task can be accomplished by using UNPIVOT with dynamically defined set of columns.  

Test data 

First we need some test data. For that let's create a table.

CREATE TABLE TestData (
   RowId int,
   Col1  int,
   Col2  int,
   Col3  int
); 

And populate it with some data.

INSERT INTO TestData VALUES (1, 1, 2, 3);
INSERT INTO TestData VALUES (2, 4, 5, 6);
INSERT INTO TestData VALUES (3, 7, 8, 9);
INSERT INTO TestData VALUES (4, 10, 11, 12);

Intermediate results 

To understand the solution, let's have a look how the data from separate columns is transferred to rows.

For the first situation, values from columns Col1 and Col2 are needed in a single result column in order to calculate a sum. For this, the following statement can be used.

SELECT td2.RowId, td2.ColHeader, td2.ColValues
FROM (SELECT RowId, Col1, Col2 
      FROM TestData) td
UNPIVOT (ColValues FOR ColHeader IN (Col1, Col2)) AS td2; 

The query above selects the values from both Col1 and Col2 and places them in the result column called ColValues. The name of the source column is placed in ColHeader. The RowId is included to show from which row the data is coming from. 

So the result looks like the following

RowId   ColHeader   ColValues
-----   --------    ---------
1       Col1        1
1       Col2        2
2       Col1        4
2       Col2        5
3       Col1        7
3       Col2        8
4       Col1        10
4       Col2        11 

The same query to get values from Col2 and Col3 would look like 

SELECT td2.RowId, td2.ColHeader, td2.ColValues
FROM (SELECT RowId, Col2, Col3 
      FROM TestData) td
UNPIVOT (ColValues FOR ColHeader IN (Col2, Col3)) AS td2; 

And the result 

RowId   ColHeader   ColValues
-----   --------    ---------
1       Col2        2
1       Col3        3
2       Col2        5
2       Col3        6
3       Col2        8
3       Col3        9
4       Col2        11
4       Col3        12 

To calculate the aggregate 

Now, when we have the intermediate results, let's do the aggregate calculation. In order to keep this in a single query, the queries above are used as inline view in the SQL statement 

SELECT SUM(iv.ColValues) AS TotalSum
FROM (SELECT td2.RowId, td2.ColHeader, td2.ColValues
      FROM (SELECT RowId, Col1, Col2 
            FROM TestData) td
      UNPIVOT (ColValues FOR ColHeader IN (Col1, Col2)) AS td2
	  ) iv;  

So the previous example query is just surrounded with an outer SELECT statement and the aggregation is done in the outermost statement. The result from the query above is

TotalSum
--------
48 

The same query for case 2 (columns Col2 and Col3) would be

SELECT SUM(iv.ColValues) AS TotalSum
FROM (SELECT td2.RowId, td2.ColHeader, td2.ColValues
      FROM (SELECT RowId, Col2, Col3 
            FROM TestData) td
      UNPIVOT (ColValues FOR ColHeader IN (Col2, Col3)) AS td2
	  ) iv;

And the result

TotalSum
--------
56 

To dynamically select the columns to include in the calculation

The last step of this tip is to include a variable to define which columns are used for the calculation.  For the first test case the code could look like the following 

DECLARE @columns VARCHAR(1000);
SET @columns = 'Col1, Col2';
 
EXECUTE ('SELECT SUM(iv.ColValues) AS TotalSum
          FROM (SELECT td2.RowId, td2.ColHeader, td2.ColValues
                FROM (SELECT RowId, ' + @columns + ' 
                      FROM TestData) td
                UNPIVOT (ColValues FOR ColHeader IN (' + @columns + ')) AS td2
	       ) iv'); 

So basically the query is just the same. The only difference is that the columns to be unpivoted are included via a variable which of course can be changed at run-time without modifying the skeleton of the statement. The statement itself is executed using EXECUTE statement. So the example above would result to

TotalSum
--------
48 

For the second test case, let's just modify the values for the @columns variable  

...
SET @columns = 'Col2, Col3';
... 

And the result would be

TotalSum
--------
56 

Few things to keep in mind

There are few things that should be observed:

  • Since the SQL statement is embedded inside a string, the syntax isn't verified until the statement is actually executed.
  • The column names are just concatenated to the SQL statement, so if the column names are coming from an outside source (such as parameters to a stored procedure) the input should be verified properly to prevent SQL injections etc. 
  • Even though both examples use two columns, the same code works for unspecified number of columns, including just a single column (have a try only with Col1 Smile | <img src=  )  

The download contains all the statements used in this tip. Have fun. 

History

  • December 28th, 2012: Tip created. 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Mika Wendelius
Architect
Finland Finland
I've been a programmer since mid 80's using languages like assembler, C/C++, PL/I (mainframe environment), pascal, VB (I know, I know, no comments please) and C# and utilizing different techniques and tools.
 
However I'm specialized in databases and database modeling. Mostly I have used products like Oracle (from version 6), SQL Server (from version 4.2), DB2 and Solid Server (nowadays an IBM product).
 
For the past 10+ years my main concerns have been dealing with different business processes and how to create software to implement and improve them. At my spare time (what ever that actually means) I'm also teaching and consulting on different areas of database management, development and database oriented software design.

Comments and Discussions

 
GeneralMy vote of 5 PinmemberHernán Hegykozi27-Dec-12 12:47 
GeneralRe: My vote of 5 PinmvpMika Wendelius27-Dec-12 19:03 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web04 | 2.8.140709.1 | Last Updated 17 Jan 2013
Article Copyright 2012 by Mika Wendelius
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid