Pivot Large Table With Dynamic Columns

In general, it is recommended to use SUM(CASE) GROUP BY method to pivot rows into columns. For details why recommended such, please refer to http://koncordpartners.blogspot.com/2010/03/pivot-turning-rows-into-columns.html. However, in two circumstances SUM(CASE) GROUP BY method does not work:

1. The column numbers are too many. By using SUM(CASE) GROUP BY method, one must know every column’s name.

2. The column names are unknown when generate queries. Usually this could be happened with column names are result from another SELECT statement.

In fact, these two circumstances do always emerge together. In this situation, built-in PIVOT function in Oracle and SQL Server is the only option. In Ms Access, it is TRANSFORM, which is even better in comparison to Oracle and SQL Server, because it accept the list of column names directly from the SELECT statement. Following example is using T-SQL, and the scenario is to draw a matrix table summarizing the total air ambulance flights between airports within a state for past ten year. The involved records are around 500,000.

The PIVOT function is easy:

SELECT *
FROM
(
SELECT sendport AS Sending_Airport
, receiveport AS Receiving_Airport
, 1 AS InterimCounter
FROM dw_Legs
) AS Inner_Q
PIVOT
(
SUM(InterimCounter)
FOR Receiving_Airport IN ([The list of receiving airport])
) AS Pivot_Q

The problem is “The list of receiving airport” must be selected from airport table, and very unfortunately, T-SQL does not accept it as an outcome of SELECT statement. So, dynamic query is the only option left. We human just make our life so miserable by disallow this and disallow that.

Another issue is not all airports receive planes from every rest airports, which means “The list of receiving airport” should not be directly from airport table, rather is should be from the intermediate queries. So, for presentation purpose, a temporary table is the best solution, which is also in accordance with the design of SQL Server.

Step 1

CREATE VIEW V_Airport_Matrix
AS
SELECT sendport AS Sending_Airport
, receiveport AS Receiving_Airport
, 1 AS InterimCounter
FROM dw_Legs
GO

Step 2

Prepare dynamic queries:

DECLARE @columns VARCHAR(MAX)
SELECT @columns = COALESCE(@columns + ',[' + CAST(LTRIM(RTRIM(Receiving_Airport)) AS VARCHAR) + ']', '[' + CAST(LTRIM(RTRIM(Receiving_Airport)) AS VARCHAR)+ ']')
FROM
(
SELECT DISTINCT Receiving_Airport
FROM V_Airport_Matrix
) AS base_Q

Why LTRIM(RTRIM()) is needed? Because the length of the VARCHAR string becomes critical since VARCHAR(MAX) does have limit. And very often, VARCHAR(MAX) does not meet our need here. And, the text, ntext, and image data types are invalid for local variables. Following is the send part of Step 2:

DECLARE @query VARCHAR(MAX)
SET @query =
'
SELECT *
FROM
(
SELECT *
FROM V_Airport_Matrix
) AS Inner_Q
PIVOT
(
SUM(InterimCounter)
FOR Receiving_Airport IN (' + @columns + ')
) AS Pivot_Q
'

What about your query does exceed the VARCHAR(MAX) limit? You will need to shorten the query by cutting them into pieces:

DECLARE @columns VARCHAR(MAX)
SELECT @columns = COALESCE(@columns + ',[' + CAST(LTRIM(RTRIM(Receiving_Airport)) AS VARCHAR) + ']', '[' + CAST(LTRIM(RTRIM(Receiving_Airport)) AS VARCHAR)+ ']')
FROM
(
SELECT DISTINCT Receiving_Airport
FROM V_Airport_Matrix
WHERE Receiving_Airport LIKE 'A%'
OR Receiving_Airport LIKE 'B%'
OR Receiving_Airport LIKE 'C%'
OR Receiving_Airport LIKE 'D%'
OR Receiving_Airport LIKE 'E%'
OR Receiving_Airport LIKE 'F%'
OR Receiving_Airport LIKE 'G%'
OR Receiving_Airport LIKE 'H%'
OR Receiving_Airport LIKE 'I%'
OR Receiving_Airport LIKE 'J%'
OR Receiving_Airport LIKE 'K%'
OR Receiving_Airport 'L%'
) AS base_Q

Then, execute one by one before union together.

Step 3

To execute the queries:

EXECUTE(@query)


http://blog-mstechnology.blogspot.com/2010/06/t-sql-pivot-operator-with-dynamic.html

No comments:

Post a Comment

Labels