11,416,572 members (65,469 online)

# Inside Recursive CTEs

, 25 May 2010 CPOL
 Rate this:
A step by step explanation of how a recursive CTE actually works.

## Introduction

The Common Table Expression that was introduced in SQL Server 2005 has long been something of an issue for me. I could reproduce the syntax to get what I wanted, but I couldn't honestly say I understood what was going on there. I always need a mental picture to understand something, and that was just missing.

Recently, I think I figured out a model that helps me work with them and explain how, although different, they're actually analogous to recursion in normal programming.

## Background

### Recursion

Let's start with recursion as a general programming concept. Consider this piece of code:

```public string Rec(int Level) {
if (Level == 0)
return "0";
else
return Level.ToString() + "_" + Rec(Level - 1);
}```

This code repeatedly calls itself until `Level` reaches 0. So this call:

`Rec(3);`

gives the following result:

`3_2_1_0`

And this is roughly what happens during execution:

```RecursionExample(3) =
"3_" + RecursionExample(2) =
"3_2_" + RecursionExample(1) =
"3_2_1_" + RecursionExample(0) =
"3_2_1_0"```

Each time the function gets called, it receives the `Level` from the previous call, turns it into a string that is returned, and passes the `Level` to the next call after subtracting 1.

### Recursion in SQL

Now to SQL. Here's an example of a recursive CTE:

```with cte as (
select    1 id, null ParentId
union all
select    C.Id, C.ParentId
from    SomeTable as C
inner join cte as P -- here it joins against itself
on C.ParentId = P.Id
)```

In its definition, there is a join to itself, making it a recursive CTE. This is the same principle as in the C# example where the function called itself, but with a few twists.

First, it doesn't really join against itself, it joins against the previous pass of itself. Just like the C# example called itself a couple of times and each time getting a different result ("3_", "3_2_" etc.), there are stages in the execution of the CTE. Each time it calls itself, a new stage is created.

Second, SQL is a set-based language. So instead of passing an int to the next call and receiving a string, each pass in a CTE takes a set, and returns a set (of the same columns). Here we're joining to the previous pass, and we're union-ing the result of the current pass to the result of the previous pass. Each pass in the CTE results in a set of rows, and the next call to the CTE takes that set as its input.

But only the rows in the current pass are used in the next pass, like the int that is decremented in the C# code and passed to the next call. The union with all previous passes is only after that set is returned, like the string that was added to in C#. It's important to understand the separation between those two.

Last, it kind of works the other way around from C#. In the C# example, each time the function called itself, it passed what it needed to work, Level - 1, to it. In SQL, we do not actually pass in a variable, but we work against the previous pass of the CTE. It kind of looks back.

Let's clarify that with a simple example like the one we did in C#. Take this table that represents a small parent child tree:

ID Parent ID
1 null
2 1
3 1
4 2

When we would apply the above CTE to it, these would be the steps in the execution, analogous to what we did with C#:

```1, null
2, 1
3, 1
4, 2```

The first pass is where we selected (1, null) the first line in the CTE.

Then comes the union that makes a new call to the CTE. In this first pass, we only had one row, so that's what the second pass 'sees' to join against. This is the key to understanding recursive CTEs: each pass in the CTE only 'sees' the results of the previous pass.

So the second pass joins against that single row and returns (2, 1) and (3, 1) because they have Parent ID = 1.

Now another CTE calls itself again to join rows from SomeTable against these two rows, and out comes (4, 2).

One more join is made, but no more rows are returned, so the recursion stops and the CTE returns all five records.

### Simulating a CTE with old-school SQL

This behaviour can be simulated without using a CTE, that would look like this:

```-- create table to hold results
declare @cte table (
id int,
ParentId int,
Pass int
)
-- declare variable to hold pass
declare @Pass int
set    @Pass = 0

-- insert initial set
insert    @cte(id, ParentId, Pass)
values    (1, 0, @Pass)

while     @@ROWCOUNT > 0 -- keep going until there are no more rows added
begin
-- increment pass
set @Pass = @Pass + 1

-- insert next pass
insert    @cte (id, ParentId, Pass)
select    C.Id, C.ParentId, @Pass
from    @t C
inner join @cte P
on C.ParentId = P.id
and P.Pass = @Pass - 1 -- only join against rows from
--the previous pass, just like the CTE
end
select * from @cte -- including the Pass column here to clear things up.
-- The 'real' CTE does not include it (probably becuase it doesn't have
-- to use it under the covers)```

`WHILE @@ROWCOUNT < 0` keeps repeating that section until no more rows are affected. Each time a new set is added to the resultset, they are kept separate by adding the pass depth to them. That pass depth can be used in the join to identify only the rows that where added in the previous pass.

### Simulating old-school with a CTE

The `Pass` column from the previous example can be easily added to the CTE to help a user trace the separate steps in the creation of the resultset. The following listing does just that.

```with cte as (
select    1 id, 0 ParentId, 0 Pass -- start with Pass = 0
union all
select    C.Id,
C.ParentId,
P.Pass + 1    -- Increment Pass column with each pass by adding 1
-- to the previous value (note that *P*.Pass was used to add to)
from    @t as C
inner join cte P on C.ParentId = P.Id
)
select * from cte```

We start with a single start row, and give it `Pass = 0`. Then we join against it to find the children and add them with `Pass = 1`, and so on until no more rows are added.

## Using the code

A sample code is included for the user to see it for himself. Just open it up and execute it.

## Points of interest

When I understood that each pass only looks back at the previous pass, the CTE was demystified for me and I was a lot more comfortable using them in production. Before that, I was just repeating code that I did not fully grasp. The passes provide me with a mental picture that helps me work with CTEs. I hope it will help you to do the same. CTEs lead to much clearer code, are easier to get right, and are much easier to understand for a future developer on your code (once he has read this article, of course

I believe they are a very welcome addition to SQL because they let you specify what you want, instead of having to tell SQL how it should go about getting that result. SQL has been one of the most successful declarative languages for a long time, and CTEs let you specify your intent in a much clearer way than was previously possible. Comparing the two samples should prove that point.

## History

• May 25, 2010: Version 1.0.

## Share

Leaseplan Corporation
Netherlands
Gert-Jan is a Senior Quantitative Risk Manager at Leaseplan Corporation. In that job he doesn't get to code much he does these little projects to keep his skills up and feed the inner geek.

 First Prev Next
 My vote of 5 sarathkumarnallathambi7-Dec-12 20:49 sarathkumarnallathambi 7-Dec-12 20:49
 While Loop Recursion sarathkumarnallathambi7-Dec-12 3:06 sarathkumarnallathambi 7-Dec-12 3:06
 My vote of 5 Michael Sandler3-Sep-12 23:57 Michael Sandler 3-Sep-12 23:57
 Nice explanation Perry226-Mar-12 16:51 Perry2 26-Mar-12 16:51
 Question pagolu.satheeshp26-Oct-10 5:34 pagolu.satheeshp 26-Oct-10 5:34
 Re: Question gjvdkamp26-Oct-10 7:22 gjvdkamp 26-Oct-10 7:22
 Hi Satheesh, The columns are id, parent. The 1 for both rows is the parent of those rows, both have node 1 as parent. That row was returned in the anchor of the cte. The next pass of the cte will look for rows the have either 2 of 3 as parent, that will add row 4 because it has 2 as parent. So there will not be duplicate rows in this example. Did that explain it? Regards GJ
 Last Visit: 31-Dec-99 19:00     Last Update: 27-Apr-15 21:10 Refresh 1