12,395,645 members (58,689 online)
alternative version

235.9K views
161 bookmarked
Posted

# Trees in SQL databases

, 22 Sep 2004
 Rate this:
How to get all power of trees in SQL

## Introduction

Sooner or later in your database project you will need to store hierarchical information. For instance enterprise structure, categories of merchandise, product catalogs, or folders with documents all are good nominees for such structures. It is easy of course to implement that kind of store with self-related table. Problems arrive once you need to answer hierarchical questions. For example what if we have an enterprise structure and need to know how many employees reporting to manager «X»? There have been already some solutions of the problem.

See:

Joe Celko’s book and articles also covered it.

The above techniques are good when you only reading the data, but when you need to modify the tree you will have considerable performance deprivation.

I am going to introduce method that has similar performance for reading but better performance for modification of the tree.

## The Idea

Suppose we have the following table:

Where `NodeId `is the primary key, `ParentId `is the foreign key, and `NodeName `is some extra field. In order to provide simple examples let’s assume that `NodeName `has unique constraint on it.

For all examples below I will use the following tree:

Let’s create another table having two relationships with the Node table:

Where both `NodeId `and `ParentId `are foreign keys referencing Node table.

Each record in the Tree table means that the Node pointed by `Tree.NodeId `has the ancestor pointed by `Tree.ParentId`. By ancestor I understand any node lying on the path from the node to the root of the tree, i.e. direct parent or grand parent or grand-grand parent and so on.

Now let’s ensure one simple invariant: all ancestors of each node from Node table are enumerated in Tree table.

For example for node 4 they are going to be 2 and 1.

The immediate result of this invariant is: this is very easy to get full path from the node to the root of the tree - just select all records from the Tree table related to the seeking node. Let’s write down this select. Suppose we need to find path from node 7 to the root.

### Query 1:

```select p.*
from Node n, Tree t, Node p
where n.NodeName=7
and n.NodeId = t.NodeId
and t.ParentId = p.NodeId```

It is a pretty strait forward select, isn’t it?

The second result of the above invariant is: we can select all descendants of the node. This is because we are enlisting all ancestors for every node from Node table in the Tree table including of course all ancestors of the seeking node. In order to select entire sub tree of the node we just query for all nodes that have the seeking node as their ancestor.

### Query 2:

```select c.*
from Node n, Tree t, Node c
where n.NodeName=2
and n.NodeId = t.ParentId
and t.NodeId = c.NodeId```

This is as simple as that. Note that the two queries are opposite in the direction passing through Tree table.

## Implementation

Before going further let me make two notes.

# 1: from my experience it is very convenient to have one more field in the Tree table representing the distance on the tree between nodes pointed by NodeId and ParentId, in another words - level of ancestor. So the actual structure includes Level field:

# 2: in most probable cases we need sub tree of the node or path from the node to the root including the node itself. So it would be useful having in the Tree table the node itself as its own ancestor of the level 0.

It is possible to implement the above invariant either as stored procedures or as triggers. I personally prefer trigger one.

Let’s start with insert trigger for Node table. I will be using MS SQL server.

First of all the trigger must create a record to satisfy # 2. This can be done by the following statement:

```insert into Tree(NodeId, ParentId, Level)
select NodeId, NodeId, 0
from inserted
```

Now we are ready to add the list of all ancestors of the inserted nodes. Take in consideration that the parent of each inserting node already has enumerated all his ancestors, so we can use them as a foundation. But we need to replace parent’s NodeId with the id of the new node and increment the level.

```insert into Tree(NodeId, ParentId, Level)
select n.NodeId, t.ParentId, t.Level + 1
from inserted n, Tree t
where n.ParentId = t.NodeId
```

So the overall trigger will be:

```create trigger NodeInsert on Node for insert as
begin
set nocount on

insert into Tree(NodeId, ParentId, Level)
select NodeId, NodeId, 0
from inserted

insert into Tree(NodeId, ParentId, Level)
select n.NodeId, t.ParentId, t.Level + 1
from inserted n, Tree t
where n.ParentId = t.NodeId
end
go```

Now it is time to define update trigger which takes care of changing node’s parent. The goal of update trigger is to delete all obsolete records from Tree table and create new ones. In order to minimize changes been made by this trigger let’s reuse the information already stored in the Tree table in a manner we did in insert trigger. What is not going to be affected by changing node’s parent?

Apparently for updated node itself it is persistent only one record that satisfies #2. And for descendants they are going to be permanent all records about ancestors below updated node including this node. While every ancestor above the updated node will obsolete.

For example if we are changing parent for node 4 then for nodes 5, 6, and 7 will be changed only ancestors above node 4 i.e. 1 and 2.

On the first step we back up all descendants of updated nodes in the following table:

```declare @child table(NodeId int, Level int)
```

We insert all descendants’ primary keys and the level of each of them relatively to updated nodes. Condition t.Level > 0 allows to exclude updated node from been inserted to the @child table.

```insert into @child(NodeId, Level)
select t.NodeId, t.Level
from inserted n, Tree t
where n.NodeId = t.ParentId and t.Level > 0
```

The second step deletes all obsolete rows for all descendants:

```delete Tree
where
Tree.NodeId in(select NodeId from @child)
and Tree.ParentId in(
select t.ParentId
from inserted n, Tree t
where n.NodeId = t.NodeId and t.Level > 0
)
```

Condition t.Level > 0 excludes updated nodes themselves from the deletion.

The third step removes obsolete records for updated nodes:

```delete Tree
where Tree.NodeId in(select NodeId from inserted)
and Tree.Level > 0
```

On the next two steps we will populate new information in the Tree table.

So the forth step is going to be:

```insert into Tree(NodeId, ParentId, Level)
select n.NodeId, t.ParentId, t.Level + 1
from inserted n, Tree t
where n.ParentId = t.NodeId
```

This is exactly how we did it in the insert trigger.

And finally the fifth step:

```insert into Tree(NodeId, ParentId, Level)
select c.NodeId, t.ParentId, t.Level + c.Level
from inserted n, Tree t, @child c
where n.NodeId = t.NodeId and t.Level > 0
```

It might look strange that there is no any joins with @child table in this statement. But this is what we really need here because we are going to repeat ancestor’s information for each child. Also note that we are adding child’s level stored in the @child table rather that just 1.

The entire trigger looks like:

```create trigger NodeUpdate on Node for update as
if update(ParentId)
begin
set nocount on

declare @child table(NodeId int, Level int)

insert into @child(NodeId, Level)
select t.NodeId, t.Level
from inserted n, Tree t
where n.NodeId = t.ParentId and t.Level > 0

delete Tree
where
Tree.NodeId in(select NodeId from @child)
and Tree.ParentId in(
select t.ParentId
from inserted n, Tree t
where n.NodeId = t.NodeId and t.Level > 0
)

delete Tree
where Tree.NodeId in(select NodeId from inserted) and Tree.Level > 0

insert into Tree(NodeId, ParentId, Level)
select n.NodeId, t.ParentId, t.Level + 1
from inserted n, Tree t
where n.ParentId = t.NodeId

insert into Tree(NodeId, ParentId, Level)
select c.NodeId, t.ParentId, t.Level + c.Level
from inserted n, Tree t, @child c
where n.NodeId = t.NodeId and t.Level > 0
end
go```

As you can see it will be executed only if `ParentId `was updated and all other updates of `Node `table will be filtered out by the very first if statement.

## Restrictions of the implementation

If you are going to insert or update more than one Node at a time please make sure you are not doing this for more then one level of the tree. From the practical point of view this is not a strong restriction at all.

## Remaining steps

In order to make the functionality completed we need the ability to clean the entire Tree table and populate it from the scratch. It is probably a good idea to create a stored procedure for this purpose, and I am not going to take away from you the pleasure to write down such procedure in you own.

## Examples

You can find SQL statements for creating both tables, both triggers, and statements that populating Node table in the attached file. It also contains Query 1 and 2 and statements to change node’s parent.

A list of licenses authors might use can be found here

## Share

 Software Developer United States
No Biography provided

## You may also be interested in...

 First PrevNext
 My vote of 5 Ben Kotvis29-May-12 3:50 Ben Kotvis 29-May-12 3:50
 order by [modified] virtual.aussie10-Dec-10 10:05 virtual.aussie 10-Dec-10 10:05
 chain business app. rule ristriction to add less than 4 child for one parent node Ankush Komarwar26-Feb-07 5:44 Ankush Komarwar 26-Feb-07 5:44
 Re: chain business app. rule ristriction to add less than 4 child for one parent node Eugene Lepekhin26-Feb-07 6:44 Eugene Lepekhin 26-Feb-07 6:44
 Moving a node below its child fredjf28-Nov-06 19:34 fredjf 28-Nov-06 19:34
 Re: Moving a node below its child Eugene Lepekhin29-Nov-06 9:04 Eugene Lepekhin 29-Nov-06 9:04
 Read SQL for Smarties Instead!!!! Dale Thompson17-Jul-06 10:14 Dale Thompson 17-Jul-06 10:14
 Remaining Steps - Solution to Rebuild Tree CmiCode9-Mar-06 6:17 CmiCode 9-Mar-06 6:17
 Effective way to expand data tree using SQL Server ! dgoyani20-Oct-05 19:48 dgoyani 20-Oct-05 19:48
 Re: Effective way to expand data tree using SQL Server ! Eugene Lepekhin25-Oct-05 14:50 Eugene Lepekhin 25-Oct-05 14:50
 Updates isaunder126-Aug-05 3:39 isaunder1 26-Aug-05 3:39
 Re: Updates Eugene Lepekhin26-Aug-05 10:56 Eugene Lepekhin 26-Aug-05 10:56
 Re: Updates isaunder129-Aug-05 10:30 isaunder1 29-Aug-05 10:30
 Great ... But Abukhait3-Jul-05 10:15 Abukhait 3-Jul-05 10:15
 Re: Great ... But Tzoockee29-Aug-05 2:11 Tzoockee 29-Aug-05 2:11
 Re: Great ... But Shakalama29-Aug-08 7:54 Shakalama 29-Aug-08 7:54
 Subtree matching? Ryan L. Schneider20-Jun-05 6:31 Ryan L. Schneider 20-Jun-05 6:31
 What about deleting node from tree? JoeDeVise6-Jun-05 2:23 JoeDeVise 6-Jun-05 2:23
 Re: What about deleting node from tree? Eugene Lepekhin6-Jun-05 9:30 Eugene Lepekhin 6-Jun-05 9:30
 Re: What about deleting node from tree? lgoss8-Dec-05 5:25 lgoss 8-Dec-05 5:25
 Re: What about deleting node from tree? fredjf28-Nov-06 19:28 fredjf 28-Nov-06 19:28
 Little bug in attached SQL-File Joerg Szepan14-Nov-04 22:25 Joerg Szepan 14-Nov-04 22:25
 Re: Little bug in attached SQL-File Eugene Lepekhin16-Nov-04 12:47 Eugene Lepekhin 16-Nov-04 12:47
 Re: Little bug in attached SQL-File Joerg Szepan16-Nov-04 19:47 Joerg Szepan 16-Nov-04 19:47
 Sql DB and Tree path elinfo2-Oct-04 5:16 elinfo 2-Oct-04 5:16
 Last Visit: 31-Dec-99 18:00     Last Update: 23-Jul-16 23:27 Refresh 12 Next »