Query Hierarchical DataTable
This tip shows one way to query hierarchical data from a DataTable by using an AsTree() method.
Introduction
Every once in a while, you may need to query data from a DataTable
which has a hierarchical structure. In other words, the DataTable
contains a foreign key column which points to itself thus creating a self-referencing structure. A typical example is an organization chart where employees have a reference to the manager.
In order to iterate the hierarchical structure correctly, you need to have some kind of recursive function. For this, I decided to create a new method called AsTree
which would return the selected portion from the tree.
Implementing AsTree()
Before going to the actual implementation, let’s have a look at the return values. The AsTree
method returns a set of TreeItem
objects. A tree item looks as follows:
/// <summary>
/// Class definition for a single tree item
/// </summary>
public class TreeItem {
/// <summary>
/// Zero based index for defining the level of the current item
/// </summary>
public int Level { get; set; }
/// <summary>
/// The data row for the item
/// </summary>
public DataRow Row { get; set; }
/// <summary>
/// The parent data row
/// </summary>
public DataRow ParentRow { get; set; }
}
' Class definition for a single tree item
Public Class TreeItem
' Zero based index for defining the level of the current item
Public Property Level As Integer
' The data row for the item
Public Property Row As DataRow
' The parent data row
Public Property ParentRow As DataRow
End Class
The Row
property contains the actual row from the tree while the ParentRow
contains its parent DataRow
(if any). The Level
property contains the depth of the row in the tree calculated from the starting point.
The AsTree
extension method is very simple:
/// <summary>
/// Returns a HashSet containing all the elements in the path
/// </summary>
/// <param name="source">Source for the data rows</param>
/// <param name="parentColumnName">Column name that acts as a parent key</param>
/// <param name="childColumnName">Column name that acts as a foreign key</param>
/// <param name="startRow">Data row to start building the tree from</param>
/// <param name="direction">Direction of traverse</param>
/// <returns>The hash set containing the tree for the parent row provided</returns>
public static HashSet<TreeItem> AsTree(this EnumerableRowCollection<DataRow> source,
string parentColumnName,
string childColumnName,
DataRow startRow,
TraverseDirection direction = TraverseDirection.FromParentToChild) {
string actualChildColumn;
string actualParentColumn;
HashSet<TreeItem> resultSet = new HashSet<TreeItem>();
// Add the start row to the set
resultSet.Add(new TreeItem() {
Level = 0,
ParentRow = null,
Row = startRow
});
// Decide the order of the comparison based on the traverse direction
if (direction == TraverseDirection.FromParentToChild) {
actualParentColumn = parentColumnName;
actualChildColumn = childColumnName;
} else {
actualParentColumn = childColumnName;
actualChildColumn = parentColumnName;
}
// Recursively fill the tree
return DataTableTree.FillTree(source, resultSet, actualParentColumn, actualChildColumn,
startRow, 1);
}
' <summary>
' Returns a HashSet containing all the elements in the path
' </summary>
' <param name="source">Source for the data rows</param>
' <param name="parentColumnName">Column name that acts as a parent key</param>
' <param name="childColumnName">Column name that acts as a foreign key</param>
' <param name="startRow">Data row to start building the tree from</param>
' <param name="direction">Direction of traverse</param>
' <returns>The hash set containing the tree for the parent row provided</returns>
<Extension()>
Public Function AsTree(source As EnumerableRowCollection(Of DataRow), _
parentColumnName As String, _
childColumnName As String, _
startRow As DataRow, _
Optional direction As DataTableTree.TraverseDirection =
DataTableTree.TraverseDirection.FromParentToChild) As HashSet(Of DataTableTree.TreeItem)
Dim actualChildColumn As String
Dim actualParentColumn As String
Dim resultSet As HashSet(Of DataTableTree.TreeItem) = New HashSet(Of DataTableTree.TreeItem)()
' Add the start row to the set
resultSet.Add(New DataTableTree.TreeItem() With {
.Level = 0,
.ParentRow = Nothing,
.Row = startRow
})
' Decide the order of the comparison based on the traverse direction
If (direction = DataTableTree.TraverseDirection.FromParentToChild) Then
actualParentColumn = parentColumnName
actualChildColumn = childColumnName
Else
actualParentColumn = childColumnName
actualChildColumn = parentColumnName
End If
' Recursively fill the tree
Return DataTableTreeModule.FillTree(source, resultSet, actualParentColumn, actualChildColumn,
startRow, 1)
End Function
The method uses parameters as follows:
parentColumnName
is theDataColumn
name which acts as a parent keychildColumnName
is the foreign key column referencing to the parent columnstartRow
is the initial row from which the tree is investigateddirection
defines the traverse direction, either top-down or bottom-up. The default is top-down.
The AsTree
method simply adds the first element into the HashSet
and then, based on the direction
, decides which column acts as a parent and which one is the child. The method has an assumption that the foreign key is created using a single column.
The actual iteration is implemented in the following FillTree
method:
/// <summary>
/// Adds all immediate children for a single parent into te given hash set
/// </summary>
/// <param name="source">Source for the data rows</param>
/// <param name="resultSet">Result set to fill</param>
/// <param name="actualParentColumn">Column name that acts as a parent key</param>
/// <param name="actualChildColumn">Column name that acts as a foreign key</param>
/// <param name="startRow">Data row to start building the tree from</param>
/// <param name="level">Current level of the items</param>
/// <returns></returns>
private static HashSet<TreeItem> FillTree(EnumerableRowCollection<DataRow> source,
HashSet<TreeItem> resultSet,
string actualParentColumn,
string actualChildColumn,
System.Data.DataRow startRow,
int level) {
// Build a query for immediate children
var subItems = from item in source
where item.Field<object>(actualChildColumn) != null
&& item.Field<object>(actualChildColumn) != System.DBNull.Value
&& item.Field<object>(actualChildColumn).Equals
(startRow.Field<object>(actualParentColumn))
select new TreeItem() {
Level = level,
Row = item,
ParentRow = startRow
};
// Loop through the items found and add to the result set
foreach (TreeItem subItem in subItems) {
resultSet.Add(subItem);
// Recursively fill children of this sub item
DataTableTree.FillTree(source, resultSet, actualParentColumn, actualChildColumn,
subItem.Row, level + 1);
}
return resultSet;
}
' <summary>
' Adds all immediate children for a single parent into te given hash set
' </summary>
' <param name="source">Source for the data rows</param>
' <param name="resultSet">Result set to fill</param>
' <param name="actualParentColumn">Column name that acts as a parent key</param>
' <param name="actualChildColumn">Column name that acts as a foreign key</param>
' <param name="startRow">Data row to start building the tree from</param>
' <param name="level">Current level of the items</param>
' <returns></returns>
Private Function FillTree(source As EnumerableRowCollection(Of DataRow), _
resultSet As HashSet(Of DataTableTree.TreeItem), _
actualParentColumn As String, _
actualChildColumn As String, _
startRow As System.Data.DataRow, _
level As Integer) As HashSet(Of DataTableTree.TreeItem)
' Build a query for immediate children
Dim subItems = From item In source
Where Not (item.Field(Of Object)(actualChildColumn) Is Nothing) _
AndAlso Not (item.Field(Of Object)(actualChildColumn) Is System.DBNull.Value) _
AndAlso item.Field(Of Object)_
(actualChildColumn).Equals(startRow.Field(Of Object)(actualParentColumn))
Select New DataTableTree.TreeItem() With {
.Level = level,
.Row = item,
.ParentRow = startRow
}
' Loop through the items found and add to the result set
For Each subItem As DataTableTree.TreeItem In subItems
resultSet.Add(subItem)
' Recursively fill children of this sub item
DataTableTreeModule.FillTree(source, resultSet, actualParentColumn, actualChildColumn, _
subItem.Row, level + 1)
Next subItem
Return resultSet
End Function
This method searches for immediate children for a given DataRow
and adds them into the HashSet
. For each child found, the FillTree
method is called recursively in order to add the children of the DataRow
at hand. On each call, the level
is incremented.
Usage Example
Now let’s have a few test runs. Imagine the following organization:
In order to define the structure for the organization, the following data table is created:
// Define the table
hierarchicalTable.Columns.Add("Id", typeof(int));
hierarchicalTable.Columns.Add("Name", typeof(string));
hierarchicalTable.Columns.Add("Title", typeof(string));
hierarchicalTable.Columns.Add("ManagerId", typeof(int));
hierarchicalTable.Constraints.Add(new System.Data.UniqueConstraint(
hierarchicalTable.Columns["Id"], true));
hierarchicalTable.Constraints.Add(new System.Data.ForeignKeyConstraint(
hierarchicalTable.Columns["Id"],
hierarchicalTable.Columns["ManagerId"]));
' Define the table
hierarchicalTable.Columns.Add("Id", System.Type.GetType("System.Int32"))
hierarchicalTable.Columns.Add("Name", System.Type.GetType("System.String"))
hierarchicalTable.Columns.Add("Title", System.Type.GetType("System.String"))
hierarchicalTable.Columns.Add("ManagerId", System.Type.GetType("System.Int32"))
hierarchicalTable.Constraints.Add(New UniqueConstraint( _
hierarchicalTable.Columns("Id"), True))
hierarchicalTable.Constraints.Add(New ForeignKeyConstraint( _
hierarchicalTable.Columns("Id"), _
hierarchicalTable.Columns("ManagerId")))
Each row has an Id
field acting as a primary key and a ManagerId
field which defines the manager for the person. The constraints wouldn't be necessary to use but they ensure that the parent key is present and that it is unique.
Our test data is filled like this:
// Add the data
hierarchicalTable.Rows.Add(1, "Edgar Allan Poe", "CEO", null);
hierarchicalTable.Rows.Add(2, "Blondie (Man with No Name)", "VP Sales", 1);
hierarchicalTable.Rows.Add(3, "Mary Poppins", "Sales Director", 2);
hierarchicalTable.Rows.Add(4, "Frodo Baggins", "Regional Sales Manager", 3);
hierarchicalTable.Rows.Add(5, "Bilbo Baggins", "Regional Sales Manager", 3);
hierarchicalTable.Rows.Add(6, "Smeagol", "Collections", 3);
hierarchicalTable.Rows.Add(7, "Euphegenia Doubtfire", "Sales Assistant", 2);
hierarchicalTable.Rows.Add(8, "Hermann Hesse", "VP R&D", 1);
hierarchicalTable.Rows.Add(9, "Tom Sawyer", "Researcher", 8);
hierarchicalTable.Rows.Add(10, "Isaac Asimov", "Futurist", 8);
hierarchicalTable.Rows.Add(11, "Mini-Me", "R&D Assistant", 8);
' Add the data
hierarchicalTable.Rows.Add(1, "Edgar Allan Poe", "CEO", Nothing)
hierarchicalTable.Rows.Add(2, "Blondie (Man with No Name)", "VP Sales", 1)
hierarchicalTable.Rows.Add(3, "Mary Poppins", "Sales Director", 2)
hierarchicalTable.Rows.Add(4, "Frodo Baggins", "Regional Sales Manager", 3)
hierarchicalTable.Rows.Add(5, "Bilbo Baggins", "Regional Sales Manager", 3)
hierarchicalTable.Rows.Add(6, "Smeagol", "Collections", 3)
hierarchicalTable.Rows.Add(7, "Euphegenia Doubtfire", "Sales Assistant", 2)
hierarchicalTable.Rows.Add(8, "Hermann Hesse", "VP R&D", 1)
hierarchicalTable.Rows.Add(9, "Tom Sawyer", "Researcher", 8)
hierarchicalTable.Rows.Add(10, "Isaac Asimov", "Futurist", 8)
hierarchicalTable.Rows.Add(11, "Mini-Me", "R&D Assistant", 8)
Full Tree
Now in order to see the whole tree, let’s first run a query from the top without any restrictions.
// Full tree query
startRow = hierarchicalTable.AsEnumerable().Where
(x => x.Field<int>("Id") == 1).FirstOrDefault();
var query1 = from person in hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
select person;
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("Tree for {0}", startRow["Name"]));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query1) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2}",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"]));
}
' Full tree query
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of Integer)("Id") = 1).FirstOrDefault()
Dim query1 = From person In hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
Select person
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("Tree for {0}", startRow("Name")))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query1
System.Console.WriteLine(String.Format(" Level {0}: {1}{2}",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name")))
Next item
First of all, the starting row is defined. I've selected the top-most row based on the id, 1
. After that, the actual LINQ query is defined. In order to illustrate the results, each row found is printed to the console with small amount of formatting so the result would look like this:
=============================================
Tree for Edgar Allan Poe
=============================================
Level 0: Edgar Allan Poe
Level 1: Blondie (Man with No Name)
Level 2: Mary Poppins
Level 3: Frodo Baggins
Level 3: Bilbo Baggins
Level 3: Smeagol
Level 2: Euphegenia Doubtfire
Level 1: Hermann Hesse
Level 2: Tom Sawyer
Level 2: Isaac Asimov
Level 2: Mini-Me
Partial Tree
The next test is to query the tree partially. If we want to select all rows starting from Blondie, we simply need to define another starting row based on the Id
(2
). So the code would look like:
// Partial tree query
startRow = hierarchicalTable.AsEnumerable().Where(x => x.Field<int>("Id") == 2).FirstOrDefault();
var query2 = from person in hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
select person;
System.Console.WriteLine();
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("Tree for {0}", startRow["Name"]));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query2) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2}",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"]));
}
' Partial tree query
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of Integer)("Id") = 2).FirstOrDefault()
Dim query2 = From person In hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
Select person
System.Console.WriteLine()
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("Tree for {0}", startRow("Name")))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query2
System.Console.WriteLine(String.Format(" Level {0}: {1}{2}",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name")))
Next item
And the result would be:
============================================= Tree for Blondie (Man with No Name) ============================================= Level 0: Blondie (Man with No Name) Level 1: Mary Poppins Level 2: Frodo Baggins Level 2: Bilbo Baggins Level 2: Smeagol Level 1: Euphegenia Doubtfire
Note that the levels are now different since the 0
level is always the row where the query is started from.
Bottom-up Query
So far, we've made only top-down queries. The AsTree
had a parameter which could be used to define the direction
for the traversal. So this time, let's have a look at the relations starting from Smeagol. The code would look like:
// Reverse tree query
startRow = hierarchicalTable.AsEnumerable().Where
(x => x.Field<string>("Name") == "Smeagol").FirstOrDefault();
var query3 = from person in hierarchicalTable.AsEnumerable().AsTree
("Id", "ManagerId", startRow, DataTableTree.TraverseDirection.FromChildToParent)
select person;
System.Console.WriteLine();
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("Reverse tree for {0}", startRow["Name"]));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query3) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2}",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"]));
}
' Reverse tree query
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of String)("Name") = "Smeagol").FirstOrDefault()
Dim query3 = From person In hierarchicalTable.AsEnumerable().AsTree_
("Id", "ManagerId", startRow, DataTableTree.TraverseDirection.FromChildToParent)
Select person
System.Console.WriteLine()
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("Reverse tree for {0}", startRow("Name")))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query3
System.Console.WriteLine(String.Format(" Level {0}: {1}{2}",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name")))
Next item
Now the result is:
=============================================
Reverse tree for Smeagol
=============================================
Level 0: Smeagol
Level 1: Mary Poppins
Level 2: Blondie (Man with No Name)
Level 3: Edgar Allan Poe
Restrict the Results in the Query
The last test is to restrict the results in the LINQ query, just to make sure everything works as expected. For example, if we want to list all the assistants and their managers, the code could look like:
// Restricting results, all assistants
startRow = hierarchicalTable.AsEnumerable().Where
(x => x.Field<int>("Id") == 1).FirstOrDefault();
var query4 = from person in hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
where person.Row.Field<string>("Title").Contains("Assistant")
select person;
System.Console.WriteLine();
System.Console.WriteLine("=============================================");
System.Console.WriteLine(string.Format("All assistants and their managers"));
System.Console.WriteLine("=============================================");
foreach (DataTableTree.TreeItem item in query4) {
System.Console.WriteLine(string.Format(" Level {0}: {1}{2} (manager: {3})",
item.Level,
new string(' ', item.Level * 3),
item.Row["Name"],
item.ParentRow["Name"]));
}
' Restricting results, all assistants
startRow = hierarchicalTable.AsEnumerable().Where(Function(x) _
x.Field(Of Integer)("Id") = 1).FirstOrDefault()
Dim query4 = From person In hierarchicalTable.AsEnumerable().AsTree("Id", "ManagerId", startRow)
Where person.Row.Field(Of String)("Title").Contains("Assistant")
Select person
System.Console.WriteLine()
System.Console.WriteLine("=============================================")
System.Console.WriteLine(String.Format("All assistants and their managers"))
System.Console.WriteLine("=============================================")
For Each item As DataTableTree.TreeItem In query4
System.Console.WriteLine(String.Format(" Level {0}: {1}{2} (manager: {3})",
item.Level,
New String(" ", item.Level * 3),
item.Row("Name"),
item.ParentRow("Name")))
Next item
And now the result would be:
=============================================
All assistants and their managers
=============================================
Level 2: Euphegenia Doubtfire (manager: Blondie (Man with No Name))
Level 2: Mini-Me (manager: Hermann Hesse)
Conclusions
Querying a hierarchical data table is actually very simple. However, the code provided doesn't cover all possible scenarios, so feel free to modify it to better suit your needs.
History
- 5th October, 2015: Created