Introduction
One thing that .NET is lacking is data structures. Besides arrays, collections and dictionaries, not much effort has been put in this area. That's a pity because most often than not, a judicious choice of a data structure can make a drastic difference in both time and space complexity. Tree-based data structures are amongst the most interesting and it's one of them I chose to explore in this article.
The following description is based on this excellent article by R. Reyes.
Tries are a kind of tree where each node holds a common part of one or more keys. As an example, for a trie holding strings, the words "eater" and "eating" would be stored as follows:

One of the many existing variants of the trie is the PATRICIA trie (Practical Algorithm To Retrieve Information Coded In Alphanumeric), which adds path compression by grouping common sequences of nodes together:

This structure provides a very efficient way of storing values while maintaining the lookup time for a key in O(N) in the worst case, where N is the length of the longest key. This structure has it's main use in IP routing software, but can provide an interesting alternative to other structures such as hashtables when memory space is of concern.
Demo: mapping of IPs to ISO3166 country codes
The demo project contains an actual useful use of tries:
First, it will build a trie with the mapping of IP masks to country codes (about 50000 IP masks).
It will then lookup the country associated to 25000 IP and give the time taken to do so, along with the relative speed measured in IP/sec.
If you want, you can easily take the 3 classes that do the actual work (IPCountryLookupTable
, Trie
and TrieNode
) and build a simple application that will retrieve the country code of an IP or a batch of IP.
The implementation
Before designing and coding the classes, I set myself these goals:
- Simplicity of use
- High speed and memory optimization
- Powerful set of features
- Ease of extension and addition of new implementations
This implementation is a numeric PATRICIA trie. That is, a trie that can only hold 32 bit integers (signed or not). However, I designed the code with the goal of making at least two additional implementations:
- Variable length strings
- Big numbers (any base up to 36)
To ensure consistency amongst all types of tries, I made two interfaces, ITrie
and ITrieNode
, defining the basic capacities of a trie and its nodes.
The reason behind this choice as a first implementation is simply because I got myself into writing this article upon looking at the following projects, which both required this kind of trie:
The current implementation supports:
IList
(incl. IEnumerator
)
ICloneable
- Serialization
- Merging one trie into another one
- Creating a new trie by merging two other tries
- Testing for equality (does both tries have the same content?)
- Finding both approximate and exact matches
The classes are highly optimized and memory efficient, even though I think there is always room for improvement in those matters :)
Using the code
The code is well documented, so I will simply provide a quick example showing the capacities of the Trie
class and the simplicity of its use.
Trie trie1 = new Trie();
trie1.Add(123, 32);
trie1.Add(123, 16);
foreach (Int32 key in trie1)
Console.WriteLine(key);
TrieNode node = trie1.Add(1234, 32);
node.Data =
"Data is of type object, so anything can be put there.";
trie1[1234] = "This is streching the meaning of " +
"indexer a bit, but it is quite convenient.";
Int32[] keys;
trie1.GetKeys(out keys);
Int32[] keyLengths;
trie1.GetKeys(out keys, out keyLengths);
trie1.GetKeys(ref keys, 0);
trie1.CopyTo(keys, 0);
trie1.GetKeys(ref keys, ref keyLengths, 0);
ArrayList keyList = new ArrayList();
trie1.GetKeys(keyList, true);
trie1.Remove(123, 16);
Trie trie2 = new Trie();
for (Int32 i = -10; i < 10; i++)
trie2.Add(i, 32);
Trie trie3 = Trie.Merge(trie1, trie2);
trie1.Merge(trie2);
Trie clone = trie1.Clone();
bool result = trie1.Contains(9);
result = trie1.Contains(9, 32);
node = trie1.FindBestMatch(1239, 32);
node = trie1.FindExactMatch(1239, 32);
trie1.Clear();
Related articles