65.9K
CodeProject is changing. Read more.
Home

String Tree Dictionary

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.11/5 (5 votes)

Sep 23, 2008

CPOL

2 min read

viewsIcon

58510

downloadIcon

875

Good for decoding a stream of characters with a finite number of predefined tokens

Introduction

A StringTreeDictionary<T> class implements a IDictionary<string, T> interface and is 57% faster than a Dictionary<string, T> at a task described below, 50% slower at adding items and much slower at accessing them. Even so it can be useful if you have to translate a sequence of characters.

Using the Code

A method, MatchKey, provided in a class tries to find a key in a dictionary which matches the beginning of a given string. If the key was found, then it returns a KeyValuePair<string, TKey> object. Otherwise a KeyNotFoundException is thrown. For example, if a dictionary would contain keys aaba, ac, ccbc and abcd, then the MatchKey("accdba") would retaurn a pair {"ac", dict["ac"]}. Note that we do not know a length of a key we are searching for. The method can be used to extract a sequence of meaningful strings (keys) from a continuous stream of characters.

How It Works

Internally, the StringTreeDictionary object represents a node of a tree structure. There are the following fields inside:

public class StringTreeDictionary<T> : IDictionary<string, T>
{
	private T value;
	private bool hasValue = false;
	private Dictionary<char, StringTreeDictionary<T>> childs = null;

A single StringTreeDictionary object contains just one value and a dictionary with child nodes as values and chars as keys. Let's see how the items are stored.

StringTreeDictionary_diagram1.jpg

Such a dictionary could be instantinated this way:

StringTreeDictionary<string> std = new StringTreeDictionary<string> {
	{"bear", "sleuth"},
	{"bee", "grist"},
	{"bird", "flock"},
	{"cat", "clowder"}
};

The diagram clearly shows that the dictionary does not like long keys.

Remarks

  • A StringTreeDictionary<T> is good only for very specific cases; do not use it instead of a standard .NET dictionary.
  • Accessing items with keys longer than 3 characters is terribly slow and is not recommended.
  • The dictionary does not support a true item removal. A faked Remove method only disallows accessing deleted items and sets value to default(TValue), but does not detach unused nodes.
  • The MatchKey method searches for the shortest key matching a string's beginning. If you would add keys acx and ac, then the acx would never be found.

History

  • 2008-09-20 - First version posted
  • 2008-09-20 - Download files updated