12,815,986 members (32,911 online)
Add your own
alternative version

#### Stats

30.8K views
393 downloads
8 bookmarked
Posted 25 Apr 2009

# A Data Structure to Store Dictionaries

, 27 Apr 2009 CPOL
 Rate this:
Please Sign up or sign in to vote.
A data structure to store dictionaries in memory space-efficiently

## Introduction

Excerpt from The Algorithm Design Manual by Steven S. Skiena: The abstract data type "dictionary" is one of the most important structures in Computer Science. Dozens of different data structures have been proposed for implementing dictionaries including hash tables, skip lists, and balanced/unbalanced binary search trees -- so choosing the right one can be tricky. Depending on the application, it is also a decision that can significantly impact performance. In practice, it is more important to avoid using a bad data structure than to identify the single best option available.

This article explains a data-structure that can be used to store a dictionary (words and their definitions) in memory.

## Background

A "Tree", which is a very elementary data structure, has been brought into application here.

## Using the Code

The code consists of the CharacterNode class that stores a char value along with references to 26 other CharacterNodes. The notion behind choosing 26 as the number of references is that this code presently is restricted to only storing the 26 English alphabets.

As can be seen in the diagram, the root object of CharacterNode points towards 'a' and 'f'. The nodes corresponding to the characters 'a' and 'f' further point towards other letters, which further follow the same. The code snippet that describes the CharacterNode class is given:

class CharacterNode
{
char c;
CharacterNode next[]=new CharacterNode[26];
CharacterNode()
{
c='a';
for(int i=0;i<26;i++)
{
next[i]=null;
}
}
//..rest of the class definition
}

The root (see the diagram) always holds the character 'a' (not shown in the diagram). This is because it is the first object formed (as shown below), and any object will always hold 'a' as the initial character (see the constructor above).

public class CharacterNodeTree
{
public static void main(String args[]) throws IOException
{
CharacterNode root=new CharacterNode();
//..rest of the code

The insert() routine describes how a character will be stored in the dictionary.

void insert(char ch)
{
this.c=ch;
}

The insert() routine from main() is called in the following manner:

CharacterNode temp=null;
String str=null;

System.out.print("Enter the string to insert:");
BufferedReader br1=new BufferedReader(new InputStreamReader(System.in));
str=br1.readLine();
root.next[str.charAt(0)-97]=new CharacterNode();
temp=root.next[str.charAt(0)-97];
for(int i=0;i<str.length();i++)
{
temp.insert(str.charAt(i),i);
try
{
temp.next[str.charAt(i+1)-97]=new CharacterNode();
temp=temp.next[str.charAt(i+1)-97];
}
catch(StringIndexOutOfBoundsException e) {}
}

Hence, a simple calculation allows the insertion of an alphabet from the string. This calculation is nothing but the position of the alphabet in the alphabetical sequence (like 0 for 'a', 1 for 'b', 25 for 'z', etc.). So forth, the other alphabets from the string can be inserted, and therefore the whole of the string can be inserted.

The find() routine describes how a complete string will be retrieved from the dictionary.

boolean find(String str)
{
CharacterNode temp=this;
for(int i=0;i<str.length();i++)
{
if(temp.next[str.charAt(i)-97]==null)
{
return false;
}
else
{
temp=temp.next[str.charAt(i)-97];
continue;
}
}
return true;
}

The find() routine from main() is called in the following manner:

System.out.print("Enter the string to find:");
BufferedReader br2=new BufferedReader(new InputStreamReader(System.in));
str=br2.readLine();
if(root.find(str)==true)
{
System.out.println("FOUND");
}

Again, a simple calculation allows the retrieval of a string (or even a part of it).

## Points of Interest

Looking at the diagram, the code can be further modified:

1. A boolean value can be added to the CharacterNode class definition to indicate whether a particular character ends with a meaningful word or not. For example, in the diagram, all the leaves end (and must end due to the structure of the code) with a meaningful word ('aleak', 'alert', 'all', 'food', 'foot' are the words that have been shown added in the diagram). However, 'ale' is a word that does not end with a leaf but is still a meaningful word and is shown added. Therefore, the leaves ('k', 't', 'l', 'd', 't') and the non-leaf node ('e') must have boolean values as true.
2. It is easier to find a word with special properties, like starting with a particular character(s) or ending with a particular character(s) (think about links that directly refer leaves!), or having a fixed length (since all the words with the same length occur at the same level, and it should not be difficult to code a link between all these words!).
3. A recursive routine (using a stack preferably) may provide a way to retrieve all the words lexicographically.

## History

• 25th April, 2009: Initial post
• 26th April, 2009: Uploaded source code

## License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

## About the Author

 Other India
No Biography provided

## You may also be interested in...

 Pro Pro

## Comments and Discussions

 First Prev Next
 Information Member 807643411-Jul-11 19:37 Member 8076434 11-Jul-11 19:37
 My vote of 1 Matt Weber27-Apr-09 6:06 Matt Weber 27-Apr-09 6:06
 My vote of 1 Jean-Paul Mikkers26-Apr-09 13:09 Jean-Paul Mikkers 26-Apr-09 13:09
 Re: My vote of 1 neeraj39927-Apr-09 4:20 neeraj399 27-Apr-09 4:20
 Why reinvent the wheel? zlezj26-Apr-09 1:54 zlezj 26-Apr-09 1:54
 Re: Why reinvent the wheel? neeraj39926-Apr-09 2:53 neeraj399 26-Apr-09 2:53
 Last Visit: 31-Dec-99 19:00     Last Update: 23-Mar-17 10:22 Refresh 1

General    News    Suggestion    Question    Bug    Answer    Joke    Praise    Rant    Admin

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

Permalink | Advertise | Privacy | Terms of Use | Mobile
Web02 | 2.8.170308.1 | Last Updated 27 Apr 2009
Article Copyright 2009 by neeraj399
Everything else Copyright © CodeProject, 1999-2017
Layout: fixed | fluid