Design a Dictionary with Spellchecker (En-Fa)(De-En)

By Hamid Attari, 21 Oct 2007

Background

About a month ago, I was looking for a Windows dictionary application written in C#. I couldn't find one, so I decided to write it myself.

Introduction

This dictionary offers auto-complete and a small spell checker. It currently works with two text databases: "English to Persian (Farsi)" and "German to English".

Screenshot - MainForm.gif

Application Design

When I began writing this dictionary, I chose a three-tier application design as the main structure. Why three tiers? Because we have a GUI, a text file that serves as the vocabulary database, and a layer that sits between the GUI and the database. In my opinion, this design is good practice whenever one or more user interfaces work with a database and we want to encapsulate database access, so that the rest of the code does not need to know what kind of database it is (in this case, a text file).

Three-Tier Application Design

The three tiers mentioned above are defined as follows [1]:

  1. Data Tier: Access to the database is defined in this tier. It consists of static classes and acts as the interface between the text database and the business tier. This tier does not store any data itself; it is the tool that is allowed to load and modify the database. The code for this part can be found in the Data folder in the project directory.
  2. Business Tier: Contains the classes that define the structure for storing data, and acts as the interface between the data tier and the presentation tier. This tier is the core of the application; its code is in the Core folder in the project directory.
  3. Presentation Tier: Consists of the GUIs, which have direct access to the business tier and have nothing to do with the data tier. The code is placed in the Form folder.

The relationship between the three tiers is shown in the diagram below:

Screenshot - n-Tier-Design.gif

Data Tier and Text Database Design

The text database is a plain text file that serves as the vocabulary repository. Each line holds an entry and its definition in the following form:

Entry :: Definition

If an entry has multiple definitions, they are separated by a # character: a # is added after the previous definition before the next one is written.

Screenshot - Sample-TextDataBase.gif
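To make the line format concrete, here is a minimal sketch of how one line could be split into its entry and definitions. EntryLineParser and TryParse are hypothetical names used only for illustration; they are not the classes from the project, and the real parsing code may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original project.
public static class EntryLineParser
{
    // Splits a line of the form "Entry :: Definition1 # Definition2"
    // into the entry and its list of definitions.
    public static bool TryParse(string line, out string entry, out string[] definitions)
    {
        entry = string.Empty;
        definitions = new string[0];
        if (string.IsNullOrEmpty(line)) return false;

        int separator = line.IndexOf("::", StringComparison.Ordinal);
        if (separator < 0) return false; // not an "Entry :: Definition" line

        entry = line.Substring(0, separator).Trim();
        if (entry.Length == 0) return false;

        // '#' separates multiple definitions of the same entry.
        var parts = new List<string>();
        foreach (string part in line.Substring(separator + 2).Split('#'))
        {
            string trimmed = part.Trim();
            if (trimmed.Length > 0) parts.Add(trimmed);
        }
        definitions = parts.ToArray();
        return true;
    }
}
```

For a line such as "Haus :: house # home", this sketch yields the entry "Haus" and the two definitions "house" and "home".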

You can see "English to Persian" and "German to English" samples below:

Screenshot - De-En-TextDatabasetxt.gif

Screenshot - En-Fa-TextDatabasetxt.gif

Now we must define classes to interact with the text database.

There are two static classes:

  • TextDatabase reads the database and loads it into a particular structure.
  • TextDatabaseModifier adds an entry and its definition to the text database.

Screenshot - TextDatabase_-_Digram.gif

Business Tier Design

First step: We define a structure for storing each entry (Key) and its definition (Value) and its index (Index) in the dictionary, and we name it DictionaryKeyIndexValue.

Second step: We define a class and name it DictionaryPack, the heart of our program. This class is made up of two fields:

System.Collections.Generic.Dictionary<string, Core.DictionaryKeyIndexValue> _dictionary

and

System.Collections.Generic.List<Core.DictionaryKeyIndexValue> _list

This class interacts with the data tier. As each line of the text database is read in the data tier, a DictionaryKeyIndexValue object is added to both _dictionary and _list of a DictionaryPack instance.
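The following is a minimal sketch of that idea, not the project's actual code. KeyIndexValueSketch and DictionaryPackSketch are hypothetical stand-ins for DictionaryKeyIndexValue and DictionaryPack, and their members are assumptions. In this sketch the entry type is a class, so both collections hold a reference to the same object; whether the original type is a class or a struct is not stated here.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for DictionaryKeyIndexValue.
public sealed class KeyIndexValueSketch
{
    public string Key;
    public int Index;
    public string Value;
}

// Hypothetical stand-in for DictionaryPack.
public sealed class DictionaryPackSketch
{
    private readonly Dictionary<string, KeyIndexValueSketch> _dictionary =
        new Dictionary<string, KeyIndexValueSketch>();
    private readonly List<KeyIndexValueSketch> _list =
        new List<KeyIndexValueSketch>();

    // Called once per parsed line; the same object goes into both collections.
    public void Add(string key, string value)
    {
        if (_dictionary.ContainsKey(key)) return; // keep the first occurrence
        var item = new KeyIndexValueSketch { Key = key, Index = _list.Count, Value = value };
        _dictionary.Add(key, item);
        _list.Add(item);
    }

    // Lookup by word goes through the hash table.
    public string GetValue(string key)
    {
        KeyIndexValueSketch item;
        return _dictionary.TryGetValue(key, out item) ? item.Value : null;
    }

    public int IndexOf(string key)
    {
        KeyIndexValueSketch item;
        return _dictionary.TryGetValue(key, out item) ? item.Index : -1;
    }

    // Lookup by position goes through the list.
    public string KeyAt(int index) { return _list[index].Key; }

    public int Count { get { return _list.Count; } }
}
```

Storing the index inside each entry is what lets a word lookup hand over to the list: find the entry by key, then walk its neighbours by position.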

You may ask yourself, "Are we crazy to add the same object to both _dictionary and _list? It uses more memory than necessary!" Yes, you are right: this needs more memory. But it speeds up the program and at the same time makes the code more flexible. It is a tradeoff between memory usage on one side and speed and flexibility on the other.

So why do we need a Dictionary and a List at the same time? Because each one is good at a different job.

To look up the definition of a word, we use the Dictionary object, which finds an entry by its key. But when we need a range of words (e.g. for auto-complete), the Dictionary is not suitable because it offers no access by position, so we use the List object instead.

This approach has its disadvantages too. For example, the German to English dictionary (about 240,000 words) needs 70 MB of memory, and the English to Persian dictionary (about 60,000 words) needs 20 MB.

You can see all properties and methods in the class diagram below:

Screenshot - Pack___KIV_-_Diagram.gif

Using the Code

Using the dictionary is very simple. First, we load a DictionaryPack instance, as shown in the following code:

private Core.DictionaryPack _dictionaryPack;

private void EyeDictionaryForm_Load(object sender, EventArgs e)
{
    // Choose the languages for translating
    _dictionaryPack = Core.DictionaryPack.LoadDictionary(
        EyeDictionary.Core.TranslatingLanguages.DeutschlandToEnglish);
}

Now we show how to use the DictionaryPack methods.

(key in the code samples below is the word entered by the user.)

  1. If we want only one definition of a word, we use:

    string value = _dictionaryPack.GetValue(key);

  2. If we want all the definitions of a word, we use:

    string[] meanings = _dictionaryPack.GetMeanings(key);

  3. If we want the index of a specific key:

    int index = _dictionaryPack.IndexOf(key);

  4. If we want to know whether _dictionaryPack contains a specific key:

    bool exist = _dictionaryPack.ContainsKey(key);

Auto-Complete

This application uses a ListBox to provide auto-complete. To fill the list box, we first find the dictionary entry that is most similar to the word entered, and then fetch the auto-complete candidates for its index:

// Find the entry closest to the typed word
DictionaryKeyIndexValue kiv = Level1AutoCompeleteWord(key, AutoCompeleteLevel.Level3);
int properIndex;
// Assumed: properIndex is an out parameter, since it is unassigned before the call
string[] autos = _dictionaryPack.GetAutoCompletedBoundaries(kiv.Index, out properIndex);
listBoxAutoCompleteWords.Items.AddRange(autos); // Fill ListBox with autos
listBoxAutoCompleteWords.SelectedIndex = properIndex; // Select the closest match
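How the closest word and its neighbours are found is not shown in this article. One possible way, offered only as a sketch, is a binary search over a sorted word list followed by taking a small window around the hit. AutoCompleteSketch, FindClosestIndex and GetWindow are hypothetical names, and the sketch assumes the list is sorted with ordinal comparison.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the project's auto-complete code.
public static class AutoCompleteSketch
{
    // Index of the first word that is >= typed in a list sorted with
    // ordinal comparison (lower-bound binary search). Returns the last
    // index if every word is smaller, or -1 for an empty list.
    public static int FindClosestIndex(List<string> sortedWords, string typed)
    {
        int lo = 0, hi = sortedWords.Count;
        while (lo < hi)
        {
            int mid = lo + (hi - lo) / 2;
            if (string.CompareOrdinal(sortedWords[mid], typed) < 0) lo = mid + 1;
            else hi = mid;
        }
        return lo < sortedWords.Count ? lo : sortedWords.Count - 1;
    }

    // Returns up to 2 * radius + 1 neighbours around index; selected is the
    // position of the word at index inside the returned window.
    public static string[] GetWindow(List<string> sortedWords, int index, int radius, out int selected)
    {
        selected = -1;
        if (sortedWords.Count == 0) return new string[0];

        index = Math.Max(0, Math.Min(index, sortedWords.Count - 1));
        int start = Math.Max(0, index - radius);
        int end = Math.Min(sortedWords.Count - 1, index + radius);
        selected = index - start;
        return sortedWords.GetRange(start, end - start + 1).ToArray();
    }
}
```

The out parameter plays the same role as properIndex in the snippet above: it tells the caller which row of the list box to select.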

Spellchecking

To use the spellchecker:

string[] words = GetSugesstionWords(key, Core.SugesstionLevel.Level6, append);

There are six spellchecking levels. Each level returns a different set of suggestions; if append is true, the suggestions of all lower levels are returned together with those of the current level.

Each level is described as follows:

1. Core.SugesstionLevel.Level1

At this level, if the user misspells a word, GetSugesstionWords() returns every dictionary word that differs from the misspelled word by exactly one character, or that is one character shorter than it.

Screenshot - level1.gif
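As a rough illustration of the level 1 rule, the two checks can be written as simple string predicates and applied to every dictionary word. This is only a sketch under that reading of the rule; Level1Sketch and its methods are hypothetical names, and the real implementation may work differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the level 1 rule; not the project's code.
public static class Level1Sketch
{
    // True when a and b have equal length and differ in exactly one position.
    public static bool DiffersByOneCharacter(string a, string b)
    {
        if (a.Length != b.Length) return false;
        int differences = 0;
        for (int i = 0; i < a.Length; i++)
        {
            if (a[i] != b[i] && ++differences > 1) return false;
        }
        return differences == 1;
    }

    // True when shorter equals longer with exactly one character removed.
    public static bool IsOneCharacterRemoved(string longer, string shorter)
    {
        if (longer.Length != shorter.Length + 1) return false;
        int i = 0;
        while (i < shorter.Length && longer[i] == shorter[i]) i++;
        // Skip the extra character in longer and compare the rest.
        for (int j = i; j < shorter.Length; j++)
        {
            if (longer[j + 1] != shorter[j]) return false;
        }
        return true;
    }

    // Scans all words; simple but linear in the size of the dictionary.
    public static List<string> Suggest(IEnumerable<string> words, string typed)
    {
        var result = new List<string>();
        foreach (string word in words)
        {
            if (DiffersByOneCharacter(word, typed) || IsOneCharacterRemoved(typed, word))
                result.Add(word);
        }
        return result;
    }
}
```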

2. Core.SugesstionLevel.Level2

It's possible that the word entered by the user has two adjacent characters in the wrong order. This level deals with this kind of misspelling.

Screenshot - level2.gif
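One way to handle swapped neighbours, sketched here under the assumption that the dictionary words are available in a collection with a fast Contains (such as a HashSet), is to generate every adjacent swap of the typed word and keep the ones that exist. Level2Sketch is a hypothetical name.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the level 2 rule; not the project's code.
public static class Level2Sketch
{
    // Yields every string obtained by swapping two adjacent characters.
    public static IEnumerable<string> AdjacentSwaps(string typed)
    {
        char[] chars = typed.ToCharArray();
        for (int i = 0; i + 1 < chars.Length; i++)
        {
            if (chars[i] == chars[i + 1]) continue; // swap would change nothing
            Swap(chars, i);
            yield return new string(chars);
            Swap(chars, i); // restore the original order
        }
    }

    private static void Swap(char[] chars, int i)
    {
        char temp = chars[i];
        chars[i] = chars[i + 1];
        chars[i + 1] = temp;
    }

    // Keeps only the swaps that are real dictionary words.
    public static List<string> Suggest(ICollection<string> dictionaryWords, string typed)
    {
        var result = new List<string>();
        foreach (string candidate in AdjacentSwaps(typed))
        {
            if (dictionaryWords.Contains(candidate)) result.Add(candidate);
        }
        return result;
    }
}
```

A word of length n has at most n - 1 adjacent swaps, so this check stays cheap even for a large dictionary.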

3. Core.SugesstionLevel.Level3

If the user has omitted one character from a word, this level returns all the words that can be formed by adding one character anywhere in the misspelled word.

Screenshot - level3.gif

4. Core.SugesstionLevel.Level4

The same as level three but with two characters missing anywhere in the misspelled word.

Screenshot - level4.gif
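Levels 3 and 4 can both be read as the same test with a different number of missing characters: the typed word must appear, in order, inside the candidate word, and the candidate must be exactly one or two characters longer. The sketch below expresses that reading; Level3Sketch and IsTypedWithInsertions are hypothetical names and this is not necessarily how the project implements it.

```csharp
using System;

// Hypothetical sketch covering levels 3 and 4; not the project's code.
public static class Level3Sketch
{
    // True when word can be obtained by inserting exactly 'missing'
    // characters anywhere into typed, i.e. typed is a subsequence of word
    // and word is 'missing' characters longer.
    public static bool IsTypedWithInsertions(string typed, string word, int missing)
    {
        if (word.Length != typed.Length + missing) return false;
        int matched = 0;
        foreach (char c in word)
        {
            // Greedily match the next typed character when it appears.
            if (matched < typed.Length && typed[matched] == c) matched++;
        }
        return matched == typed.Length;
    }
}
```

With missing set to 1 this corresponds to level 3, and with missing set to 2 to level 4.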

5. Core.SugesstionLevel.Level5

A combination of the other levels.

6. Core.SugesstionLevel.Level6

Returns all the words and phrases containing the word entered.

Screenshot - level6.gif

As you can see, level 6 returns many suggested words.

Using level 3 is advised, because level 6 with append set to true returns too many results.
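The effect of append can be pictured with a small sketch: each level is modelled as a function from the typed word to its suggestions, and append decides whether only the requested level runs or every level up to it. AppendSketch and Collect are hypothetical names, and modelling levels as delegates is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of the append behaviour; not the project's code.
public static class AppendSketch
{
    // levels[0] is level 1, levels[1] is level 2, and so on.
    public static List<string> Collect(
        IList<Func<string, IEnumerable<string>>> levels,
        string typed, int level, bool append)
    {
        var seen = new HashSet<string>();
        var result = new List<string>();
        int first = append ? 1 : level; // with append, start from level 1
        for (int n = first; n <= level; n++)
        {
            foreach (string word in levels[n - 1](typed))
            {
                if (seen.Add(word)) result.Add(word); // skip duplicates across levels
            }
        }
        return result;
    }
}
```

This also shows why a high level with append set to true produces so many results: every lower level contributes its own suggestions first.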

Notice

All settings (such as the text database file path) are stored in a static class in the Global folder. For example, to get the text database file path, use:

Global.Settings.Dictionary.CurrentUsedDictionaryPath

You can add multiple dictionaries for better spellchecking and more definitions, like the concise version of Babylon® dictionary.

Conclusion

Finally, I should mention that this is a description of an application rather than a full article, and it does not claim to be the fastest or best way to develop a dictionary.

Thank you for reading.

Database and References

German to English

  • # Version: 1.5 2007-04-09
  • # Copyright (c): Frank Richter <frank.richter.tu-chemnitz.de>
  • # 1995 - 2007
  • # License: GPL Version 2 or later; GNU General Public License
  • # URL: http://dict.tu-chemnitz.de/

English to Farsi

=======================================================================

[1] Beginning C# 2005 Databases, Apress.

History

  • Update - "De to En" {German(Deutschland) to English} text database fixed

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Hamid Attari
Software Developer TSDDC (TehranShomal Design and Development Center)
Iran (Islamic Republic Of)

Last Updated 21 Oct 2007
Article Copyright 2007 by Hamid Attari