Design a Dictionary with Spellchecker (En-Fa)(De-En)

21 Oct 2007
Design a Dictionary with Spellchecker (English to Farsi AND German to English)

Background

About a month ago, I was searching for a Windows dictionary application that was written in C#, but I couldn't find anything and decided to write it myself.

Introduction

This dictionary has auto-complete functionality and a small spell checker and now works with two text databases – "English to Persian (Farsi)" and "German to English".

Screenshot - MainForm.gif

Application Design

When I began writing this dictionary, I chose a three-tier application design as its main structure. Why three tiers? Because we have a GUI, a text file that serves as the vocabulary database, and a layer that acts as the interface between the two. In my opinion, when an application has a database and one or more user interfaces, and we want to encapsulate database access (so that the rest of the code does not depend on the type of the database, in this case a text file), this design approach is good practice.

Three-Tier Application Design

The three tiers mentioned above are defined as follows [1]:

  1. Data Tier: Access to the database is defined in this tier. It consists of static classes and acts as an interface between the text database and the business tier. No data is stored in this tier; it is only a tool that has permission to load and modify the database. The code for this part can be found in the Data folder in the Project directory.
  2. Business Tier: Includes the classes that define the structure for storing data, and acts as an interface between the data tier and the presentation tier. This tier is the core of the application; its code is in the Core folder in the Project directory.
  3. Presentation Tier: Consists of the GUIs, which have direct access to the business tier and have nothing to do with the data tier. The code is placed in the Form folder.

The relationship between the three tiers is shown in the diagram below:

Screenshot - n-Tier-Design.gif

Data Tier and Text Database Design

The text database that serves as the vocabulary store is a text file in which each entry and its definition are written in the following form:

Entry :: Definition

If an entry has multiple definitions, a hash sign (#) must be added after each definition that is followed by another one, so that # separates the definitions.
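
A minimal sketch of how one line in this format could be parsed, assuming that `::` separates the entry from its definitions and `#` separates multiple definitions. The names `EntryLineParser` and `TryParse` are hypothetical and are not taken from the project's TextDatabase class.

```csharp
using System;

public static class EntryLineParser
{
    // Splits "Entry :: Def1 # Def2" into the entry and its definitions.
    // Returns false for lines without the "::" separator or without an entry.
    public static bool TryParse(string line, out string entry, out string[] definitions)
    {
        entry = null;
        definitions = new string[0];
        if (string.IsNullOrEmpty(line)) return false;

        int separator = line.IndexOf("::", StringComparison.Ordinal);
        if (separator < 0) return false;

        entry = line.Substring(0, separator).Trim();
        string[] parts = line.Substring(separator + 2)
            .Split(new[] { '#' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < parts.Length; i++) parts[i] = parts[i].Trim();
        definitions = parts;
        return entry.Length > 0;
    }
}
```

For example, the line `Haus :: house # home` would yield the entry `Haus` with the two definitions `house` and `home`.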

Screenshot - Sample-TextDataBase.gif

You can see "English to Persian" and "German to English" samples below:

Screenshot - De-En-TextDatabasetxt.gif
Screenshot - En-Fa-TextDatabasetxt.gif

Now we must define classes to interact with the text database.

There are two static classes:

  • TextDatabase is for reading and loading the database into a particular structure
  • TextDatabaseModifier class is for adding an entry and its definition to the text database.

Screenshot - TextDatabase_-_Digram.gif

Business Tier Design

First step: We define a structure for storing each entry (Key) and its definition (Value) and its index (Index) in the dictionary, and we name it DictionaryKeyIndexValue.

Second step: We define a class and name it DictionaryPack, the heart of our program. This class is made up of two fields:

C#
System.Collections.Generic.Dictionary<string, Core.DictionaryKeyIndexValue> _dictionary

and

C#
System.Collections.Generic.List<Core.DictionaryKeyIndexValue> _list

This class interacts with the data tier. As the data tier reads each line of the text database, we add a DictionaryKeyIndexValue object to both _dictionary and _list in a DictionaryPack instance.

You may ask, "Why add the same object to both _dictionary and _list? That uses twice the memory!" You are right that this needs more memory, but it speeds up the program and at the same time makes the code more flexible. It is a trade-off between memory usage on one side and speed and flexibility on the other.

So why do we need a Dictionary and a List at the same time? Because each one serves a different kind of access.

To look up the definition of a word, we use the Dictionary object, which finds an entry by its key. To get a list of words (e.g. for the auto-complete functionality), the Dictionary object is not suitable because it cannot be accessed by position, so we use the List object instead.
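
The idea can be sketched as follows. This is a simplified illustration, not the project's DictionaryPack: the names `KeyIndexValue`, `MiniPack`, and `GetKeys` are hypothetical, and keeping only the first definition of a repeated key is an assumption made to keep the example short.

```csharp
using System;
using System.Collections.Generic;

public sealed class KeyIndexValue
{
    public string Key;
    public string Value;
    public int Index;
}

public sealed class MiniPack
{
    private readonly Dictionary<string, KeyIndexValue> _dictionary =
        new Dictionary<string, KeyIndexValue>(StringComparer.OrdinalIgnoreCase);
    private readonly List<KeyIndexValue> _list = new List<KeyIndexValue>();

    // Adds one entry to both collections; Index records its position in the list.
    public void Add(string key, string value)
    {
        if (_dictionary.ContainsKey(key)) return; // assumption: keep the first definition only
        var item = new KeyIndexValue { Key = key, Value = value, Index = _list.Count };
        _dictionary.Add(key, item);
        _list.Add(item);
    }

    // Hash lookup by word: used when translating a single word.
    public string GetValue(string key)
    {
        KeyIndexValue item;
        return _dictionary.TryGetValue(key, out item) ? item.Value : null;
    }

    // Position of a word in the list, or -1 when the word is unknown.
    public int IndexOf(string key)
    {
        KeyIndexValue item;
        return _dictionary.TryGetValue(key, out item) ? item.Index : -1;
    }

    // Positional access: up to 'count' keys starting at 'start',
    // which the hash table alone cannot provide.
    public List<string> GetKeys(int start, int count)
    {
        var keys = new List<string>();
        for (int i = Math.Max(0, start); i < _list.Count && keys.Count < count; i++)
            keys.Add(_list[i].Key);
        return keys;
    }
}
```

In this sketch `KeyIndexValue` is a class, so both collections hold references to the same object and the strings themselves are not copied; the extra cost comes from the second collection's own bookkeeping.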

This approach has its disadvantages too. For example, the German to English dictionary (about 240,000 words) needs 70 MB of memory, and the English to Persian dictionary (about 60,000 words) needs 20 MB.

You can see all properties and methods in the class diagram below:

Screenshot - Pack___KIV_-_Diagram.gif

Using Code

Using the dictionary is very simple. First we have to instantiate an object of DictionaryPack as shown in the following code:

C#
private Core.DictionaryPack _dictionaryPack;

private void EyeDictionaryForm_Load(object sender, EventArgs e)
{
    // Choose the languages to translate between.
    _dictionaryPack = Core.DictionaryPack.LoadDictionary(
        EyeDictionary.Core.TranslatingLanguages.DeutschlandToEnglish);
}

Now we show how to use DictionaryPack methods.

("key" in the code samples below is the word entered by the user.)

  1. If we want only one definition of a word, we use:

    C#
    string value = _dictionaryPack.GetValue(key);
  2. If we want all the definitions of a word, we use:

    C#
    string[] meanings = _dictionaryPack.GetMeanings(key);
  3. If we want the index of a specific key:

    C#
    int index = _dictionaryPack.IndexOf(key);
  4. If we want to know whether _dictionaryPack contains a specific key:

    C#
    bool exist = _dictionaryPack.ContainsKey(key);

Auto-Complete

This application uses a ListBox to provide auto-complete functionality. To fill the list box with auto-completed words, we first find the word that is most similar to the word entered:

C#
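// Find the entry closest to the typed word.
DictionaryKeyIndexValue kiv = Level1AutoCompeleteWord(key, AutoCompeleteLevel.Level3);
int properIndex;
// properIndex is assumed to be an out parameter, since it is unassigned before the call.
string[] autos = _dictionaryPack.GetAutoCompletedBoundaries(kiv.Index, out properIndex);
listBoxAutoCompleteWords.Items.AddRange(autos); // Fill the ListBox with autos
listBoxAutoCompleteWords.SelectedIndex = properIndex; // Select the closest match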
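
One way such a lookup could work is sketched below. It is not the project's implementation: the names `AutoCompleteSketch`, `FindNearest`, and `GetWindow` are hypothetical, and the sketch assumes that the word list is sorted with `StringComparer.OrdinalIgnoreCase` and contains no duplicate keys.

```csharp
using System;
using System.Collections.Generic;

public static class AutoCompleteSketch
{
    // Returns the position of the typed word, or of the first key that sorts
    // after it, using a binary search over the sorted list.
    public static int FindNearest(List<string> sortedKeys, string typed)
    {
        int index = sortedKeys.BinarySearch(typed, StringComparer.OrdinalIgnoreCase);
        if (index < 0) index = ~index; // insertion point when there is no exact match
        return Math.Min(index, sortedKeys.Count - 1);
    }

    // Returns up to 'radius' neighbours on each side of 'center' and reports
    // where 'center' lands inside the returned window.
    public static string[] GetWindow(List<string> sortedKeys, int center, int radius,
                                     out int selectedIndex)
    {
        int start = Math.Max(0, center - radius);
        int end = Math.Min(sortedKeys.Count - 1, center + radius);
        selectedIndex = center - start;
        return sortedKeys.GetRange(start, end - start + 1).ToArray();
    }
}
```

The window plays the role of the string array that fills the list box, and `selectedIndex` plays the role of the index that is selected in it.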

Spellchecking

For using the spellchecker:

C#
string[] words = GetSugesstionWords(key, Core.SugesstionLevel.Level6, append);

There are six levels of spellchecking. Each level returns a different set of suggestions, and if append is true, the suggestions of all lower levels are returned along with those of the requested level.

Each level is described as follows:

1. Core.SugesstionLevel.Level1

In this level, if the user misspells a word, GetSugesstionWords() returns all the dictionary words that differ from the misspelled word in exactly one character, or that have one character fewer than the misspelled word.

Screenshot - level1.gif
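
A minimal sketch of a Level 1 style check, assuming the simplest approach of comparing the typed word against every dictionary word. The names `Level1Sketch`, `DiffersByOneChar`, and `IsOneDeletion` are hypothetical; the project's own code may work differently.

```csharp
using System.Collections.Generic;

public static class Level1Sketch
{
    // True when both words have the same length and differ in exactly one position.
    public static bool DiffersByOneChar(string a, string b)
    {
        if (a.Length != b.Length) return false;
        int differences = 0;
        for (int i = 0; i < a.Length; i++)
            if (a[i] != b[i] && ++differences > 1) return false;
        return differences == 1;
    }

    // True when 'shorter' can be obtained by deleting exactly one character from 'longer'.
    public static bool IsOneDeletion(string longer, string shorter)
    {
        if (longer.Length != shorter.Length + 1) return false;
        int i = 0;
        while (i < shorter.Length && longer[i] == shorter[i]) i++;
        // Skip longer[i]; the remaining characters must match exactly.
        for (int j = i; j < shorter.Length; j++)
            if (longer[j + 1] != shorter[j]) return false;
        return true;
    }

    // Scans the whole word list, so the cost grows with the dictionary size.
    public static List<string> Suggest(IEnumerable<string> words, string typed)
    {
        var result = new List<string>();
        foreach (string word in words)
            if (DiffersByOneChar(word, typed) || IsOneDeletion(typed, word))
                result.Add(word);
        return result;
    }
}
```

Comparing against the word list instead of generating candidate spellings keeps the check independent of the alphabet, which matters when the same code serves German and Persian entries.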

2. Core.SugesstionLevel.Level2

It's possible that the word entered by the user has two adjacent characters in the wrong order. This level deals with this kind of misspelling.

Screenshot - level2.gif
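
A minimal sketch of a Level 2 style check, assuming the dictionary words are available in a collection with a fast `Contains` (such as a `HashSet<string>`). The name `Level2Sketch` is hypothetical. Each pair of adjacent characters is swapped in turn and the result is looked up.

```csharp
using System.Collections.Generic;

public static class Level2Sketch
{
    // Returns every dictionary word that results from swapping
    // one pair of adjacent characters in the typed word.
    public static List<string> Suggest(ICollection<string> words, string typed)
    {
        var result = new List<string>();
        char[] chars = typed.ToCharArray();
        for (int i = 0; i + 1 < chars.Length; i++)
        {
            if (chars[i] == chars[i + 1]) continue; // swapping equal letters changes nothing
            Swap(chars, i);
            string candidate = new string(chars);
            if (words.Contains(candidate)) result.Add(candidate);
            Swap(chars, i); // restore the original order
        }
        return result;
    }

    private static void Swap(char[] chars, int i)
    {
        char tmp = chars[i];
        chars[i] = chars[i + 1];
        chars[i + 1] = tmp;
    }
}
```

A word of n characters produces at most n - 1 candidates, so this check stays cheap even for a large dictionary.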

3. Core.SugesstionLevel.Level3

If the user has omitted one character from a word, this level returns all the words that can be formed by adding one character anywhere in the misspelled word.

Screenshot - level3.gif

4. Core.SugesstionLevel.Level4

The same as Level 3, but with two characters missing anywhere in the misspelled word.

Screenshot - level4.gif
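
Levels 3 and 4 can both be sketched with a single check: a dictionary word is a candidate when the typed word is a subsequence of it and the word is exactly one (Level 3) or two (Level 4) characters longer. The names `MissingCharsSketch` and `MatchesWithMissing` are hypothetical, and scanning the whole word list is an assumption made for simplicity.

```csharp
using System.Collections.Generic;

public static class MissingCharsSketch
{
    // True when 'word' can be produced by inserting exactly 'missing' characters
    // anywhere into 'typed' (that is, 'typed' is a subsequence of 'word').
    public static bool MatchesWithMissing(string word, string typed, int missing)
    {
        if (word.Length != typed.Length + missing) return false;
        int t = 0;
        for (int w = 0; w < word.Length && t < typed.Length; w++)
            if (word[w] == typed[t]) t++;
        return t == typed.Length;
    }

    // missing = 1 corresponds to Level 3, missing = 2 to Level 4.
    public static List<string> Suggest(IEnumerable<string> words, string typed, int missing)
    {
        var result = new List<string>();
        foreach (string word in words)
            if (MatchesWithMissing(word, typed, missing)) result.Add(word);
        return result;
    }
}
```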

5. Core.SugesstionLevel.Level5

Combination of other levels.

6. Core.SugesstionLevel.Level6

Returns all the words and phrases containing the word entered.

Screenshot - level6.gif

As you can see, Level 6 returns many suggested words.

By setting append to true, we get the combined results of the current level and all lower levels.

Level 3 is recommended, because Level 6 with append set to true returns too many results.
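
The effect of append can be sketched as follows, assuming each level is available as a function from the typed word to its suggestions. The names `AppendSketch` and `Collect` are hypothetical, and removing duplicate suggestions is an assumption rather than documented behavior.

```csharp
using System;
using System.Collections.Generic;

public static class AppendSketch
{
    // levels[0] plays the role of Level1, levels[1] of Level2, and so on.
    // With append == false only the requested level runs; with append == true
    // every level up to and including the requested one contributes.
    public static List<string> Collect(
        IList<Func<string, IEnumerable<string>>> levels, string typed, int level, bool append)
    {
        if (level < 1 || level > levels.Count)
            throw new ArgumentOutOfRangeException("level");

        var seen = new HashSet<string>();
        var result = new List<string>();
        int first = append ? 1 : level;
        for (int current = first; current <= level; current++)
            foreach (string word in levels[current - 1](typed))
                if (seen.Add(word)) result.Add(word); // skip duplicates, keep order
        return result;
    }
}
```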

Notice

All settings (such as the text database file path) are stored in a static class in the Global folder. For example, to get the text database file path, use:

C#
Global.Settings.Dictionary.CurrentUsedDictionaryPath

You can add multiple dictionaries for better spellchecking and more definitions, like the concise version of Babylon® dictionary.

Conclusion

Finally, I should mention that this write-up describes an application rather than a general technique, and it does not claim to be the fastest or best way to develop a dictionary.

Thank you for reading.

Database and References

German to English

  • # Version: 1.5 2007-04-09
  • # Copyright (c): Frank Richter <frank.richter.tu-chemnitz.de>
  • # 1995 - 2007
  • # License: GPL Version 2 or later; GNU General Public License
  • # URL: http://dict.tu-chemnitz.de/

English to Farsi

=======================================================================

[1] - Apress Beginning C# 2005 Databases

History

  • Update: the "De to En" (German to English) text database was fixed

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer TSDDC (TehranShomal Design and Development Center)
Iran (Islamic Republic of)

Comments and Discussions

 
Re: illustrated dictionary
Hamid Attari, 16-Jun-08 12:14
Sorry it took me so long.
With the 3-tier development process that I have used, this dictionary can't support audio and video, because its database is a simple text file. But with some tweaks, we can approach this goal.
I'll try to publish a new version of the dictionary at the end of summer.