
Wordley

9 Mar 2011, CPOL, 9 min read
A look-up dictionary based on the WordNet database, with auto-complete spelling assistance.

[Screenshot: Wordley_ScreenShot.jpg]

**Note: This update fixes a non-fatal bug in the word search dialog that could cause Wordley to throw an exception with some words.**

Introduction

Wordley is similar in operation to the WordNet application maintained by Princeton University, and uses a subset of WordNet's lexical database as its source of terms and definitions. About the only thing I don't like about WordNet is that to look up a word, you have to actually know how to spell it. If you can't spell it, you can't look it up except by trial and error.

To make looking up words and phrases a bit easier, Wordley incorporates a Windows auto-complete feature that provides suggestions as you type. Click a suggestion, or type the whole word and press Enter, and the definition is provided in the display area below the text box. If your search term isn't in the dictionary, Wordley will search for similar words and offer suggestions if it finds any.

I've tried to make Wordley dirt-simple to use, and while it doesn't include the entire WordNet database, it does provide a dictionary of over 80,000 words and phrases.

To Find a Word or Phrase:

Type or paste your search term in the text box located near the top of the main window. As you type, a drop-down list containing words that start with the text you've typed will open. If Wordley's dictionary contains your search term, it will appear in the list. Click the desired list item to see the definition. Alternatively, you can type the search term and tap the Enter key to get the definition. For best results, type the first letter of your query and hesitate for a half-second to allow the list to open. Then continue typing.

When You Tap the Enter Key...

If the search term does not exist in the dictionary, Wordley will search the dictionary for similar words. If viable suggestions are found, they will be displayed in a small dialog box. Click a suggestion in the list and click the "Accept" button to see the definition. Wordley will inform you accordingly if no suggestions are found.

If your search term is in the dictionary, results are instantaneous. If Wordley has to search for suggestions, you'll notice a delay of two to three seconds.

To Copy Text to the Clipboard:

Once a term has been selected, you can use the standard Windows keyboard shortcut (Ctrl + C) to copy text from the search box or the definition display.

To Paste Text into the Search Box:

Use Ctrl + V to paste text into the search box. This triggers the suggestion search mentioned above, so you'll either see a definition or a suggestion list a few seconds after pasting.

Internet Search Feature

Once you've typed your search term, you can also click the Internet Search button (just right of the text box) to search the web for your term. This is especially handy if Wordley happens to not contain your term. The search opens in your default browser. In the options dialog (more below on options), you can select from Yahoo!, Google, or Bing as Wordley's search engine.

Options Dialog

Click the Options button (the wrench and screwdriver) to open Wordley's tiny Options dialog box. Here you can choose Wordley's default search engine for web searches, whether it opens normally or maximized, and whether it keeps itself on top of other windows.

Help and About Buttons

These buttons open the About dialog and Wordley's short help file.

Using the Code

On Load...

Wordley's resources contain four data files in plain text format. Each of these four files represents one part of speech (adjective, adverb, noun, and verb). They are named accordingly.

In the declarations region of Form1's code, you'll notice five dictionaries:

VB
Dim adv As New Dictionary(Of String, String)(StringComparer.CurrentCultureIgnoreCase)
Dim vrb As New Dictionary(Of String, String)(StringComparer.CurrentCultureIgnoreCase)
Dim noun As New Dictionary(Of String, String)(StringComparer.CurrentCultureIgnoreCase)
Dim adj As New Dictionary(Of String, String)(StringComparer.CurrentCultureIgnoreCase)
Friend masterList As New Dictionary(Of String, String) _
      (StringComparer.CurrentCultureIgnoreCase)

During the Load event, each of these collections is populated from files in My.Resources. Here's the code block for populating the noun dictionary:

VB
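' Split the embedded resource into lines, one "term|definitions" entry per line
Dim loadNOUN() As String = My.Resources.data_NOUN.Split(Chr(10))
For l = 0 To loadNOUN.Length - 1
  Dim tmp() As String = loadNOUN(l).Split("|")
  ' Skip blank or malformed lines that have no pipe delimiter
  If tmp.Length < 2 Then Continue For
  If Not noun.ContainsKey(tmp(0)) Then
    noun.Add(tmp(0), tmp(1))
    If Not masterList.ContainsKey(tmp(0)) Then
      masterList.Add(tmp(0), tmp(0))
    End If
  End If
Next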

In the code above, the loadNOUN string array initially contains the lines of text from My.Resources.data_NOUN. Wordley then loops through the array to populate the dictionary named noun. On each pass through the loop, it creates an array called tmp with two elements: element 0 contains the word or phrase, and element 1 contains the definition. A check against duplicate keys, If Not noun.ContainsKey(tmp(0)), remains in the code even though duplicates are removed from the data files before they are added to Wordley's resources. No harm in playing it safe.

You'll also notice that the masterList dictionary is populated at the same time:

VB
masterList.Add(tmp(0), tmp(0))

The masterList dictionary holds only the words from the four data files, not the definitions. This collection populates Wordley's AutoCompleteStringCollection, autoComp:

VB
Dim autoComp As AutoCompleteStringCollection

This is done at the end of the Load event:

VB
autoComp = txt_Srch.AutoCompleteCustomSource
autoComp.AddRange(masterList.Keys.ToArray)

The txt_Srch control is the TextBox you type in to start the process of getting a definition.
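For the custom list to show up, the TextBox also needs its auto-complete mode and source set. The article doesn't show where Wordley does this (the designer is a likely place), so the following is only a minimal sketch of the same wiring in code. The module name AutoCompleteSketch, the helper AttachAutoComplete, and the choice of AutoCompleteMode.Suggest are assumptions made for illustration.

```vb
Imports System.Collections.Generic
Imports System.Linq
Imports System.Windows.Forms

Module AutoCompleteSketch
    ' Hypothetical helper (not part of Wordley): wires a TextBox to a
    ' custom auto-complete list built from any sequence of words.
    Public Sub AttachAutoComplete(ByVal box As TextBox, _
                                  ByVal words As IEnumerable(Of String))
        ' Suggest opens a drop-down list as the user types (assumed mode).
        box.AutoCompleteMode = AutoCompleteMode.Suggest
        ' CustomSource tells the control to use AutoCompleteCustomSource.
        box.AutoCompleteSource = AutoCompleteSource.CustomSource
        box.AutoCompleteCustomSource.AddRange(words.ToArray())
    End Sub
End Module
```

In Wordley's terms, the call would look something like AttachAutoComplete(txt_Srch, masterList.Keys).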

At first glance, one might expect the Load event to be a bit slow (I did). But it only takes a couple of seconds on a dual-core machine.

Once Wordley has loaded and you have a blinking caret in the TextBox, you can type a search term. Note that the AutoComplete function wasn't really meant for such a large list (about 80,000 entries). Type the first letter and hesitate a half-second or so to let the list open. Then continue typing. If you get too far ahead and the list closes, you can still just tap the Enter key once you've typed your search term.

Clicking a list item or tapping Enter triggers the GetDefinition() Sub. Wordley first checks to see if the search text exists in masterList. If it does, Wordley will generate the HTML to display the definitions in the WebBrowser control below the TextBox. The basic HTML is in My.Resources.DefinitionHTML.

First, two local variables are declared to contain the HTML and the text from the search TextBox:

VB
Dim txt As String = txt_Srch.Text.Trim 'search text
Dim dispStr As String = My.Resources.DefinitionHTML 'HTML

Next, Wordley checks each dictionary to see if the term is defined there. Here's the HTML generation for a noun:

VB
If noun.ContainsKey(txt) Then
   Dim tmp() As String = noun(txt).Split(";")
   dispStr &= "<span class=" & Chr(34) & "word" & Chr(34) & ">"
   dispStr &= "(NOUN)</span>"
   dispStr &= "<ul>" & Chr(10)
   For l = 0 To tmp.Length - 1
    dispStr &= "<li>" & tmp(l) & "</li>"
   Next
   dispStr &= "</ul>" & Chr(10)
   dispStr &= "<hr align=" & Chr(34) & "center" & Chr(34) & _
              " width=" & Chr(34) & "300" & Chr(34) & ">"
End If
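Since the same markup is generated for each part of speech, the pattern could be factored into one helper. This is a sketch only, not Wordley's code: the module DefinitionHtmlSketch and the function BuildSection are hypothetical names, and the call to WebUtility.HtmlEncode (which assumes .NET 4.0 or later) is an addition that the original does not make.

```vb
Imports System.Collections.Generic
Imports System.Net
Imports System.Text

Module DefinitionHtmlSketch
    ' Hypothetical helper: builds the HTML fragment for one part of speech.
    ' label is the heading text, e.g. "NOUN"; definitions are the
    ' semicolon-separated pieces of the dictionary value.
    Public Function BuildSection(ByVal label As String, _
                                 ByVal definitions As IEnumerable(Of String)) As String
        Dim sb As New StringBuilder()
        sb.Append("<span class=""word"">(").Append(label).Append(")</span>")
        sb.Append("<ul>")
        For Each d As String In definitions
            ' Encoding is an added precaution so that characters such as
            ' "<" in a definition cannot break the markup.
            sb.Append("<li>").Append(WebUtility.HtmlEncode(d.Trim())).Append("</li>")
        Next
        sb.Append("</ul>")
        sb.Append("<hr align=""center"" width=""300"">")
        Return sb.ToString()
    End Function
End Module
```

With a helper like this, the noun branch would reduce to something like dispStr &= BuildSection("NOUN", noun(txt).Split(";"c)).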

Triggering the GetDefinition() Sub...

Notice that there is an event handler for txt_Srch's KeyUp event:

VB
Private Sub txt_Srch_KeyUp(ByVal sender As Object, _
        ByVal e As System.Windows.Forms.KeyEventArgs) Handles txt_Srch.KeyUp
    'Enter key
    If e.KeyCode = Keys.Enter Then
      GetDefinition()
    End If
    'paste
    If e.Modifiers.Equals(Keys.Control) AndAlso e.KeyCode = Keys.V Then
      GetDefinition()
      e.Handled = True
    End If
End Sub

Clicking an item in the drop-down list behaves the same as pressing the Enter key while the TextBox has focus. This way the definitions are displayed either by clicking a list item or by pressing Enter if the list is closed. Pasting a search term in the TextBox (Ctrl+V) also triggers the sub. If the search term is not found in masterList, Wordley looks for similar words, and a message box informs you if no suggestions are found.

The Data Files...

Here's an entry from one of the data files:

composite|consisting of separate interconnected parts;
    of or relating to or belonging to the plant family Compositae

There are two delimiters in each entry. The pipe character "|" separates the term (on the left) from the definition. This is what the tmp array handles during the Load event:

VB
Dim tmp() As String = loadNOUN(l).Split("|")

When Wordley checks the dictionaries - If noun.ContainsKey(txt) Then... - a second tmp array is created by splitting the Value of the KeyValue pair on the semicolon ";" character, which is the delimiter between definitions. You'll note that a lot of the definitions include elements contained in quotes. These are sentences demonstrating word use.
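To make the two-level split concrete, here is a small stand-alone sketch that parses one entry in the format shown above. It is not taken from Wordley: the module EntryParseSketch and the function TryParseEntry are hypothetical names, and trimming whitespace and skipping empty pieces are assumptions about how you might want to clean the data.

```vb
Imports System
Imports System.Collections.Generic

Module EntryParseSketch
    ' Hypothetical helper: splits one "term|def1;def2" line into the term
    ' and an array of trimmed definitions. Returns False for lines that
    ' have no pipe delimiter or no term.
    Public Function TryParseEntry(ByVal line As String, _
                                  ByRef term As String, _
                                  ByRef definitions As String()) As Boolean
        term = Nothing
        definitions = Nothing
        If String.IsNullOrEmpty(line) Then Return False

        ' Split on the first pipe only: term on the left, definitions on the right.
        Dim parts() As String = line.Split(New Char() {"|"c}, 2)
        If parts.Length < 2 Then Return False

        term = parts(0).Trim()
        Dim cleaned As New List(Of String)
        For Each d As String In parts(1).Split(";"c)
            If d.Trim().Length > 0 Then cleaned.Add(d.Trim())
        Next
        definitions = cleaned.ToArray()
        Return term.Length > 0
    End Function

    Sub Main()
        Dim entry As String = "composite|consisting of separate interconnected parts;" & _
            " of or relating to or belonging to the plant family Compositae"
        Dim term As String = Nothing
        Dim defs() As String = Nothing
        If TryParseEntry(entry, term, defs) Then
            Console.WriteLine(term)
            For Each d As String In defs
                Console.WriteLine("  - " & d)
            Next
        End If
    End Sub
End Module
```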

Regarding Spelling-Assist (when your search term isn't in the dictionary)...

Wordley's search logic is based on a subset of a spell checker project I've been working on for about a year. If you have Wordley's source code, you'll find two files in Wordley's Resources: "letterlist" and "typolist". The typolist resource contains a list of 5,646 commonly misspelled words and their corrections. The letterlist resource is also a list of common spelling errors, but rather than whole words, it contains "looks like" and "sounds like" errors (f/ph, ie/ei, etc). The third and final check is a call to ExtraLetter(), which deals with erroneous double-letters and most "neighboring key" errors ("op" instead of just "o").

Wordley first searches the typolist for a misspelling. Then it searches your text for the existence of an error from the letterlist. Lastly, it checks for extra letters. This isn't as powerful as a spell checker like those found in MS Word or OpenOffice, but it does well with minor spelling errors. Typing "sojern" or "sogourn" for example will return "sojourn". But, "sogern" will not return results.

The "Letter List"...

Here is a part of the letterlist resource and an explanation of how it works:

ae|ea
air|are,aire
aire|are,air
ard|erd,ird,ared
are|air,aire
arn|orn,ourn,irn,ern
ately|itely

The letterlist dictionary is used by the CheckLetterList() sub. Note the pipe ("|") character in each element of the list. This is the delimiter between the key and the value. When CheckLetterList() is called, Wordley loops through letterlist. If your search term contains a key in letterlist, the value from that key is split into a string array. The substring in the search term matching the key is replaced with each element in the array created from the value. That new string is checked against the dictionary. If the resulting key exists, it's added to the suggestion list. Here's the code:

VB
Private Sub CheckLetterList()
    For ll As Integer = 0 To ltrList.Count - 1 'loop thru letterlist
      If myStr.Contains(ltrList.Keys(ll)) Then 'if letterlist key exists in myStr
        Dim sugg() As String = ltrList.Values(ll).Split(",") 'create array from value
        For Each chkStr As String In sugg 'loop thru value array
          For readMyStr As Integer = 0 To myStr.Length - 1  'loop thru myStr
            Try
              If myStr.Substring(readMyStr).Length >= ltrList.Keys(ll).Length Then
              'avoid index error
                If myStr.Substring(readMyStr, _
                    ltrList.Keys(ll).Length).Equals(ltrList.Keys(ll)) Then
                  newStr = myStr.Remove(readMyStr, ltrList.Keys(ll).Length)
                  newStr = newStr.Insert(readMyStr, chkStr)
                  If main.masterList.ContainsKey(newStr.Trim) _
                  AndAlso Not suggList.ContainsKey(newStr.Trim) Then
                    suggList.Add(main.masterList(newStr.Trim), _
                                 main.masterList(newStr.Trim))
                  End If
                End If
              End If

            Catch ex As Exception
              MsgBox(ex.ToString, MsgBoxStyle.Exclamation, main.title)
            End Try
          Next readMyStr
        Next 'each chkStr
      End If
    Next ll
End Sub
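Here is the same substitution idea reduced to a stand-alone sketch, using the "are|air,aire" entry from the list above. It is a simplified illustration rather than Wordley's code: the module LetterListSketch, the function Substitute, the sample misspelling "repare", and the one-word dictionary are all made up for the example.

```vb
Imports System
Imports System.Collections.Generic

Module LetterListSketch
    ' Hypothetical helper: at every position where key occurs in term,
    ' swap in each alternative and keep the candidates found in knownWords.
    Public Function Substitute(ByVal term As String, _
                               ByVal key As String, _
                               ByVal alternatives As String(), _
                               ByVal knownWords As ICollection(Of String)) As List(Of String)
        Dim found As New List(Of String)
        If key.Length = 0 Then Return found

        Dim pos As Integer = term.IndexOf(key, StringComparison.Ordinal)
        While pos >= 0
            For Each replacement As String In alternatives
                Dim candidate As String = term.Remove(pos, key.Length).Insert(pos, replacement)
                If knownWords.Contains(candidate) AndAlso Not found.Contains(candidate) Then
                    found.Add(candidate)
                End If
            Next
            ' Look for the next occurrence of the key, if any.
            pos = term.IndexOf(key, pos + 1, StringComparison.Ordinal)
        End While
        Return found
    End Function

    Sub Main()
        ' A one-word stand-in for masterList (assumed sample data).
        Dim words As New HashSet(Of String)(StringComparer.CurrentCultureIgnoreCase)
        words.Add("repair")

        ' One letterlist entry: key on the left, alternatives on the right.
        Dim entry() As String = "are|air,aire".Split("|"c)
        For Each s As String In Substitute("repare", entry(0), entry(1).Split(","c), words)
            Console.WriteLine(s)
        Next
    End Sub
End Module
```

For "repare", the key "are" is replaced by "air" and "aire" in turn, giving "repair" and "repaire". Only "repair" is in the sample dictionary, so it is the only suggestion.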

The typolist dictionary...

Here's a KeyValue pair from the typolist resource: abritrary|arbitrary.

The item left of the pipe character is the key, and represents a common spelling error. If your search term matches the key, it's replaced with the value. Here's the code:

VB
Private Sub CompareTypoList()
    Try
      If typoList.ContainsKey(myStr.ToLower) Then
        Dim sugg() As String = typoList(myStr.ToLower).Split(",")
        For lb As Integer = 0 To sugg.Length - 1

          ' Trim once so the lookup uses the same key that was checked
          If main.masterList.ContainsKey(sugg(lb).Trim) _
        AndAlso Not suggList.ContainsKey(sugg(lb).Trim) Then
            suggList.Add(main.masterList(sugg(lb).Trim), main.masterList(sugg(lb).Trim))
          End If
          End If

        Next lb
      End If
    Catch ex As Exception
      MsgBox(ex.ToString, MsgBoxStyle.Exclamation, main.title)
    End Try
End Sub

Extra Letters...

The ExtraLetter() sub simply loops through your search term and removes one letter at a time, then checks the resulting new string against the dictionary. If for example you type "babboon", this sub will add "baboon" to the suggestion list. Here's the code:

VB
Private Sub ExtraLetter()
    For l = 0 To myStr.Length - 1
      newStr = myStr.Remove(l, 1)
      Try
        If main.masterList.ContainsKey(newStr.Trim) _
        AndAlso Not suggList.ContainsKey(newStr.Trim) Then
          suggList.Add(main.masterList(newStr.Trim), _
                       main.masterList(newStr.Trim))
        End If
      Catch ex As Exception
        MsgBox(ex.ToString, MsgBoxStyle.Exclamation, main.title)
      End Try

    Next

End Sub
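The "babboon" example can be traced with a stand-alone sketch of the same loop. Note that removing either "b" produces "baboon", which is why a duplicate check is needed before adding to the suggestion list. The module ExtraLetterSketch, the function DropOneLetter, and the one-word dictionary are hypothetical and exist only for this illustration.

```vb
Imports System
Imports System.Collections.Generic

Module ExtraLetterSketch
    ' Hypothetical helper: removes one character at a time from term and
    ' returns each distinct result that appears in knownWords.
    Public Function DropOneLetter(ByVal term As String, _
                                  ByVal knownWords As ICollection(Of String)) As List(Of String)
        Dim found As New List(Of String)
        For i As Integer = 0 To term.Length - 1
            Dim candidate As String = term.Remove(i, 1)
            If knownWords.Contains(candidate) AndAlso Not found.Contains(candidate) Then
                found.Add(candidate)
            End If
        Next
        Return found
    End Function

    Sub Main()
        ' A one-word stand-in for masterList (assumed sample data).
        Dim words As New HashSet(Of String)(StringComparer.CurrentCultureIgnoreCase)
        words.Add("baboon")

        For Each s As String In DropOneLetter("babboon", words)
            Console.WriteLine(s)
        Next
    End Sub
End Module
```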

Regarding the WordNet Database...

Getting data from the WordNet database is tedious but not overly complex. If you wish to pull and reformat data from WordNet's files, I recommend using a multi-file text editor with a Find-Replace tool that uses Regular Expressions. Spend some time studying the file format at WordNet's website before trying it. It's dry reading, but informative. I don't recommend using a Scintilla-based editor for this. The Scintilla and ScintillaNet components have a line-length restriction and many of the lines in the WordNet database are too long (10,000+ chars). You'll lose data as a result.

As to the slightly corny name for the program... try to think of a name that isn't already in use. "Wordley" is actually used elsewhere, but not as an application. At least not that I saw.

Credits...

History

  • First Release: March 2nd 2011
  • Second Release: Uploaded to CP March 6th, 2011
  • This update adds a couple of features and fixes an occasionally-annoying bug in the Word Search. Also, the letterlist resource has been improved with several more entries for better error detection and comments to make it more readable. I intended to add synonyms and antonyms before updating, but figured I'd better update earlier due to the bug. I'll get the synonyms/antonyms done in a few weeks as time allows and update again.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
United States
I'm not an IT guy. Programming has been a hobby for me (and occasionally useful) ever since a sister in-law introduced me to a TI-99 4/A about a million years ago.

The creative challenge is relaxing and enjoyable. As such, I'd never mess up a fun hobby by doing it for a living.

Now, if I can just get Code Project to add "Truck Driver" to the list of job titles in the profiles...

Comments and Discussions

 
Question: I really like this...
Jim Meadors, 30-Oct-12 19:45
Hi Alan. I was quite impressed with this project when I first saw it as you do many of the things I started studying programming to do. I am now currently writing an article on String Manipulation of XML files and I use your project files as an example. I convert the entire WordNet 3.0 database into 4 files that replace the ones in your project to provide a complete dictionary. I tested it on my i7 and couldn't detect any slowdown but I couldn't detect the slowdown you talk about in your article before. The disadvantage of working on a fast computer. Do you want to preview the files? I'm kind of thinking it will take me another week to get the article ready...
Jim Meadors

General: My vote of 5
Minhajul Shaoun, 28-Mar-11 13:53

General: Re: My vote of 5
Jim Meadors, 7-Nov-12 3:22

General: Bug-fix and minor features upgrade
Alan Burkhart, 5-Mar-11 6:58

General: My vote of 5
Oakman, 2-Mar-11 12:22

General: Re: My vote of 5
Alan Burkhart, 2-Mar-11 12:50
