Posted 19 Aug 2003

Fuzzy Logic Dot Net Fuzzy Word Experiment

A look at approaching words with fuzzy logic


The Fuzzy Word Experiment project is exactly what the title says: an experiment to see whether words can be dealt with in a fuzzy manner. Words can be something of a fuzzy area themselves, so I thought they might lend themselves quite well to being given the Fuzzy treatment.


The first idea was to get a value for the word and use it as its fuzzy value, in much the same way that you would use a fuzzy number ( see earlier articles for the Fuzzy Number explanation ). The value becomes the Number of the word, and a Minimum and a Maximum value are applied so that it is possible to see if words match.

double dValue = 0;
for( int i=0; i<strWord.Length; i++ )
    dValue += strWord[ i ];

The code above is the section that assigns a value to the word; it is simply the sum of the ASCII values of the letters. At one point each value was also multiplied by its position within the word, but this did not get around the problem of different words arriving at the same number. It helped slightly but it didn't fix it.

For example, the word "glass" gives a reading of

Fuzzy Word = glass
Maximum = 548
Minimum = 528
Membership = 1
Compared membership = 0
Number = 538
ID = 0
Name =

The important bit in the readout is the Number variable, which gives a reading of 538. When we compare this to the word "sheep" we hit a problem.

Fuzzy Word = sheep
Maximum = 543
Minimum = 523
Membership = 1
Compared membership = 0
Number = 533
ID = 0
Name =

As you can see, the number for the word "sheep" is 533. No matter which way the comparison between the two words is done, the value of each word falls within the minimum and maximum range of the other. If only the numbers are used, the code running the comparison is therefore highly likely to consider the two words almost identical.
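The collision can be reproduced with a few lines of C#. This is only a sketch: the class and method names are my own and not part of the sample code, the spread of 10 is taken from the Minimum and Maximum values in the readouts, and the position-weighted variant assumes a 1-based position, which the article does not specify.

```csharp
using System;

// Illustrative helper; names are assumptions, not the article's FuzzyWord class.
public static class WordValueSketch
{
    // Sum of the character codes, as in the article's loop.
    public static double Value(string word)
    {
        double total = 0;
        foreach (char c in word)
            total += c;
        return total;
    }

    // Variant that multiplies each character by its position (1-based is assumed).
    public static double WeightedValue(string word)
    {
        double total = 0;
        for (int i = 0; i < word.Length; i++)
            total += word[i] * (i + 1);
        return total;
    }

    // True when the candidate's value lies within value(word) plus or minus spread.
    public static bool InRange(string word, string candidate, double spread)
    {
        return Math.Abs(Value(word) - Value(candidate)) <= spread;
    }
}
```

With these helpers, `Value("glass")` is 538 and `Value("sheep")` is 533, so `InRange("glass", "sheep", 10)` is true in both directions. The weighted variant still collides: `WeightedValue("ab")` and `WeightedValue("ca")` both come to 293.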

The second idea is based on the actual letters in the word. If a certain number of letters in the word are in the correct place, then there is a high likelihood that the person who typed it was aiming at the same word and maybe made a typo. So count the number of letters in the word that match, using,

int nCount = 0;

// count the letters that match at the same position in both words
for( int i=0; i<Word.Length; i++ )
    if( i < comparison.Length )
        if( Word[ i ] == comparison[ i ] )
            nCount++;

This gives the number of letters that are in the correct place. But this then leads to problems with the length of the words. Say you have two words: one is a short word, four or five letters long, and the other is a long word that begins with the short word. If the short word was the Fuzzy Word object, then the code above would let the long word through. To help solve this problem the code deals with Maximum Incorrect Letters and Minimum Matching Letters. Unfortunately, the concept of having a minimum number of matching letters doesn't quite hold up on its own. Some words are just plain short, and if this value is set too high then short words are going to be automatically rejected. For this reason the value has to be kept fairly low, and in the code it defaults to 3.

Having a maximum number of incorrect letters can help out, though: as long as the word has a certain number of letters correct, only so many are allowed to be wrong. Any more than that and the word is automatically rejected. Once again, though, the maximum number of incorrect letters cannot be made too high as it would then block out short words completely.
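The two limits can be sketched as a simple filter. The names below are hypothetical, and the definition of an incorrect letter (any position of the longer word that is not a match) is my assumption, since the article does not spell it out. The minimum of 3 is the default mentioned above; the maximum of 2 used in the examples is just an illustrative value.

```csharp
using System;

// Illustrative filter; names and the "incorrect letter" definition are assumptions.
public static class LetterMatchSketch
{
    // Count positions where both words hold the same character.
    public static int MatchingLetters(string word, string comparison)
    {
        int shared = Math.Min(word.Length, comparison.Length);
        int count = 0;
        for (int i = 0; i < shared; i++)
            if (word[i] == comparison[i])
                count++;
        return count;
    }

    // Assumed: every position of the longer word that is not a match is incorrect.
    public static int IncorrectLetters(string word, string comparison)
    {
        return Math.Max(word.Length, comparison.Length) - MatchingLetters(word, comparison);
    }

    // A candidate gets through only if it satisfies both limits.
    public static bool PassesFilter(string word, string comparison, int minMatching, int maxIncorrect)
    {
        return MatchingLetters(word, comparison) >= minMatching
            && IncorrectLetters(word, comparison) <= maxIncorrect;
    }
}
```

Under these assumptions, `PassesFilter("duck", "ducks", 3, 2)` is true (4 matching, 1 incorrect), while `PassesFilter("duck", "duckling", 3, 2)` is false because the four extra letters all count as incorrect.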

And this is where we come back to the number value for each word. At this point we have a word that has no more than a few incorrect letters but enough correct letters to get through. We still need to give it some sort of Membership value that expresses how the word compares with the word we are testing against. This is done through the code,

FuzzyWord temp = new FuzzyWord( comparison );

/// set the membership value

if( dValue >= this.Number - nValueDifference &&
       dValue <= this.Number + nValueDifference )
    temp.ComparedMembership = 1.12;
else if( dValue >= this.Number - ( nValueDifference * 2 ) &&
       dValue <= this.Number + ( nValueDifference * 2 ) )
    temp.ComparedMembership = 1.25;
else if( dValue >= this.Number - ( nValueDifference * 3 ) &&
       dValue <= this.Number + ( nValueDifference * 3 ) )
    temp.ComparedMembership = 1.37;
else if( dValue >= this.Number - ( nValueDifference * 4 ) &&
        dValue <= this.Number + ( nValueDifference * 4 ) )
    temp.ComparedMembership = 1.50;
else if( dValue >= this.Number - ( nValueDifference * 5 ) &&
        dValue <= this.Number + ( nValueDifference * 5 ) )
    temp.ComparedMembership = 1.63;
else if( dValue >= this.Number - ( nValueDifference * 6 ) &&
        dValue <= this.Number + ( nValueDifference * 6 ) )
    temp.ComparedMembership = 1.75;
else
    temp.ComparedMembership = 1.87;

This compares the value of the word being compared against the Number plus or minus a multiple of a preset value called nValueDifference, which is set to 10 by default. This allows us to arrive at a conclusion based on the comparison, which is then returned to the calling program in the ComparedMembership member.

Of course this doesn't prevent words like "duck" and "suck" being considered almost identical, which is a good thing as they are almost identical. It does differentiate between them, though: when compared to "duck", "suck" will not be returned with a compared membership of 1.0, which indicates a direct match.
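The banding can also be written as a loop over a table, which makes the "duck" and "suck" case easy to check by hand. This is a sketch rather than the library code: the class name is made up, and returning 1.0 for identical strings is my assumption about how a direct match is detected, as the snippet above does not show that branch.

```csharp
using System;

// Illustrative banding; names and the exact-match shortcut are assumptions.
public static class MembershipSketch
{
    // Membership values for bands 1 to 6, taken from the if/else chain.
    static readonly double[] Bands = { 1.12, 1.25, 1.37, 1.50, 1.63, 1.75 };
    const double Fallback = 1.87;

    // Sum of the character codes of the word.
    public static double Value(string word)
    {
        double total = 0;
        foreach (char c in word)
            total += c;
        return total;
    }

    public static double ComparedMembership(string word, string comparison, int valueDifference)
    {
        if (word == comparison)
            return 1.0; // assumed handling of a direct match

        double diff = Math.Abs(Value(word) - Value(comparison));
        for (int band = 0; band < Bands.Length; band++)
            if (diff <= valueDifference * (band + 1))
                return Bands[band];
        return Fallback;
    }
}
```

With a value difference of 10, "duck" sums to 423 and "suck" to 438, a difference of 15, so `ComparedMembership("duck", "suck", 10)` lands in the second band and returns 1.25. "dusky" sums to 560, which is more than 60 away, so it falls through to 1.87.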

Testing The Class

There are two applications provided with the sample code. The first compares words directly to each other to see if there is a match, and the second processes a text file, searching for words specified beforehand.

The Fuzzy Word Test Program ( the FuzzyWordExperiment project ) compares the two words entered, in this case the word "sheep" with "Sheeps" ( I know it should be spelt Sheep's ), and gives the compared membership value for the two words. Remember that when comparing words it is the Compared membership value after a comparison that matters, as words always have a membership value of 1.0.

The Show Word button will show the values for the word in the left hand edit box and the Compare button will call the Fuzzy Word Compare function using the two provided words.

The second Fuzzy Word Test application takes a txt file and parses the text line by line for the words entered in the Words to find drop down list box. Words can be added to the box by typing them in and clicking the add button, and removed by selecting them and clicking the remove button. The two drop down list boxes for the tolerance levels set the upper and lower tolerances; these range from 0.12 to 0.87 and from 1.12 to 1.87.

The Find The Words button opens the text file and reads it line by line, comparing each word in the line to the words listed in the words to find list. If a word is found in the line that is in the list, then the program outputs the find to the edit box and proceeds on to the next line of text. It should be noted that we are not dealing with proper sentences yet, only separate lines of text.
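The scanning loop might look something like the sketch below. Everything here is hypothetical: the names are mine, the comparison is passed in as a delegate so that any matcher can be plugged in, and stopping at the first hit on each line is my reading of the description above.

```csharp
using System;
using System.Collections.Generic;

// Illustrative line scanner; names and the one-hit-per-line rule are assumptions.
public static class LineScanSketch
{
    // Returns "lineNumber: word" for the first hit on each line.
    public static List<string> FindWords(
        IEnumerable<string> lines,
        ICollection<string> wordsToFind,
        Func<string, string, bool> isMatch)
    {
        var hits = new List<string>();
        int lineNumber = 0;
        foreach (string line in lines)
        {
            lineNumber++;
            string[] tokens = line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries);
            string hit = FirstMatch(tokens, wordsToFind, isMatch);
            if (hit != null)
                hits.Add(lineNumber + ": " + hit);
        }
        return hits;
    }

    // First token on the line that matches any of the target words, or null.
    static string FirstMatch(string[] tokens, ICollection<string> wordsToFind, Func<string, string, bool> isMatch)
    {
        foreach (string token in tokens)
            foreach (string target in wordsToFind)
                if (isMatch(target, token))
                    return token;
        return null;
    }
}
```

To run it over a file, the lines could be supplied by `System.IO.File.ReadLines(path)`, and `isMatch` would wrap whatever fuzzy comparison and tolerance check is in use.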

As you can see in the image above, the search runs through the text file, which in this case is the "Origin Of The Species" text file that is provided in the debug directory for the program so that you can play around with the sample code. There is only one item that failed the tolerance level in the above image, and that is the word dusky. It has three letters in common with the word duck, but it fails because it has a tolerance level of 1.87, which is outside the tolerance levels set for this run of the program.

There are also more word-specific aspects to the code that deal with some of the problems encountered when reading words. For example, the class contains an array that holds punctuation characters, so if you have the word duck followed by a comma then this will be selected as a single word by the reading code but the comparison code will remove the comma. See the code for the other punctuation characters that are included.
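One simple way to do this kind of clean-up is `String.Trim` with an array of characters. The sketch below is not the class's own code, and the contents of the array are a guess at a reasonable set rather than the list the library actually uses.

```csharp
using System;

// Illustrative punctuation stripping; the character list is an assumption.
public static class PunctuationSketch
{
    static readonly char[] Punctuation =
        { ',', '.', ';', ':', '!', '?', '"', '\'', '(', ')' };

    // Removes punctuation from both ends of the token, leaving the middle alone.
    public static string Strip(string token)
    {
        return token.Trim(Punctuation);
    }
}
```

`Strip("duck,")` gives "duck", and `Strip("(duck).")` does too. Because only the ends are trimmed, a word such as "don't" keeps its apostrophe.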

There is also code to allow for plurals of the words you are looking for. This is controlled by an optional parameter, so if you want ducks to be treated as a completely separate word from duck then it will be.
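A very rough sketch of such a switch is shown below. The method name, the parameter name and the "s" / "es" rule are all assumptions of mine; the library's actual plural handling may well differ, and a rule this naive knows nothing about irregular plurals.

```csharp
using System;

// Illustrative plural switch; names and the suffix rule are assumptions.
public static class PluralSketch
{
    // With allowPlurals set, "ducks" counts as the same word as "duck".
    public static bool SameWord(string word, string candidate, bool allowPlurals = true)
    {
        if (word == candidate)
            return true;
        if (!allowPlurals)
            return false;
        return candidate == word + "s" || candidate == word + "es";
    }
}
```

`SameWord("duck", "ducks")` is true, while `SameWord("duck", "ducks", false)` is false and keeps the two words separate.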


For me at least this has been an interesting experiment that makes me wonder how far you can push the treatment of words before you have to start getting into a context situation, by which I mean before you have to start understanding what the words themselves mean. I have a number of thoughts on ways to take this but they are of the kind that may be far too stupid to work or far too hard to implement. I guess I'll just have to follow them through and see what happens.


  • 20 August 2003 :- Initial release.


The last article in the series contains the latest code for the library. No attempt will be made at backward compatibility, and I will change the library as I see fit.

Link To Next Article

This is currently the latest article.




This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Comments and Discussions

Re: Some questions ... ( Sébastien Lorion, 1-Sep-03 20:23 )

I forgot to mention:

If you want to make a probabilistic comparison of 2 strings, you could simply select a random letter position and see if the letters match, adding 1 to a count if they match or 0 otherwise. By repeating this operation x times and choosing a good threshold for the count, you could theoretically obtain a comparison result that is safer than comparing letter by letter, if you take into account that a computer may enter a corrupt state for whatever reason (which is kinda funny).
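The sampling idea in this comment could be sketched as follows. The names are made up, and restricting the sampled positions to the shorter word and requiring equal lengths are my assumptions; the comment itself does not say how differing lengths should be treated.

```csharp
using System;

// Illustrative sampled comparison; names and length handling are assumptions.
public static class SampledCompareSketch
{
    // Picks random positions and counts how often the two strings agree there.
    public static int SampleMatches(string a, string b, int trials, Random rng)
    {
        int shared = Math.Min(a.Length, b.Length);
        if (shared == 0)
            return 0;

        int count = 0;
        for (int t = 0; t < trials; t++)
        {
            int i = rng.Next(shared); // position in the range 0 to shared - 1
            if (a[i] == b[i])
                count++;
        }
        return count;
    }

    // Treats the strings as equal when enough sampled positions agree.
    public static bool ProbablyEqual(string a, string b, int trials, int threshold, Random rng)
    {
        return a.Length == b.Length && SampleMatches(a, b, trials, rng) >= threshold;
    }
}
```

Identical strings always score the full number of trials, and strings with no letter in common at any position always score zero; everything in between depends on which positions happen to be drawn.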


Last Updated 20 Aug 2003
Article Copyright 2003 by pseudonym67