Can anyone help me? I have a task that uses Levenshtein distance, but composite letters such as SH, DH, and ZH should each be treated as a single letter. That is, when the Levenshtein distance is calculated, each of these composite letters should count as a distance of 1, not 2.
Comments
Sascha Lefèvre 16-Apr-15 18:33pm    
And how do you know if it's a composite or just two letters that are adjacent by chance? Or can this not happen in your language?
aras89 16-Apr-15 18:46pm    
In my language these letters are always composite letters and do not change when used in different words.
PIEBALDconsult 16-Apr-15 18:49pm    
Write a custom enumerator and maybe a class/struct that takes that into account.

1 solution

There's a very easy solution for this, and you don't have to change the Levenshtein algorithm at all: before you calculate the distance, replace each composite with its own single character that won't appear anywhere else in your text. Control characters work well for this, e.g. ASCII 1, 2, 3.
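The idea can be sketched as follows. This is only a minimal illustration in Python, not code from the thread: the names `DIGRAPHS`, `collapse_digraphs`, `levenshtein`, and `composite_distance` are made up for this example, and it assumes that the comparison is case-insensitive and that the control characters 1 to 3 never occur in the input.

```python
# Hypothetical mapping: each composite letter gets its own control character.
# Assumption: these control characters never appear in the real input.
DIGRAPHS = {"SH": "\x01", "DH": "\x02", "ZH": "\x03"}


def collapse_digraphs(text: str) -> str:
    """Replace every composite letter with its single placeholder character."""
    text = text.upper()  # assumption: upper and lower case compare as equal
    for digraph, placeholder in DIGRAPHS.items():
        text = text.replace(digraph, placeholder)
    return text


def levenshtein(a: str, b: str) -> int:
    """Plain, unmodified Levenshtein distance (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def composite_distance(a: str, b: str) -> int:
    """Levenshtein distance where SH, DH and ZH each count as one letter."""
    return levenshtein(collapse_digraphs(a), collapse_digraphs(b))
```

With this sketch, `composite_distance("SHA", "A")` gives 1, whereas `levenshtein("SHA", "A")` on the raw strings gives 2. If the language has more composite letters, they would simply be added to the mapping with further unused characters.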
 
 
Comments
Sergey Alexandrovich Kryukov 16-Apr-15 18:56pm    
Come to think about it, not a bad idea. My 5.
—SA
Sascha Lefèvre 16-Apr-15 19:02pm    
Thank you! :)
aras89 16-Apr-15 18:57pm    
Thank you very much for this solution. I will try it.
Sascha Lefèvre 16-Apr-15 19:02pm    
You're welcome!

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


