13,735,232 members
Tip/Trick
alternative version

#### Stats

26.8K views
17 bookmarked
Posted 16 Dec 2013
Licenced CPOL

# Levenshtein Edit Distance Algorithm

, 16 Dec 2013
Implement the Levenshtein Edit Distance algorithm.

## Introduction

This source code is for understanding the Levenshtein Edit Distance algorithm easily. When I began to search this algorithm, I was so scared to implement it. I read all of the articles about this algorithm and I really didn't understand anything. When my research was deepening, I was trying to figure out the secret of this common algorithm. When I finished implementing it, I decided to write an article on CodeProject.com. I am hoping my implementation would be useful for people who work to understand The Levenshtein Edit Distance algorithm.

## Background

This algorithm is so commonly used for Microsoft Outlook or Microsoft Word  Spell Checking systems and Google. So thanks Vladimir Levenshtein for this algorithm. I will not try to explain the algorithm's details to you. The purpose of this article is for you to follow steps and see the flow chart of algorithm easily.

The edit distance between two strings is defined as the minimum number of edit operations required to transform one string into another. Let’s assume that the first string is named as the target string and the second string is named as the source string. We want to convert the source string to target. An edit is one of three operations: a delete (a character from the source string), an insert (a character from the target string), and a substitute (a character from the source string with a character from the target string). There is a fourth operation, named as match, which does not count as an edit. Consider two input strings "activate" and "caveat". Below you can see one possible transformation. In the example, a transformation is represented by a string consisting of the characters M for match, S for substitute, D for delete, and I for insert.

D M D S M I M M D

a  c   t  i   v    a  t   e

c      a  v e a  t

Figure 1: Levenshtein Edit Distance algorithm GUI

## Using the code

I will share some code snippets of my implementation here:

In our implementation:

• Delete cost = 0.7
• Insert cost = 0.7
• Substitute cost = 1.0

Match does not have a cost.

```Reference = Ref_text_box.Text;
Hypotheses = Hyp_text_box.Text;

distance_table = new double[Reference.Length + 1, Hypotheses.Length + 1];
for (int i = 0; i <= Reference.Length; i++)
distance_table[i, 0] = i * 0.7;

for (int j = 0; j <= Hypotheses.Length; j++)
distance_table[0, j] = j * 0.7;

for (int i = 1; i <= Reference.Length; i++)
{
for (int j = 1; j <= Hypotheses.Length; j++)
{
if (Reference[i - 1] == Hypotheses[j - 1])//if the letters are same
distance_table[i, j] = distance_table[i - 1, j - 1];
else //if not add 1 to its neighborhoods and assign minumun of its neighborhoods
{
distance_table[i, j] = Math.Min(Math.Min(distance_table[i - 1, j - 1] + 1,
distance_table[i - 1, j] + 0.7), distance_table[i, j - 1] + 0.7);
}
}

}
//create table
//Compute Distance
String distance_string = " ";
String distance_result;
int i = Reference.Length, j = Hypotheses.Length;

while (true)
{
this.dataGridView1.Rows[i].Cells[j+1].Style.BackColor = Color.LightGreen;

if (i == 0 && j == 0)
{
this.dataGridView1.Rows[i].Cells[j].Style.BackColor = Color.LightGreen;

distance_string_textbox.Text = distance_string;
char[] a = distance_string.ToCharArray();
Array.Reverse(a);
distance_result = new string(a);
distance_string_textbox.Text = distance_result;
break;
}
else if (i == 0 && j > 0)
{
this.dataGridView1.Rows[i].Cells[j+1].Style.BackColor = Color.LightGreen;
distance_string += "d";//delete
j--;
}
else if (i > 0 && j == 0)
{
this.dataGridView1.Rows[i].Cells[j+1].Style.BackColor = Color.LightGreen;
distance_string += "i";//insert
i--;
}
else
{
if (distance_table[i - 1, j - 1] <= distance_table[i - 1, j] &&
distance_table[i - 1, j - 1] <= distance_table[i, j - 1])
{
this.dataGridView1.Rows[i].Cells[j+1].Style.BackColor = Color.LightGreen;
if (distance_table[i - 1, j - 1] == distance_table[i, j])
distance_string += "m"; //match
else
distance_string += "s"; //substitue

i--;
j--;
}

else if (distance_table[i - 1, j] < distance_table[i, j - 1])
{
this.dataGridView1.Rows[i ].Cells[j+1].Style.BackColor = Color.LightGreen;
distance_string += "i"; //insert
i--;

}
else if (distance_table[i, j - 1] < distance_table[i - 1, j])
{
this.dataGridView1.Rows[i].Cells[j +1].Style.BackColor = Color.LightGreen;
distance_string += "d"; //delete
j--;
}
}
}```

## Points of Interest

This code snippet will teach to understand the Levenshtein Edit Distance algorithm easily. You can try some words and sentences to compare each other and you can follow the distance table to understand how to compute the distance of your strings.

## Share

 Student University Of Kocaeli Turkey
I am student University of Kocaeli and my department is Electronics and Telecommunication Engineering. It's my last year. So my interests are software developing and hardware developing for embedded platform.

## You may also be interested in...

 First Prev Next
 Bug-Report Mr.PoorEnglish1-Sep-17 21:28 Mr.PoorEnglish 1-Sep-17 21:28
 Different Costs 6-Sep-16 4:37 6-Sep-16 4:37
 Value 0.7 or 1? coder_spree1-May-15 1:15 coder_spree 1-May-15 1:15
 Why 0.7 cost? 10BRG23-May-14 3:34 10BRG 23-May-14 3:34
 Re: Why 0.7 cost? furkanavcu23-May-14 10:02 furkanavcu 23-May-14 10:02
 Re: Why 0.7 cost? furkanavcu7-Feb-15 3:49 furkanavcu 7-Feb-15 3:49
 Re: Why 0.7 cost? 10BRG3-Mar-15 1:49 10BRG 3-Mar-15 1:49
 other algorithm Robert Frank18-Dec-13 17:23 Robert Frank 18-Dec-13 17:23
 Last Visit: 31-Dec-99 18:00     Last Update: 18-Oct-18 12:13 Refresh 1