Click here to Skip to main content
15,891,136 members
Please Sign up or sign in to vote.
1.00/5 (1 vote)
See more:
Hi,

Let assume string1 = harlishr and having list of strings harish, reddy, kiran, aswanth, veeresh...
Now what i want is compare the string1 with all the strings in the list and return the nearest match for the string1 (i.e harish in example)

I tried LevenshteinDistance algorithm but it is different we need to pass two strings and it will return how many characters changed, but in my case it is totally different.

Can anyone explain how can we do that in C#4.0.



Thanks,

Harish
Posted

Hello I think that the LevenshteinDistance algorithm could be the solution to your problem. let me explain:

the default implementation of the algorithm:
C#
static class LevenshteinDistance
{
    /// <summary>
    /// Compute the distance between two strings.
    /// </summary>
    public static int Compute(string s, string t)
    {
        int n = s.Length;
        int m = t.Length;
        int[,] d = new int[n + 1, m + 1];

        // Step 1
        if (n == 0)
        {
            return m;
        }

        if (m == 0)
        {
            return n;
        }

        // Step 2
        for (int i = 0; i <= n; d[i, 0] = i++)
        {
        }

        for (int j = 0; j <= m; d[0, j] = j++)
        {
        }

        // Step 3
        for (int i = 1; i <= n; i++)
        {
            //Step 4
            for (int j = 1; j <= m; j++)
            {
                // Step 5
                int cost = (t[j - 1] == s[i - 1]) ? 0 : 1;

                // Step 6
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1),
                    d[i - 1, j - 1] + cost);
            }
        }
        // Step 7
        return d[n, m];
    }
}


next a simple button with some kind of logic:

C#
private void button1_Click(object sender, EventArgs e)
       {
           string baseString = "Basic";
           List<string> lstStringsToCheck = new List<string>
                                            {
                                                "Bas",
                                                "Test",
                                                "Prod",
                                                "Basist",
                                                "Bar",
                                                "Result",
                                                "Another string"
                                            };
           Dictionary<string,int> resultset = new Dictionary<string, int>();
           foreach (string stringtoTest in lstStringsToCheck)
           {
              resultset.Add(stringtoTest,LevenshteinDistance.Compute(baseString, stringtoTest));
           }
           //get the minimum number of modifications needed to arrive at the basestring
           int minimumModifications = resultset.Min(c=>c.Value);
           //gives you a list with all strings that need a minimum of modifications to become the
           //same as the baseString
           var closestlist = resultset.Where(c => c.Value == minimumModifications);
       }


as you can see I start with a baseString. this is in your example "harlishr". next I create a list with all possible strings.
further I've created a dictionary where I'll save the result from the LevenshteinDistance.compute method.
foreach string in my list I compute the distance and add this to the result dictionary.
afterwards I take the minimum value.
I select all the results that have this minimum value (this is the string or list of strings that is/are the closest to the original)

hope this helps

Kind regards
 
Share this answer
 
Comments
Harish Reddy K 12-Jul-12 4:58am    
Thanks for helping me.
it helps alot.
Tbone1983 12-Jul-12 8:35am    
thanks for the feedback
please don't forget to rate the solution so it could help other people
Member 13699819 5-Oct-23 6:22am    
I'm very late in finding this which seems the exact thing I need. However I've used a code translator to get into VB.Net & can't figure out the correct syntax of this line "int minimumModifications = resultset.Min(c=>c.Value)" which translated to "Dim minimumModifications As Integer = resultset.Min(c=>c.Value)" error is c is undeclared. What would be the corrected code?
Hi,

See my Extended search method[^]Tip/Trick

Hope this will solve your problem

thanks
-Amit
 
Share this answer
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



CodeProject, 20 Bay Street, 11th Floor Toronto, Ontario, Canada M5J 2N8 +1 (416) 849-8900