Compare Strings for Percentage Match

JBenhart asked:

Hello all,

OK, I have been banging my head against the wall for a while now, trying different techniques. None of them are working well.

I have two strings. I need to compare them and get an exact percentage match,

e.g. "four score and seven years ago" TO "for scor and sevn yeres ago"

Well, I first started by comparing every word to every word, tracking every hit, with percentage = count / numOfWords. Nope, that didn't take misspelled words into account. ("four" <> "for" even though it is close.)
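A minimal sketch of that first word-by-word attempt, to make the failure concrete. It assumes words are separated by spaces and compared case-insensitively; `WordMatchPercent` is a hypothetical name, not taken from the original code.

```vb
Imports System
Imports System.Linq

Module WordMatchSketch
    ' Hypothetical helper: percentage of words in "expected" that also appear
    ' verbatim (ignoring case) somewhere in "actual".
    Function WordMatchPercent(expected As String, actual As String) As Double
        Dim separators As Char() = {" "c}
        Dim expectedWords As String() = expected.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        Dim actualWords As String() = actual.Split(separators, StringSplitOptions.RemoveEmptyEntries)
        If expectedWords.Length = 0 Then Return 0.0

        Dim hits As Integer = 0
        For Each word As String In expectedWords
            ' A word only counts when it matches exactly, so "four" never matches "for".
            If actualWords.Contains(word, StringComparer.OrdinalIgnoreCase) Then
                hits += 1
            End If
        Next
        Return 100.0 * hits / expectedWords.Length
    End Function

    Sub Main()
        Console.WriteLine(WordMatchPercent("four score and seven years ago",
                                           "for scor and sevn yeres ago"))
    End Sub
End Module
```

For the two example strings only "and" and "ago" match exactly, so this reports about 33%, even though every other word is just one or two characters off.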

Then I tried comparing the strings char by char, advancing the index into the second string whenever the chars didn't match (to account for misspellings). But I would get false hits, because the first string could contain every char of the second, just not in the exact order of the second. ("stuff avail" <> "stu vail", but it would come back as a match: a low percentage, but a hit. 9 / 11 = 81.8%)

So, I then tried comparing PAIRS of chars in each string. If string1[i] = string2[k] AND string1[i+1] = string2[k+1], increment the count; when they don't match, increment "k" (to account for misspellings: "for" and "four" should come back as a 75% hit). That doesn't seem to work either. It is getting closer, but even with an exact match it only returns 94%. And it really gets screwed up when something is badly misspelled. (Code at the bottom.)

Any ideas or directions to go?

Thanks,

Josh


count = 0
j = 0
k = 0
While j < strTempName.Length - 2 And k < strTempFile.Length - 2
    ' To ignore non letters or digits '
    If Not Char.IsLetter(strTempName(j)) Then
        j += 1
    End If

    ' To ignore non letters or digits '
    If Not Char.IsLetter(strTempFile(k)) Then
        k += 1
    End If

    ' compare pair of chars '
    While (strTempName(j) <> strTempFile(k) And _ 
           strTempName(j + 1) <> strTempFile(k + 1) And _ 
           k < strTempFile.Length - 2)
        k += 1
    End While
    count += 1
    j += 1
    k += 1

End While

perc = count / (strTempName.Length - 1)
Tags: Visual Basic
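
One direction commonly used for this kind of fuzzy comparison is Levenshtein edit distance: count the fewest single-character insertions, deletions, and substitutions needed to turn one string into the other, then scale by the longer length. The sketch below is not a repair of the pair-comparison loop in the question; it swaps in a different technique. The names `EditDistance` and `PercentMatch` are hypothetical, the formula (longest - distance) / longest is one assumed way to turn a distance into a percentage, and the comparison is case-sensitive and treats spaces and punctuation as ordinary characters.

```vb
Imports System

Module SimilaritySketch
    ' Levenshtein edit distance: the minimum number of single-character
    ' insertions, deletions, and substitutions needed to turn s into t.
    Function EditDistance(s As String, t As String) As Integer
        ' d(i, j) = distance between the first i chars of s and the first j chars of t.
        Dim d(s.Length, t.Length) As Integer

        For i As Integer = 0 To s.Length
            d(i, 0) = i   ' delete all i chars
        Next
        For j As Integer = 0 To t.Length
            d(0, j) = j   ' insert all j chars
        Next

        For i As Integer = 1 To s.Length
            For j As Integer = 1 To t.Length
                Dim cost As Integer = If(s(i - 1) = t(j - 1), 0, 1)
                d(i, j) = Math.Min(Math.Min(d(i - 1, j) + 1, d(i, j - 1) + 1),
                                   d(i - 1, j - 1) + cost)
            Next
        Next
        Return d(s.Length, t.Length)
    End Function

    ' Assumed scaling: 100% for identical strings, 0% when every position differs.
    Function PercentMatch(s As String, t As String) As Double
        Dim longest As Integer = Math.Max(s.Length, t.Length)
        If longest = 0 Then Return 100.0
        Return 100.0 * (longest - EditDistance(s, t)) / longest
    End Function

    Sub Main()
        Console.WriteLine(PercentMatch("four", "for"))   ' prints 75
        Console.WriteLine(PercentMatch("four score and seven years ago",
                                       "for scor and sevn yeres ago"))
    End Sub
End Module
```

With this scaling, "four" against "for" is one edit out of four characters, which gives the 75% the question expects, and identical strings always give 100%. By contrast, the loop in the question runs at most Length - 2 times but divides by Length - 1, so even an exact match cannot reach 100% there.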
