Hello,

I am using SQLite to store data for a program that tracks TV show info. I am having problems matching the user's info to the official episode titles. The user's information could be misspelled or completely incorrect. I need a way to return the best match for the end user's potentially partially incorrect title names. In other words, I can't use a straight phrase-matching system.

Being basically a beginner at SQL, I am not sure how to do a fuzzy match against incorrect info. I know you can use LIKE and wildcards, but how can that work with a whole string? Or do I split the title into words and then search that way? Any help would be greatly appreciated!

Thanks!
Comments
Sergey Alexandrovich Kryukov 15-Jun-13 0:46am    
What you say has nothing to do with fuzzy search and is very naive. Do you seriously think that one or a few Quick Answers can help you get started? Do you seriously think that you can approach this problem at the SQL level, especially as an SQL beginner? To start with, are you familiar with the very basics, such as fuzzy sets and perhaps fuzzy logic? This is a very non-trivial topic.
—SA
JBenhart 15-Jun-13 2:15am    
Ah, SA, still being insulting to those with honest questions? I hope you don't respond to every wayward developer in this way. Calling someone naive and being demeaning isn't helpful. Obviously, I do believe someone with a friendly ear and a little patience would be able to help me, or I wouldn't have posted my question here, as this site has helped in the past. And since my level of SQL is somewhat limited, I wouldn't know whether there was an easy solution or not, now would I? It could be a simple SQL procedure call or even a basic WHERE clause. That is why I asked. So instead of insulting an honest, albeit "naive", question, maybe you could take the stance of trying to be helpful.

To start with, it would be good to understand the basics of fuzzy sets and fuzzy logic. I would say the prerequisite for that is a good understanding of "classical", non-fuzzy set theory and logic.

I hope you can read and understand these articles, and perhaps some of their references, at least enough to see how non-trivial these fields are:
http://en.wikipedia.org/wiki/Fuzzy_set,
http://en.wikipedia.org/wiki/Fuzzy_logic.

These days, software for approximate ("fuzzy") matching is increasingly common, including string search algorithms:
http://en.wikipedia.org/wiki/Approximate_string_matching,
http://en.wikipedia.org/wiki/Levenshtein_distance.

See also: http://ntz-develop.blogspot.com/2011/03/fuzzy-string-search.html.

You can get an idea of fuzzy matching based on Levenshtein distance (see above) from these CodeProject articles:
http://www.codeproject.com/Articles/162790/Fuzzy-String-Matching-with-Edit-Distance,
http://www.codeproject.com/Articles/36869/Fuzzy-Search.
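
To make the Levenshtein idea concrete, here is a minimal sketch in Python (chosen only for illustration; nothing in this thread prescribes a language). The function names `levenshtein` and `best_match` and the sample titles are hypothetical, and the sketch assumes the official titles have already been loaded into a list:

```python
from typing import Iterable


def levenshtein(a: str, b: str) -> int:
    """Return the edit distance (insertions, deletions, substitutions) between a and b."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop to save memory
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (free when the characters match)
            ))
        previous = current
    return previous[-1]


def best_match(user_title: str, official_titles: Iterable[str]) -> str:
    """Return the official title with the smallest case-insensitive edit distance."""
    wanted = user_title.lower()
    return min(official_titles, key=lambda title: levenshtein(wanted, title.lower()))


# Hypothetical sample data, not taken from any real episode list.
titles = ["Pilot", "The Long Night", "Homecoming"]
print(best_match("The Lnog Nite", titles))
```

Each comparison costs time proportional to the product of the two string lengths, so for a large catalogue it may be worth narrowing the candidates first, for example by show or season.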

You can find a lot more: http://bit.ly/17Nyzzj.

I would not count on quick forum questions and quick answers here, nor on short cookbook recipes and ready-to-use solutions. Fairly serious education is required; self-education will do, but it is still education.

—SA
 
Comments
JBenhart 15-Jun-13 2:18am    
I have used the Levenshtein distance algorithm in my code before, but it wasn't part of SQL directly. I thought there might be an easier SQL call for this kind of search. I was mistaken.
JBenhart 15-Jun-13 2:26am    
And the lmgtfy link was ever so helpful. Gosh, why didn't I think of that? #sarcasmimplied
Sergey Alexandrovich Kryukov 15-Jun-13 22:29pm    
Just the fact you put your own "answer" and even self-accepted it tells me a lot. This is fake, my friend, and you are a cheater.
—SA
Monjurul Habib 16-Jun-13 15:56pm    
bundle of links :) 5+
Sergey Alexandrovich Kryukov 16-Jun-13 22:36pm    
Thank you, Monjurul.
—SA
It seems I will just have to search through the data in VB, since I've been informed, ever so precisely, that I cannot do it in SQL.
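
For what it is worth, the two approaches need not be mutually exclusive with SQLite: the host application can register its own function and then call it from a query. A rough sketch using Python's standard `sqlite3` and `difflib` modules follows (Python stands in for VB here). The `episodes` table, the `similarity` function name, the sample titles, and the 0.6 threshold are all assumptions for illustration, and `SequenceMatcher.ratio()` is a similarity score from `difflib`, not Levenshtein distance:

```python
import difflib
import sqlite3


def similarity(a, b):
    """Return a 0.0 to 1.0 similarity score; NULL inputs score 0.0."""
    if a is None or b is None:
        return 0.0
    return difflib.SequenceMatcher(None, a, b).ratio()


def make_demo_db():
    """Build an in-memory database with a hypothetical episodes table."""
    conn = sqlite3.connect(":memory:")
    # Expose the Python function to SQL as similarity(x, y).
    conn.create_function("similarity", 2, similarity)
    conn.execute("CREATE TABLE episodes (id INTEGER PRIMARY KEY, title TEXT NOT NULL)")
    conn.executemany(
        "INSERT INTO episodes (title) VALUES (?)",
        [("Pilot",), ("The Long Night",), ("Homecoming",)],  # made-up sample titles
    )
    return conn


def closest_title(conn, user_title, min_score=0.6):
    """Return the best-scoring title, or None if nothing reaches min_score."""
    row = conn.execute(
        "SELECT title, similarity(lower(title), lower(?)) AS score "
        "FROM episodes ORDER BY score DESC LIMIT 1",
        (user_title,),
    ).fetchone()
    if row is None or row[1] < min_score:
        return None
    return row[0]


conn = make_demo_db()
print(closest_title(conn, "Homecomming"))
```

The function runs once per row, so this is a full table scan; that is probably fine for one show's episode list and less so for a very large table.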
 
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


