How can I find the letter sequences that occur more than once in a paragraph, in a C# WinForms application?

Example: "My name is jonathan. I prefer science than humanties."

For the example above, the output should be "than" and "ie", since each appears twice in the paragraph. [A match must be at least 2 letters long.]
Can somebody give me an idea how to achieve this?



Thanks in Advance
Updated 9-May-14 4:30am
Comments
[no name] 9-May-14 10:24am    
And what about "ie", which occurs in both "science" and "humanties"?
agent_kruger 9-May-14 10:30am    
Nice observation, sir. I have updated the question.
Sergey Alexandrovich Kryukov 9-May-14 10:50am    
Besides, "I prefer science than humanties" is not correct usage. "Prefer" does not assume comparison, so it is incompatible with "than". "Humanties" is misspelled. If this is supposed to be a tool for improving the writing, it won't really help much. :-)
—SA
agent_kruger 10-May-14 0:34am    
Sorry for the misspelling, sir. Do you know how to achieve this?
Sergey Alexandrovich Kryukov 10-May-14 19:07pm    
Please, no need to apologize; I just wanted to help you with usage/spelling. I don't think the problem is too difficult, but solving it with reasonable efficiency needs some thinking...
—SA

You need to write a string parser that looks for matching letter sequences across the words. Start by extracting every sequence of two or more letters from the first word in the sentence, and search for other occurrences of each one in the remaining words. Keep a note and a count of all matches. Then repeat the process for each of the other words in the sentence until you have processed everything.
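A minimal sketch of the word-by-word idea above, written in Python for brevity (the same loops port directly to C# with `Substring` and a `Dictionary<string, int>`). The function name `repeated_sequences` and the `min_len` parameter are made up for illustration. Instead of searching the remaining words for each sequence, it counts every sequence in a single pass, which should give the same totals.

```python
import re
from collections import Counter

def repeated_sequences(text, min_len=2):
    """Return {sequence: count} for letter sequences seen more than once."""
    # Split into lowercase words so matches never span spaces or punctuation.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for word in words:
        # Enumerate every substring of the word with at least min_len letters.
        for start in range(len(word)):
            for end in range(start + min_len, len(word) + 1):
                counts[word[start:end]] += 1
    # Keep only the sequences that were counted more than once.
    return {seq: n for seq, n in counts.items() if n > 1}

found = repeated_sequences("My name is jonathan. I prefer science than humanties.")
print(found["than"], found["ie"])  # 2 2
```

Note that this also reports the shorter pieces inside a longer match (for example "th" and "an" as well as "than"). If only the longest matches are wanted, the result needs an extra filtering step.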
 
Comments
agent_kruger 10-May-14 0:39am    
Sir, do you mean that I have to loop over each character and check whether the paragraph contains more than one occurrence of it?
Richard MacCutchan 10-May-14 3:17am    
Yes; after all, how else could you find all occurrences?
agent_kruger 10-May-14 4:43am    
Sir, suppose the word is "pneumonoultramicroscopicsilicovolcanoconiosis". It will check all 45 letters, but how will it find sequences of 2 or more letters that repeat?
Richard MacCutchan 10-May-14 5:09am    
Start with 'p' and compare it against all the remaining letters. Next go to 'n' and check it against all the remaining letters. Repeat until you have checked them all.
agent_kruger 10-May-14 6:45am    
Sir, that is fine for single letters, but what about checking sequences of more than one letter? In the word above, for example, "si" appears twice.
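One possible way to extend the letter-by-letter comparison to longer sequences, sketched in Python (the name `repeats_in_word` and the `min_len` parameter are assumptions for illustration, not something from the answer above): compare every position with every later position, and when the letters match, keep extending the match one letter at a time.

```python
def repeats_in_word(word, min_len=2):
    """Return the set of sequences (min_len letters or more) that occur
    at two or more positions in word."""
    found = set()
    n = len(word)
    for i in range(n):
        for j in range(i + 1, n):
            # Extend the match while the letters at both positions agree.
            length = 0
            while j + length < n and word[i + length] == word[j + length]:
                length += 1
                if length >= min_len:
                    found.add(word[i:i + length])
    return found

repeats = repeats_in_word("pneumonoultramicroscopicsilicovolcanoconiosis")
print("si" in repeats)  # True
```

This is a brute-force check, so it slows down quickly on long text, and it treats overlapping occurrences as matches. For a single word it should be more than fast enough.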
This looks like a job for suffix trees.
A book reference is: Algorithms on Strings, Trees, and Sequences.
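A real suffix tree is too long to show here, so the following Python sketch swaps in a simpler relative: a naively built sorted list of suffixes (a basic suffix array). It relies on the same observation, namely that any repeated substring is a shared prefix of two suffixes that sit next to each other in sorted order. The function name `repeated_substrings` and the `min_len` parameter are hypothetical, and this version is much slower than a proper suffix tree on large inputs.

```python
def repeated_substrings(text, min_len=2):
    """Return the set of letter-only substrings (min_len or longer)
    that occur more than once in text."""
    text = text.lower()
    # Naive suffix array: every suffix of the text, sorted alphabetically.
    suffixes = sorted(text[i:] for i in range(len(text)))
    found = set()
    for a, b in zip(suffixes, suffixes[1:]):
        # Length of the common prefix of two neighbouring suffixes.
        k = 0
        while k < min(len(a), len(b)) and a[k] == b[k]:
            k += 1
        # Every prefix of that shared part repeats; keep letter-only ones.
        for length in range(min_len, k + 1):
            piece = a[:length]
            if not piece.isalpha():
                break
            found.add(piece)
    return found

result = repeated_substrings("My name is jonathan. I prefer science than humanties.")
print("than" in result, "ie" in result)  # True True
```

Unlike a word-by-word scan, this works on the whole paragraph at once; the `isalpha` check is what stops matches from running across spaces and punctuation.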
 
Comments
[no name] 9-May-14 12:59pm    
That is a good hint, my 5
Matt T Heffron 9-May-14 13:02pm    
Thanks
agent_kruger 10-May-14 0:37am    
Sir, can you give me an example of how to use suffix trees?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


