Hello Brother, Sister, Buddy, Programmer, Master. :) :) :)

Searching with this code takes too long to run.
Please recommend how to make it more efficient.

How can I change or improve code like this:

C#
static List<string> getDBList(string DBname)
{
     List<string> listWords = new List<string>();
     string[] files;

     try
     {
         files = Directory.GetFiles(@"dbase/", DBname); 
         foreach (string file in files)
             foreach (string line in File.ReadAllLines(file))//doubt
                listWords.Add(line.Trim().ToUpperInvariant());
     }
     catch (Exception ex)
     {
         Console.WriteLine(ex.ToString());
         return new List<string> { };
     }

     return listWords;
}


Then...

C#
//MAIN PROGRAM
string allInput = rtbInput.Text;

List<string> splitString = new List<string>(allInput.Split(new char[] { ' ', '\t', etc...}));
List<int> AllIndexes = new List<int>();
HashSet<string> nounList = new HashSet<string>(getDBList("nounList.txt"));//doubt

int startIndexes = 0;

foreach (string s in splitString)
{
    if (s.Trim() != "")
    {
       string word = s.Trim();

       if(!(nounList.Contains(word.ToUpperInvariant())))   //doubt, if not found, color it
       { 
               tbTest.Text += word + " ";

               //index to begin color the text
               AllIndexes = WordsIndex(word, startIndexes);

               foreach (int item in AllIndexes) //Coloring all appearance of the word.
               {
                   tbSeeIndex.Text += Convert.ToString(" " + item + " ");

                   rtbInput.Select(item, word.Length);

                   startIndexes = item + word.Length;

                   rtbInput.SelectionColor = Color.Red;
              }

              tbL.Text += Convert.ToString(" " + startIndexes + " ");
        }
    }
}            


nounList.txt (90,963 words) contains entries like these:
book
chair
pencil
table
etc...

Please help me.....

Thanks a lot. :) :) :) Cheers
Updated 12-Mar-13 21:23pm

You are not getting the idea. Dictionaries are used for speed, yes. But for speed of what? The speed of finding a value by unique key. The computational time complexity of this search is O(1); that is, it asymptotically does not depend on the number of items.
Please see:
http://en.wikipedia.org/wiki/Computational_complexity_theory[^],
http://en.wikipedia.org/wiki/Communication_complexity[^],
http://en.wikipedia.org/wiki/Big_O_notation[^].
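
To make the "value by unique key" point concrete, here is a minimal sketch. The class name, method names and the word-count data are made up for illustration and are not from the question; the only library calls used are Dictionary<TKey,TValue>.TryGetValue and the indexer.

```csharp
using System;
using System.Collections.Generic;

public static class KeyLookupSketch
{
    // Hypothetical data: each word (the key) maps to how often it was seen (the value).
    public static Dictionary<string, int> BuildCounts(IEnumerable<string> words)
    {
        var counts = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string word in words)
        {
            int current;
            counts.TryGetValue(word, out current); // current is 0 when the key is absent
            counts[word] = current + 1;
        }
        return counts;
    }

    // The key is hashed, so on average the lookup does not scan the other entries.
    public static int CountOf(Dictionary<string, int> counts, string word)
    {
        int value;
        return counts.TryGetValue(word, out value) ? value : 0;
    }
}
```

The dictionary pays off here only because the value (a count) is different from the key (the word).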

If you simply read the lines of a file, you can use them as values or as keys, but not as the key-value pairs used in a dictionary. And if your keys and values are the same, it gives you nothing: if you know they are the same and have a key, why look for a value? (It will guarantee key uniqueness, but for that purpose you should use the class System.Collections.Generic.HashSet<T>.)
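
When all that is needed is "is this word in the collection or not", a HashSet<string> could look roughly like the sketch below. The names WordSetSketch, Build and FindUnknown are invented for illustration, and the case-insensitive comparer is an assumption standing in for the ToUpperInvariant calls in the question.

```csharp
using System;
using System.Collections.Generic;

public static class WordSetSketch
{
    // Builds a set of known words; duplicates and blank entries are dropped.
    public static HashSet<string> Build(IEnumerable<string> words)
    {
        var set = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string w in words)
        {
            string trimmed = w.Trim();
            if (trimmed.Length > 0)
                set.Add(trimmed); // Add returns false for a duplicate; ignored here
        }
        return set;
    }

    // Returns each distinct token of the text that is not in the known set,
    // in order of first appearance.
    public static List<string> FindUnknown(HashSet<string> known, string text)
    {
        var unknown = new List<string>();
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        char[] separators = { ' ', '\t', '\r', '\n' };
        foreach (string token in text.Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            if (!known.Contains(token) && seen.Add(token))
                unknown.Add(token);
        }
        return unknown;
    }
}
```

For example, with a set built from "book" and "chair", FindUnknown(set, "BOOK tabel chair tabel") yields a list holding only "tabel".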

This way, your question simply makes no sense. If you explain what you are going to do with a dictionary (first of all, to yourself), I would gladly help. :-)

—SA
 
 
Comments
Berry Harahap 13-Mar-13 0:26am    
Dear Sergey Alexandrovich (SA)...
I need a Dictionary for faster searching. After I create the Dictionary, I want to check whether a given input string is found in it or not. I think it is faster than a List.
What do you think, Sir?
Sergey Alexandrovich Kryukov 13-Mar-13 1:18am    
I already explained everything. I think that you still did not get it. You cannot compare List and Dictionary, they are not different implementations, they have different meaning. You cannot use Dictionary instead of List just as is. Please read my answer more carefully, as well as the help page on System.Collections.Generic.Dictionary<>. Dictionary element is key-value, List element is just value.
You need to decide on the following:
1) what search criteria do you need?
2) what should be your key?
3) what should be your value?
And, after all, you can read documentation and finally understand the idea by yourself. Please try.
And when you understand it all, please accept my answer formally (green button).
Good luck,
—SA
Berry Harahap 13-Mar-13 1:47am    
Sir SA, I get the point. What do you think about my code above (in the question)? I have improved the question. :) :) :) Cheers Sir. :)
Sergey Alexandrovich Kryukov 13-Mar-13 1:58am    
Ok, at first glance it should work, but it only ensures uniqueness of all items. You cannot accelerate anything, because, as I say, you cannot search by key...
What is exactly your purpose? Not performance, code functionality?
—SA
Berry Harahap 13-Mar-13 3:22am    
Sir @Sergey Alexandrovich Kryukov (SAK)..
I've modified the code in my question (above). :)
Please give your suggestions and recommendations, because my code takes a long time to run.
:) :) :) Cheers
Yeah, I don't see how a Dictionary would help with that.

Plus, why bother with an array? Lists are waaaaay better.

I also recommend not using ReadAllLines if the files may be very large.


If you want to use ReadAllLines, have you considered making a List of the arrays and then combining them?

One thing to consider is that if you have only the one file then iterating it into a List then back out to an array is not good technique.
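
As a rough sketch of the ReadAllLines point: File.ReadLines (available from .NET 4.0) streams a file line by line instead of building the whole string[] first, and the lines can go straight into the final collection with no intermediate List or array. The class and method names here are hypothetical, and it returns a HashSet<string> on the assumption that only membership tests are needed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class WordFileLoader
{
    // Reads every file in 'directory' matching 'searchPattern' and returns the
    // trimmed, upper-cased, non-empty lines as a set.
    public static HashSet<string> Load(string directory, string searchPattern)
    {
        var words = new HashSet<string>(StringComparer.Ordinal);
        foreach (string file in Directory.GetFiles(directory, searchPattern))
        {
            // ReadLines yields one line at a time, so a large file is never
            // held in memory as a complete array.
            foreach (string line in File.ReadLines(file))
            {
                string word = line.Trim();
                if (word.Length > 0)
                    words.Add(word.ToUpperInvariant());
            }
        }
        return words;
    }
}
```

A call such as WordFileLoader.Load(@"dbase/", "nounList.txt") would take the place of getDBList plus the HashSet constructor in the question. Exceptions are left to the caller in this sketch.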
 
 
Comments
Berry Harahap 13-Mar-13 0:49am    
@PIEBALDconsult.. Nice suggestion. Please let me learn from an example. :) :) :)
Perhaps about HashSet, or what other options are there to improve on ReadAllLines? :) Cheers
PIEBALDconsult 13-Mar-13 1:04am    
What is it you are really trying to do? From your use of a list of nouns can I infer that you might want a Spell Check Tree?
Berry Harahap 13-Mar-13 1:27am    
Sir PIEBALDconsult, yes Sir, for spelling error. My assignment Sir. :) :) :)
Please give me recommendation, please........
PIEBALDconsult 13-Mar-13 9:52am    
Then the easy answer is to use a Hashset (or database).

When I was in college, one of the assignments (in C) was to write something like this:
http://en.wikipedia.org/wiki/Patricia_tree[^]
Berry Harahap 13-Mar-13 21:39pm    
Thanks very much Sir. :) :) :)
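
For the tree idea mentioned in the comments above, here is a minimal sketch of a plain, uncompressed prefix tree (trie). It is not a Patricia tree, which additionally merges chains of single-child nodes, and all names in it are invented for illustration. Compared with a HashSet, the main thing a trie adds is prefix queries, which could help when suggesting corrections.

```csharp
using System.Collections.Generic;

public sealed class WordTrie
{
    private sealed class Node
    {
        public readonly Dictionary<char, Node> Children = new Dictionary<char, Node>();
        public bool IsWord; // true when the path from the root to this node spells a stored word
    }

    private readonly Node root = new Node();

    // Stores the word in upper case, one node per character.
    public void Add(string word)
    {
        Node node = root;
        foreach (char c in word.ToUpperInvariant())
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
            {
                next = new Node();
                node.Children.Add(c, next);
            }
            node = next;
        }
        node.IsWord = true;
    }

    // True only when the whole word was added earlier (ignoring case).
    public bool Contains(string word)
    {
        Node node = Find(word);
        return node != null && node.IsWord;
    }

    // True when at least one stored word starts with the prefix.
    public bool HasPrefix(string prefix)
    {
        return Find(prefix) != null;
    }

    // Walks the tree along the characters of 'text'; returns null if the path breaks off.
    private Node Find(string text)
    {
        Node node = root;
        foreach (char c in text.ToUpperInvariant())
        {
            Node next;
            if (!node.Children.TryGetValue(c, out next))
                return null;
            node = next;
        }
        return node;
    }
}
```

Lookup cost depends on the length of the word rather than on the number of stored words.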

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


