
Introduction
My father, who calls himself a "fact collector", has been collecting facts on a wide range of topics for over 20 years. He keeps every quote or article in plain text files, separating one entry from the next with a line containing only two asterisks, as follows:
Hello, I am fact number 1.
**
Hello, this is the second fact.
This second fact is second only to the first fact,
which is fact number 1.
**
This is fact number three.
These files can grow to 10 megs each, and he has over 200 megs of them. He wrote a quick and dirty VB app last year to access these files, but searching was extremely slow, so I rewrote the application in C# and offered it to him as an alternative, which he now uses.
The search system had a few requirements:
- It must be able to search multiple files.
- Files must be groupable by category. With 200 megs of data, you may want to search only a subset of the files rather than the entire collection.
- It must be possible to search multiple categories in a single search.
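As a rough illustration of these requirements, the grouping could be modelled as a map from category name to file paths, with the files to search being the union of the selected categories. This is only a sketch: the class name CategorySketch, the method ActiveFiles, and the sample paths are hypothetical and are not taken from the application.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical model: category name -> list of file paths.
static class CategorySketch {
    // Returns the distinct files that belong to any selected category,
    // in first-seen order, so a shared file is searched only once.
    public static List<string> ActiveFiles(
            Dictionary<string, List<string>> categories,
            IEnumerable<string> selected) {
        List<string> result = new List<string>();
        HashSet<string> seen =
            new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        foreach (string category in selected) {
            List<string> files;
            if (!categories.TryGetValue(category, out files))
                continue; // unknown category: nothing to search
            foreach (string file in files)
                if (seen.Add(file))
                    result.Add(file);
        }
        return result;
    }

    static void Main() {
        // Sample data, invented for the example.
        var categories = new Dictionary<string, List<string>> {
            { "History", new List<string> { @"C:\notes\rome.txt", @"C:\notes\shared.txt" } },
            { "Science", new List<string> { @"C:\notes\physics.txt", @"C:\notes\shared.txt" } }
        };
        foreach (string file in ActiveFiles(categories, new[] { "History", "Science" }))
            Console.WriteLine(file);
    }
}
```

Selecting both categories here lists shared.txt once, which keeps a file that sits in two categories from producing duplicate results.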
There are a few objects in this code I will discuss.
NoteFileInfo Object
The NoteFileInfo object stores the categories, file locations, and current category selections for the application. It is serializable, so it can be saved to the hard drive. When the application starts, it looks for default.nfb, and when it closes, it prompts to save to default.nfb. Categories and files can be added or edited through the Edit FileBase menu item; the window is fairly self-explanatory once opened. The object uses the Singleton pattern. The following is the serialization code for the NoteFileInfo object:
#region serialization
// Requires: using System.IO;
//           using System.Runtime.Serialization.Formatters.Binary;
// Note: BinaryFormatter is obsolete in current .NET releases and should
// only be used to read files you trust.
public static void Load(string filename) {
    if (File.Exists(filename))
        using (Stream st = new FileStream(filename, FileMode.Open))
            _noteFileInfo =
                (NoteFileInfo)new BinaryFormatter().Deserialize(st);
}
public static void Save(string filename) {
    // Mark the object clean before writing it out.
    _noteFileInfo._dirty = false;
    // FileMode.Create truncates an existing file, so no separate
    // delete step is needed.
    using (Stream st = new FileStream(filename, FileMode.Create))
        new BinaryFormatter().Serialize(st, _noteFileInfo);
}
public static void Close() {
    // Replace the singleton with a fresh, empty instance.
    _noteFileInfo = new NoteFileInfo();
}
#endregion
NoteFile Object
The NoteFile object is a collection of the strings within a particular file. For example, the sample file shown above would produce three strings in the NoteFile object. The code is simple: it does nothing more than read through the file as a stream and add each item to the list.
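The NoteFile source is not listed here, so the following is only a minimal sketch of how such a reader might look. The names NoteFileSketch, ReadNotes, and AddNote are hypothetical, and trimming whitespace around the "**" delimiter and skipping empty entries are assumptions rather than documented behavior.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class NoteFileSketch {
    // Splits the text from 'reader' into notes, treating a line that
    // contains only "**" as the boundary between two notes.
    public static List<string> ReadNotes(TextReader reader) {
        List<string> notes = new List<string>();
        StringBuilder current = new StringBuilder();
        string line;
        while ((line = reader.ReadLine()) != null) {
            if (line.Trim() == "**") {
                AddNote(notes, current);
            } else {
                if (current.Length > 0) current.Append(Environment.NewLine);
                current.Append(line);
            }
        }
        AddNote(notes, current); // the last note has no trailing delimiter
        return notes;
    }

    // Stores the accumulated note (if any) and resets the buffer.
    private static void AddNote(List<string> notes, StringBuilder current) {
        if (current.Length > 0) notes.Add(current.ToString());
        current.Length = 0;
    }

    static void Main() {
        string sample =
            "Hello, I am fact number 1.\n**\n" +
            "Hello, this is the second fact.\n" +
            "This second fact is second only to the first fact,\n" +
            "which is fact number 1.\n**\n" +
            "This is fact number three.";
        List<string> notes = ReadNotes(new StringReader(sample));
        Console.WriteLine(notes.Count); // prints 3
    }
}
```

For a real file, a StreamReader opened on the file path could be passed in place of the StringReader, so the file is read line by line rather than loaded whole.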
SearchResults Object
The SearchResults object runs its search on its own thread, so it works in the background and does not tie up the UI. It raises events to keep the UI updated on its current status. Notice the NoteFileInfo.getInstance() call to the singleton NoteFileInfo class. The design lets the UI drive the class through the Run() and Stop() methods while the class itself manages the worker thread. This way, the UI can call Stop() at any time, and the class takes care of its own abort.
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

namespace Notes2007 {
    class SearchResults : List<string> {
        SearchTerms searchTerms;
        Thread thread;

        public delegate void ResultAdded(string result);
        public event ResultAdded NewResult;
        public delegate void NowSearching(string filename);
        public event NowSearching Searching;
        public delegate void Done();
        public event Done Finished;

        public SearchResults(SearchTerms searchTerms) {
            this.searchTerms = searchTerms;
        }

        // Starts the search on a background thread.
        public void Run() {
            thread = new Thread(new ThreadStart(DoSearch));
            thread.IsBackground = true;
            thread.Start();
        }

        // Runs on the background thread; the events below are raised
        // on that thread too, and only if something is subscribed.
        private void DoSearch() {
            foreach (string filename in
                     NoteFileInfo.getInstance().ActiveFiles) {
                GC.Collect();
                GC.WaitForPendingFinalizers();
                string shortName = Path.GetFileName(filename);
                if (Searching != null) Searching(shortName);
                NoteFile notefile = new NoteFile(filename);
                foreach (string note in notefile) {
                    // A note matches only if it contains every term.
                    bool add = true;
                    foreach (string searchterm in searchTerms) {
                        if (note.IndexOf(searchterm,
                                StringComparison.OrdinalIgnoreCase) < 0) {
                            add = false;
                            break;
                        }
                    }
                    if (add) {
                        this.Add(note);
                        if (NewResult != null)
                            NewResult(shortName + "|" + note);
                    }
                }
            }
            if (Finished != null) Finished();
        }

        // Thread.Abort works on .NET Framework only; newer runtimes
        // throw PlatformNotSupportedException.
        public void Stop() {
            if (thread != null && thread.IsAlive)
                thread.Abort();
            if (Finished != null) Finished();
        }
    }
}
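Thread.Abort is only available on .NET Framework; on .NET Core and later it throws PlatformNotSupportedException. One alternative is a cooperative stop, where Stop() sets a flag that the worker checks between items. The sketch below shows just that pattern and is not the application's code: StoppableSearchSketch, Visited, and Wait are hypothetical names, and adding to a list stands in for searching a file.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class StoppableSearchSketch {
    private volatile bool stopRequested; // set by Stop(), read by the worker
    private Thread thread;
    private readonly IList<string> items;

    // Items the worker got through before finishing or being stopped.
    public readonly List<string> Visited = new List<string>();
    public event Action Finished;

    public StoppableSearchSketch(IList<string> items) {
        this.items = items;
    }

    public void Run() {
        stopRequested = false;
        thread = new Thread(DoWork);
        thread.IsBackground = true;
        thread.Start();
    }

    private void DoWork() {
        foreach (string item in items) {
            if (stopRequested) break; // leave the loop at a safe point
            Visited.Add(item);        // stand-in for searching one file
        }
        Action handler = Finished;
        if (handler != null) handler(); // raised once, on the worker thread
    }

    // Only requests the stop; it does not block the caller.
    public void Stop() {
        stopRequested = true;
    }

    // Blocks until the worker has exited (used by the demo below).
    public void Wait() {
        if (thread != null) thread.Join();
    }

    static void Main() {
        var search = new StoppableSearchSketch(
            new[] { "a.txt", "b.txt", "c.txt" });
        search.Finished += delegate { Console.WriteLine("finished"); };
        search.Run();
        search.Stop(); // may arrive before, during, or after the loop
        search.Wait();
    }
}
```

Because the handlers run on the worker thread, a Windows Forms UI would still need to marshal them back to the UI thread (for example with Control.BeginInvoke) before touching any controls. Stop() deliberately does not wait for the worker, since blocking the UI thread while a handler tries to call back into it could deadlock.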
SearchTerms Object
This is nothing more than a word parser: it splits the search string into words and adds them to a list of strings. The object listed below also excludes all words shorter than three characters, as well as the two common words "the" and "and". A phrase can be used as a single search term by enclosing it in quotes.
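using System;
using System.Collections.Generic;

namespace Notes2007 {
    class SearchTerms : List<string> {
        // Common words that are never used as search terms.
        static readonly List<string> ignoredterms =
            new List<string>(new string[] { "the", "and" });

        public SearchTerms(string searchstring) {
            bool inquote = false;
            string searchterm = string.Empty;
            foreach (char c in searchstring) {
                if (c == '"') {
                    // Quotes toggle phrase mode and are not part
                    // of the term itself.
                    inquote = !inquote;
                    continue;
                }
                if (c == ' ' && !inquote) {
                    AddTerm(searchterm);
                    searchterm = string.Empty;
                    continue;
                }
                searchterm += c;
            }
            // The last term has no trailing space, so add it here,
            // applying the same filters as every other term.
            AddTerm(searchterm);
        }

        // Adds a term unless it is a duplicate, shorter than three
        // characters, or one of the ignored words.
        private void AddTerm(string searchterm) {
            if (!this.Contains(searchterm) &&
                (searchterm.Length > 2) &&
                (!ignoredterms.Contains(searchterm.ToLowerInvariant())))
                this.Add(searchterm);
        }
    }
}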
The general idea of the application is that the user simply enters a search string, and then the app will find all the articles or quotes that contain all the search terms within the search string.
- The user enters a search term, and presses Enter or clicks Go.
- The application creates a SearchTerms object and feeds it into the SearchResults object, which then creates NoteFile objects for the active files listed in the NoteFileInfo object and compares each string in each NoteFile object to the terms in the SearchTerms object. If a string contains all the search terms, the article is added to the results list.
- The UI updates with a snippet (from the Snippet object) showing the text file the article came from and a short excerpt containing the search term that is currently highlighted in the right-hand search terms pane. If another search term is selected, the snippets update with new text containing that term.
- When a search result is selected, the complete article text appears in the bottom pane with all the applicable search terms highlighted in various colors, and the scrollbar attempts to move to the first occurrence of the search term in the article, for fast browsing.
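The matching and snippet steps above can be sketched as follows. The Snippet object's source is not shown in this article, so the Snippet method here is a hypothetical stand-in that assumes a snippet is a fixed number of characters on either side of the first occurrence of a term. MatchSketch and ContainsAll are likewise illustrative names.

```csharp
using System;
using System.Collections.Generic;

static class MatchSketch {
    // True when 'note' contains every term, ignoring case.
    public static bool ContainsAll(string note, IEnumerable<string> terms) {
        foreach (string term in terms)
            if (note.IndexOf(term, StringComparison.OrdinalIgnoreCase) < 0)
                return false;
        return true;
    }

    // Returns up to 'radius' characters on each side of the first
    // occurrence of 'term', or an empty string if the term is absent.
    public static string Snippet(string note, string term, int radius) {
        int index = note.IndexOf(term, StringComparison.OrdinalIgnoreCase);
        if (index < 0) return string.Empty;
        int start = Math.Max(0, index - radius);
        int end = Math.Min(note.Length, index + term.Length + radius);
        return note.Substring(start, end - start);
    }

    static void Main() {
        string note = "This second fact is second only to the first fact.";
        Console.WriteLine(ContainsAll(note, new[] { "second", "FIRST" })); // True
        Console.WriteLine(ContainsAll(note, new[] { "second", "third" })); // False
        Console.WriteLine(Snippet(note, "only", 7));
    }
}
```

Clamping the start and end positions keeps the snippet inside the note when the term sits near the beginning or end of the text.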
I've started using this myself as a great way to store code I've accumulated from CodeProject and other locations for easy searching later on!