Licence CPOL
First Posted 15 Jul 2008

Searching on Text Files

By julyhoping | 15 Jul 2008 | Article
This program searches for words in text files.

Introduction

This project is a program that searches text files. Assume that you have a set of text files stored somewhere on the hard disk. You want to find some of them, but you don't remember the file names. However, you know the content you're looking for, so you have some keywords to search with. This is similar to the search function in Windows.

Background

Some of the requirements are:

  1. Create the FileList: Create a text file named FileList that stores the paths of all the text files, one path per line. Each line has an ID that identifies its file path; IDs start at 0.
  2. Indexing: Scan all the text files and store each word in a Binary Search Tree so it can be found quickly. Every node in the tree contains a word, a list of ID numbers, and left and right pointers.
  3. Display: Output only a small portion of each text file that contains the keywords, together with its ID, so the user knows which file was matched.
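
Requirement 1 implies that a path's ID is simply its zero-based line number in the FileList. The following is a minimal sketch of that mapping in standard C++ rather than MFC. The helper names LoadFileList and PathForID are hypothetical and do not come from the article's source code.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not from the article): read FileList-style text,
// one path per line, so that a path's index in the vector is its ID.
std::vector<std::string> LoadFileList(std::istream& in)
{
    std::vector<std::string> paths;
    std::string line;
    while (std::getline(in, line))
        paths.push_back(line); // the first line gets ID 0, the next ID 1, ...
    return paths;
}

// Hypothetical helper: map an ID back to its file path.
const std::string& PathForID(const std::vector<std::string>& paths, int id)
{
    assert(id >= 0 && static_cast<std::size_t>(id) < paths.size());
    return paths[static_cast<std::size_t>(id)];
}
```

Loading the whole list once makes each lookup by ID a direct index, at the cost of keeping every path in memory.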

Using the Code

To create the FileList, I use the CStdioFile class to write the file and CFileFind to enumerate the text files:

// Create the file list; skip the scan if the file cannot be created
CStdioFile file;
if (file.Open("FileList.txt", CFile::modeCreate | CFile::modeReadWrite))
{
    CFileFind Finder; // Find file paths

    BOOL bWorking = Finder.FindFile(m_PATH + "\\*.txt"); // Only find text files
    while (bWorking)
    {
        bWorking = Finder.FindNextFile();
        if (!Finder.IsDirectory())
        {
            file.WriteString(Finder.GetFilePath()); // Write file path
            file.WriteString("\n");
        }
    }
    Finder.Close();
    file.Close();
}

For searching, I use a Binary Search Tree (BST) to store the words. First, I scan the directory that stores the text files to create the FileList. Then, I open every text file in the FileList and scan it for words. Every word is stored in the BST. A word can appear in many files and so have many IDs, so I use a linear linked list to store the ID numbers.
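
The indexing step described above can be sketched in standard C++ as follows. This is only an illustration under assumptions: the article does not show its own tree and ListID definitions, so the IdNode and WordNode types and the AddID and Insert functions here are hypothetical stand-ins.

```cpp
#include <cassert>
#include <string>

// Illustrative types; the article's own definitions are not shown.
struct IdNode
{
    int id;
    IdNode* next;
};

struct WordNode
{
    std::string word;
    IdNode* ids;      // linked list of IDs of the files containing the word
    WordNode* left;
    WordNode* right;
};

// Append id to the end of the node's ID list unless it is already there.
void AddID(WordNode* node, int id)
{
    assert(node != nullptr);
    IdNode** link = &node->ids;
    while (*link)
    {
        if ((*link)->id == id)
            return;                 // already recorded for this file
        link = &(*link)->next;
    }
    *link = new IdNode{id, nullptr};
}

// Insert word for file id and return the (possibly new) root of the tree.
WordNode* Insert(WordNode* root, const std::string& word, int id)
{
    WordNode** link = &root;
    while (*link)
    {
        int cmp = word.compare((*link)->word);
        if (cmp == 0)
        {
            AddID(*link, id);       // known word: just record the file
            return root;
        }
        link = (cmp < 0) ? &(*link)->left : &(*link)->right;
    }
    *link = new WordNode{word, nullptr, nullptr, nullptr};
    AddID(*link, id);
    return root;
}
```

Because files are scanned in ID order and new IDs are appended at the tail, each word's list stays in ascending ID order. Freeing the nodes is omitted to keep the sketch short.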

// Search word
ListID* CTinyGoogleDlg::SearchWord(string key)
{
    // The tree is empty until the text files have been indexed
    if (!head)
    {
        MessageBox("Something's wrong!");
        return NULL;
    }

    tree* current = head;
    while (current)
    {
        // Compare once per node; strcmp needs a C string, so use c_str()
        int cmp = strcmp(current->word, key.c_str());
        if (cmp == 0)
            return current->IDs; // Found: return list of IDs

        // A node word smaller than the key means the key is to the right
        current = (cmp < 0) ? current->right : current->left;
    }

    return NULL; // Word not found
}

Then, the program asks the user for keywords and searches the Binary Search Tree to see whether they exist. If a keyword is found, its IDs are used to open the matching text files, and the first lines of each file are printed in the result.

// Display results
int CTinyGoogleDlg::Display(ListID *curr)
{
    CStdioFile file;
    CString sText;
    
    m_RESULT = "";
    if (curr)
    {
        if (file.Open("FileList.TXT",CFile::modeRead))
        {
            int count = -1;
            while (curr)
            {
                // Find file path by reading FileList up to line number ID
                // (assumes the IDs in the list are in ascending order)
                CString path;
                BOOL bFound = TRUE;
                do
                {
                    bFound = file.ReadString(path);
                    count += 1;
                } while (bFound && count < curr->ID);
                
                CString DocID;
                DocID.Format("%d",curr->ID);
                m_RESULT = m_RESULT + "\r\nDocID:" + DocID + "\r\n";
                
                // Open file and display up to 16 lines of it
                CStdioFile read;
                if (bFound && read.Open(path,CFile::modeRead))
                {
                    // Stop early if the file has fewer than 16 lines
                    for (short nLineCount = 0; nLineCount < 16 && read.ReadString(sText); nLineCount++)
                        m_RESULT = m_RESULT + sText + "\r\n";
                    read.Close();
                }
                curr = curr->next;
            }
            file.Close(); // Close only a file that was opened
        }
        
        // Show the collected results in the edit control
        GetDlgItem(IDC_EDIT_RESULT)->SetWindowText(m_RESULT);
    }
    else
        MessageBox("NOT FOUND!");
    return 0;
}
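
The preview step can also be written against standard streams, which makes the end-of-file case easy to see. The following is a minimal sketch, assuming a hypothetical helper named PreviewLines that is not part of the article's code. It collects at most a given number of lines and stops early when the input runs out.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: collect up to maxLines lines from a stream,
// each terminated with "\r\n" for display in an edit control.
std::string PreviewLines(std::istream& in, int maxLines)
{
    assert(maxLines >= 0);
    std::string result;
    std::string line;
    // std::getline fails at end of input, which ends the loop early
    for (int n = 0; n < maxLines && std::getline(in, line); ++n)
    {
        result += line;
        result += "\r\n";
    }
    return result;
}
```

For a real file, a std::ifstream opened on the path found in the FileList could be passed in place of the string stream used for testing.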

Points of Interest

In the beginning, I had some trouble working out how to find the file paths. It isn't very difficult, but at my level it wasn't easy either. However, I found some approaches on the Internet, and CodeProject helped me very much. Now, I am sharing my little program with others.

History

The first version of this program was written as a Win32 console app. This version is an MFC app.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

julyhoping

Instructor / Trainer
MySoft
United States

Member

A student of Vietnam University of Science and Portland State University.

Last Updated 15 Jul 2008
Article Copyright 2008 by julyhoping