
DiscoUtils: Find All Files (non-recursive, easy-to-use file finder)

25 Nov 2005
A re-usable file finder algorithm using hashtables.

FindAllFiles - DiscoUtils client

Note: The demo project includes a .msi which will install all the source files and the demo .exe to your \program files\FindAllFiles\ directory.

The source download is smaller and only contains the source files for the demo and the filefinder assembly.

Introduction

This non-recursive file finder uses a Hashtable to build a list of the files (filenames with paths) that match your search criteria.

Background

I needed a good way to search a path and all of its subdirectories for all files which contained specific characters / strings. Later, while writing a utility to examine all running processes, I found that I also wanted to search all of my logical drives for the location of executable files so I could gather information about those processes.

DiscoUtils

I wrapped the file find utility (Discover Files) in its own assembly named DiscoUtils (discovery utilities).

It's a marketing thing... :)

Using the code

The overloaded public method

public static void CreateFileList(string p_TargetDir, string p_FilePattern, 
 ref Hashtable p_htDirs, ref Hashtable p_htFiles, bool p_SearchSubdirs)

public static void CreateFileList(string p_TargetDir, string [] p_FilePattern, 
 ref Hashtable p_htDirs, ref Hashtable p_htFiles, bool p_SearchSubdirs)

The only difference between the methods is the second parameter (string in the first method, string array in the second).

The overloaded private method

private static void IterateDirectories(string p_TargetDir, string p_FilePattern, 
 ref Hashtable p_htDirs, ref Hashtable p_htFiles)

private static void IterateDirectories(string p_TargetDir, string [] p_FilePattern, 
 ref Hashtable p_htDirs, ref Hashtable p_htFiles)

Same difference for the private method (string in the first method, string array in the second)*.

* I've always wanted to use the oxymoron, 'same difference', and this is the perfect place for that.

How to use

First, I'll list the steps for using this assembly/library in your .NET project, then I'll highlight a few of the interesting points in the code.

Quick Steps

Basically, one simple method and three steps:

  1. Set up the parameters to pass in (notice that the hashtables are ref parameters).
  2. Call the CreateFileList method.
  3. Iterate through the file list Hashtable you passed in (p_htFiles), which is now filled with the files (and their paths) that match your p_FilePattern.

Detailed Steps

  1. Add a reference to the DiscoUtils library in your project.
  2. Create two Hashtable objects for use in calling the CreateFileList method:
    • one Hashtable which will represent a list of all the directories found.
    • another Hashtable which will represent a list of all the files found.
  3. Set the file pattern that you want to search for, or a list of files or file patterns you want to search for.
    • If you want to search for multiple types of files, you can add all the patterns you like to an array of strings and then call the overloaded CreateFileList method.
      • Pattern for the main (string) method -- finds all text files:
        string SearchPattern = "*.txt";
      • Patterns for the overloaded (string array) method -- finds all doc files and any file named MyFiles.txt or trash.xml:
        string [] arySearchPattern = {"MyFiles.txt", "*.doc", "trash.xml"};
  4. Set the path (p_TargetDir) that you want to search (see Known Issues below).
  5. Set the bool variable to decide whether or not you want to search all subdirectories too -- watch out! If your path is the root this will search your entire drive (see Known Issues below).
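
The steps above can be sketched roughly as follows. Only the CreateFileList signature comes from this article. Because the DiscoUtils assembly is not available here, the sketch uses a hypothetical stand-in class (DiscoUtilsStandIn) so that it compiles on its own. The body of the stand-in, the helper name FindFiles, and the assumption that each file's full path is the Hashtable key are all illustrative guesses, not the library's actual behavior.

```csharp
using System;
using System.Collections;
using System.IO;

// Hypothetical stand-in for the DiscoUtils assembly. Only the method
// signature is taken from the article; the body is an assumption.
public static class DiscoUtilsStandIn
{
    public static void CreateFileList(string p_TargetDir, string[] p_FilePattern,
        ref Hashtable p_htDirs, ref Hashtable p_htFiles, bool p_SearchSubdirs)
    {
        // An invalid path simply yields no files.
        if (!Directory.Exists(p_TargetDir)) return;

        SearchOption option = p_SearchSubdirs
            ? SearchOption.AllDirectories
            : SearchOption.TopDirectoryOnly;

        foreach (string pattern in p_FilePattern)
            foreach (string file in Directory.GetFiles(p_TargetDir, pattern, option))
                p_htFiles[file] = Path.GetFileName(file); // assumed layout: key = full path

        // p_htDirs is left untouched by this stand-in.
    }
}

public static class QuickStepsSketch
{
    // Hypothetical helper that walks through the three quick steps.
    public static ArrayList FindFiles(string targetDir, string[] patterns, bool searchSubdirs)
    {
        // Step 1: set up the parameters (both hashtables are passed by ref).
        Hashtable htDirs = new Hashtable();
        Hashtable htFiles = new Hashtable();

        // Step 2: call CreateFileList.
        DiscoUtilsStandIn.CreateFileList(targetDir, patterns,
            ref htDirs, ref htFiles, searchSubdirs);

        // Step 3: iterate through the filled file hashtable.
        ArrayList found = new ArrayList();
        foreach (DictionaryEntry entry in htFiles)
            found.Add((string)entry.Key);
        found.Sort();
        return found;
    }
}
```

In a real project you would drop the stand-in, reference the DiscoUtils library instead, and check how it actually stores keys and values before relying on entry.Key.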

Using the FindAllFiles sample program

The FindAllFiles sample program (shown in the image at the top of this article) is a working example of how to use the DiscoUtils.CreateFileList() method.

To use the program, follow these steps:

  1. Compile the FindAllFiles source code.
  2. Set the File Pattern edit box to the types of files you want to search for (default value: *.txt).
  3. Set the Search Path edit box to a valid path (invalid paths will return no files and may look as if the program is failing). (Default value: c:\.)
  4. Check the Search Subdirs checkbox if you want to search subdirectories (default value: unchecked, so subdirectories are not searched).
  5. Click the Start Search button.

Points of Interest

Eschew Recursion

1991 Microsoft QuickC, Memory Models and Stack Overflow

Back in 1991, when I was first learning to program using MS QuickC, I wrote a program which implemented a recursive method. I can't remember what the program was supposed to do, but I do remember that I got an "out of stack memory error". That was back in the days of memory models (where you had to decide how much memory you would allocate to your stack). I didn't understand that every time my method called itself, a fresh copy of all of its variables had to be placed in stack memory.

I was traumatized by the whole incident.

Now I understand the problem, but I'm still annoyed, because you can still blow the stack if your method recurses deeply enough.

In almost all of the samples you find on traversing a tree structure, you will find recursion used.

Here is an example:

Programming C#, 3rd Edition
By Jesse Liberty
Publisher: O'Reilly
chapter / section 13.2.2.2. Recursing through the subdirectories

I could provide a plethora of examples if I weren't so lazy.

The recursion alternative

When recursion is used to search a directory (and its subdirectories), basically what happens is that, whenever the method finds a subdirectory, the method calls itself and starts traversing down the subdirectory's path. Each nested call pushes its parameters and local variables onto the stack, so the deeper the directory tree, the more stack it uses.
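
For comparison, here is a minimal sketch of the recursive approach. It is not taken from the book cited above. The class and method names are made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

public static class RecursiveSketch
{
    // Adds every file under dir that matches pattern to files.
    public static void Collect(string dir, string pattern, List<string> files)
    {
        files.AddRange(Directory.GetFiles(dir, pattern));

        foreach (string sub in Directory.GetDirectories(dir))
        {
            // The method calls itself for each subdirectory, so the call
            // depth grows with the nesting depth of the directory tree.
            Collect(sub, pattern, files);
        }
    }
}
```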

My method puts all the files and their paths (as strings) into a hashtable which can be used later for searching for more directories and files.

Pseudo code: path searching algorithm

string [] FileList = GetAllFiles(currDirectory)
// (currDirectory is the initial path passed to CreateFileList)

string [] DirList = GetAllDirectories(currDirectory)
if (searchAllSubdirectories == true)
{
    while (DirList.Count > 0)
    // keep calling this until
    // all directories have been checked
    {
        IterateDirectories(ref DirList, ref FileList);
    }
}

// at this point you have a list (FileList)
// of all the files that match your pattern search
// note that since I pop the directories
// off of the DirList (see the 3rd line in pseudo-code 
// IterateDirectories() ), it will be empty at this point

IterateDirectories(ref DirList, ref FileList)
{
    // DirList[0] is the next directory waiting to be searched
    FileList += GetAllFiles(DirList[0])
    DirList += GetAllDirectories(DirList[0])
    // remove first item (the directory just searched) from DirList
    RemoveDirItem( DirList[0]);
}
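
The pseudo code above could be turned into C# along these lines. This is a sketch, not the DiscoUtils source: it swaps the directory Hashtable for a Queue to hold the directories still waiting to be searched, it returns the file Hashtable instead of filling a ref parameter, and storing the full path as both key and value is an assumption.

```csharp
using System.Collections;
using System.IO;

public static class NonRecursiveSketch
{
    public static Hashtable FindFiles(string targetDir, string pattern, bool searchSubdirs)
    {
        Hashtable files = new Hashtable();
        Queue pending = new Queue(); // plays the role of DirList

        // An invalid path simply yields no files.
        if (!Directory.Exists(targetDir)) return files;
        pending.Enqueue(targetDir);

        // Keep going until every directory has been checked.
        while (pending.Count > 0)
        {
            // Take the first directory off the list.
            string current = (string)pending.Dequeue();

            foreach (string file in Directory.GetFiles(current, pattern))
                files[file] = file; // assumed layout: key and value are the full path

            if (searchSubdirs)
            {
                // Newly found subdirectories go to the back of the list.
                foreach (string sub in Directory.GetDirectories(current))
                    pending.Enqueue(sub);
            }
        }
        return files;
    }
}
```

The loop replaces the call stack with a list on the heap, so a deep directory tree makes the list longer rather than the stack deeper.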

Known Issues

  1. p_TargetDir: if you set the target directory to an invalid path, you'll simply get no files returned.
    • path samples
      • valid -- C:\
      • valid -- C:
      • valid -- C:\temp (if you have a temp dir on your C drive)
      • valid -- C:\temp\ (if you have a temp dir on your C drive)
      • invalid -- C (just the drive name won't work, of course)
  2. p_SearchSubdirs: setting this variable to true when p_TargetDir is a root-level directory can take quite a long time to complete. As of this version, the FindAllFiles example has no threading (coming soon) or other solution for this, so the interface sits there without updating until the search ends.

Coming Soon

Search Status

What we need now is a good way to inform the user whether CreateFileList is still doing work, how much work it has already completed, and how much more work it has to do (so users can understand how long the process is going to take).

My next project is going to use CreateFileList, but will provide a way (via delegates, events) to update the UI so users can see something happening as CreateFileList meanders its way through the directory hierarchy.
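
One possible shape for that, purely as a sketch: the search takes a delegate and calls it after each directory is searched. Nothing here exists in DiscoUtils yet. The class name, the FindFiles method, and the onDirectoryScanned parameter are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ProgressSketch
{
    // onDirectoryScanned is a hypothetical callback. It receives the
    // directory just searched and the running total of matching files.
    public static List<string> FindFiles(string targetDir, string pattern,
        Action<string, int> onDirectoryScanned)
    {
        List<string> files = new List<string>();
        Queue<string> pending = new Queue<string>();
        if (Directory.Exists(targetDir)) pending.Enqueue(targetDir);

        while (pending.Count > 0)
        {
            string current = pending.Dequeue();
            files.AddRange(Directory.GetFiles(current, pattern));

            foreach (string sub in Directory.GetDirectories(current))
                pending.Enqueue(sub);

            // Report progress once per directory, if anyone is listening.
            if (onDirectoryScanned != null)
                onDirectoryScanned(current, files.Count);
        }
        return files;
    }
}
```

This only reports the work already done. The total number of directories is not known until the search finishes, so "how much more work" would need a separate estimate. The UI would also only repaint if the search ran on a background thread.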

Process Info Finder

Ever wondered what all those processes are which are running on your PC? Now you can find out where those processes are loaded from (directory path) and who owns them (examining file info for each .exe).

History

  • 11/18/2005 - Posted first version (DLL version 1.0.2148.20332).

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

daylightdj
Software Developer (Senior)
United States

Last Updated 25 Nov 2005
Article Copyright 2005 by daylightdj