Posted 13 Aug 2009

A Faster Directory Enumerator

wilsone8, 27 Aug 2009
Describes how to create a significantly faster enumerator for the attributes of all the files in a directory.

Introduction

The .NET Framework's Directory class includes methods for querying the list of files in a directory. For each file, you can then query its attributes, such as size and creation date. When the files live on a remote PC, however, this is very inefficient: a potentially expensive network round-trip is needed to retrieve each file's attributes. This article describes a much more efficient implementation. In the tests below, it is roughly 8x faster on a local drive and several hundred times faster over a network share.

Background

Let's assume you are writing an application that needs to find the most recently modified file in a directory. To implement this, you might have a function similar to the following:

DateTime GetLastFileModifiedSlow(string dir)
{
    DateTime retval = DateTime.MinValue;
    
    string [] files = Directory.GetFiles(dir);
    for (int i=0; i<files.Length; i++)
    {
        DateTime lastWriteTime = File.GetLastWriteTime(files[i]);
        if (lastWriteTime > retval)
        {
            retval = lastWriteTime;
        }
    }
    
    return retval;
}

That function certainly works, but it suffers from some very poor performance characteristics:

  1. GetFiles must allocate a potentially very large array.
  2. GetFiles must wait for the entire directory's entries to be returned before returning.
  3. For each file, a potentially expensive query is sent to the file system. No attempt is made to perform any sort of batch query.

You might think that switching to DirectoryInfo.GetFiles would improve item #3:

DateTime GetLastFileModifiedSlow2(string dir)
{
    DateTime retval = DateTime.MinValue;
    
    DirectoryInfo dirInfo = new DirectoryInfo(dir);

    FileInfo[] files = dirInfo.GetFiles();
    for (int i=0; i<files.Length; i++)
    {
        if (files[i].LastWriteTime > retval)
        {
            retval = files[i].LastWriteTime;
        }
    }
    
    return retval;
}

This doesn't change anything, however. In the .NET Framework versions available when this article was written (before 4.0), the objects returned by GetFiles() are not initialized with any data, and each one queries the file system the first time any of its properties is accessed.

Making it Faster

The attached test application includes the FastDirectoryEnumerator class in FastDirectoryEnumerator.cs. Using its GetFiles method, we can write the equivalent of the slow methods above:

DateTime GetLastFileModifiedFast(string dir)
{
    DateTime retval = DateTime.MinValue;
    
    FileData [] files = FastDirectoryEnumerator.GetFiles(dir);
    for (int i=0; i<files.Length; i++)
    {
        if (files[i].LastWriteTime > retval)
        {
            retval = files[i].LastWriteTime;
        }
    }
    
    return retval;
}

The FileData object provides all the standard attributes for a file that the FileInfo class provides.

Making it Even Faster

Use one of the overloads of the EnumerateFiles method to enumerate over the files in a directory. Each item the enumeration returns is a FileData object.

Below is the same method written with FastDirectoryEnumerator.EnumerateFiles:

DateTime GetLastFileModifiedFast(string dir)
{
    DateTime retval = DateTime.MinValue;

    foreach (FileData f in FastDirectoryEnumerator.EnumerateFiles(dir))
    {
        if (f.LastWriteTime > retval)
        {
            retval = f.LastWriteTime;
        }
    }

    return retval;
}
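
One practical consequence of enumerating instead of building an array is that a caller can stop early. The sketch below is not taken from FastDirectoryEnumerator. It uses a hypothetical EnumerateTimes iterator as a stand-in for a directory listing, and a Produced counter that exists only to show how many entries were actually generated. It assumes entries are produced lazily, the way a C# iterator produces them.

```csharp
using System;
using System.Collections.Generic;

static class LazyEnumerationSketch
{
    // Counts how many entries the iterator has produced so far (illustration only).
    public static int Produced;

    // Hypothetical stand-in for a directory listing: yields one timestamp per "file",
    // one day apart, starting on 13 Aug 2009.
    public static IEnumerable<DateTime> EnumerateTimes(int count)
    {
        DateTime start = new DateTime(2009, 8, 13);
        for (int i = 0; i < count; i++)
        {
            Produced++;
            yield return start.AddDays(i);
        }
    }

    // Returns true as soon as one timestamp is newer than the cutoff.
    // Because the source is lazy, the remaining entries are never produced.
    public static bool AnyNewerThan(IEnumerable<DateTime> times, DateTime cutoff)
    {
        foreach (DateTime t in times)
        {
            if (t > cutoff)
            {
                return true;
            }
        }
        return false;
    }
}
```

Calling LazyEnumerationSketch.AnyNewerThan(LazyEnumerationSketch.EnumerateTimes(3000), cutoff) stops at the first match, whereas an array-returning method would have to produce all 3000 entries before the loop could even start.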

Performance

The test application allows you to create a large number of files in a directory, then time how long it takes to enumerate them using each of the four methods. I used a directory with 3000 files and ran each test three times to get a representative time for each method.
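
The test application's own timing code isn't reproduced here. A minimal harness along the same lines might look like the following; TimingSketch and TimeRuns are names made up for illustration, and the only framework type involved is System.Diagnostics.Stopwatch.

```csharp
using System;
using System.Diagnostics;

static class TimingSketch
{
    // Runs the action the given number of times and returns the elapsed
    // wall-clock time of each run, in milliseconds.
    public static long[] TimeRuns(Action action, int runs)
    {
        long[] elapsed = new long[runs];
        for (int i = 0; i < runs; i++)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            elapsed[i] = sw.ElapsedMilliseconds;
        }
        return elapsed;
    }
}
```

For example, TimingSketch.TimeRuns(() => GetLastFileModifiedSlow(dir), 3) would give three timings for the first method. Keep in mind that the first run against a directory may be slower than later ones because of file system caching.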

Using a path on my local hard drive resulted in the following times:

  • Directory.GetFiles method: ~225ms
  • DirectoryInfo.GetFiles method: ~230ms
  • FastDirectoryEnumerator.GetFiles method: ~33ms
  • FastDirectoryEnumerator.EnumerateFiles method: ~27ms

That is roughly an 8.5x increase in performance between the fastest and the slowest methods. The difference is even more pronounced when the files are on a UNC path. For this test, I used the same directory as in the previous test; the only difference is that I referenced it by a UNC share name instead of the local path. At the time of the test, I was connected to my home wireless network.

  • Directory.GetFiles method: ~43,860ms
  • DirectoryInfo.GetFiles method: ~44,000ms
  • FastDirectoryEnumerator.GetFiles method: ~55ms
  • FastDirectoryEnumerator.EnumerateFiles method: ~53ms

That is roughly an 830x increase in performance, more than two orders of magnitude! And the gap only widens as the latency to the PC containing the files increases.
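
For reference, the speedup figures are simply the slowest time divided by the fastest time in each list. The small helper below (SpeedupSketch is an illustrative name, not part of the download) reproduces the arithmetic from the approximate timings above.

```csharp
using System;

static class SpeedupSketch
{
    // Ratio of the slow time to the fast time, both in milliseconds.
    public static double Speedup(double slowMs, double fastMs)
    {
        return slowMs / fastMs;
    }

    // Number of decimal orders of magnitude a ratio represents.
    public static double OrdersOfMagnitude(double ratio)
    {
        return Math.Log10(ratio);
    }
}
```

With the local numbers, Speedup(230, 27) is about 8.5. With the UNC numbers, Speedup(44000, 53) is about 830, and OrdersOfMagnitude of that ratio is about 2.9.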

Why is it Faster?

As mentioned above, Directory.GetFiles and DirectoryInfo.GetFiles have a number of disadvantages. The most significant is that they throw away information, so there is no efficient way to retrieve the attributes of many files at once.

Internally, Directory.GetFiles is implemented as a wrapper over the Win32 FindFirstFile/FindNextFile functions. These functions return the attributes of each file along with its name, but GetFiles() discards everything except the names. They also retrieve information about multiple files with a single network message.

The FastDirectoryEnumerator keeps this information and returns it in the FileData class. This substantially reduces the number of network round-trips needed to accomplish the same task.
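
To make that concrete, here is a minimal P/Invoke sketch of the idea. It is not the FastDirectoryEnumerator source from the download: it is Windows-only, has no security or parameter checks, ignores subdirectories, and the class and method names (FindFileSketch, GetLastFileModified) are made up for illustration. The point is only that the last write time is already present in the WIN32_FIND_DATA structure that FindFirstFile and FindNextFile fill in, so no extra query per file is needed.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;
using System.Runtime.InteropServices.ComTypes; // FILETIME

static class FindFileSketch
{
    // Managed mirror of the Win32 WIN32_FIND_DATA structure (Unicode version).
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA lpFindFileData);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool FindClose(IntPtr hFindFile);

    static readonly IntPtr InvalidHandle = new IntPtr(-1); // INVALID_HANDLE_VALUE

    public static DateTime GetLastFileModified(string dir)
    {
        DateTime retval = DateTime.MinValue;
        WIN32_FIND_DATA data;

        IntPtr handle = FindFirstFile(Path.Combine(dir, "*"), out data);
        if (handle == InvalidHandle)
        {
            return retval; // Nothing found, or the path could not be opened.
        }

        try
        {
            do
            {
                // Skip directories, including the "." and ".." entries.
                if ((data.dwFileAttributes & FileAttributes.Directory) != 0)
                {
                    continue;
                }

                // Combine the two 32-bit halves of the FILETIME into one 64-bit value.
                long fileTime = ((long)(uint)data.ftLastWriteTime.dwHighDateTime << 32)
                                | (uint)data.ftLastWriteTime.dwLowDateTime;
                DateTime lastWriteTime = DateTime.FromFileTime(fileTime);
                if (lastWriteTime > retval)
                {
                    retval = lastWriteTime;
                }
            } while (FindNextFile(handle, out data));
        }
        finally
        {
            FindClose(handle);
        }

        return retval;
    }
}
```

A production version would wrap the handle in a SafeHandle and check Marshal.GetLastWin32Error() when a call fails; both are left out here to keep the sketch short.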

History

  • 8-13-2009: Initial version.
  • 8-14-2009: Added security checks, parameter checking, and the GetFiles method.
  • 8-24-2009: Fixed the AllDirectories search using GetFiles. Removed note about .NET 4.0 including something similar.
  • 9-08-2009: Fixed the AllDirectories search when filter is not * or *.*.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

wilsone8
Software Developer (Senior)
United States
I've been a software engineer since 1999. I tend to focus on C# and .NET technologies when possible.


Comments and Discussions

 
Re: Nice article, but keep an eye on .net 4.0
KD7LRJ, 1-Oct-09 2:06
This code in .NET 4.0 beta is as fast as GetLastFileModifiedFast2, but not as fast as GetLastFileModifiedFast:

  DateTime GetLastFileModifiedFast3(string dir, string searchPattern, SearchOption searchOption)
  {
      DateTime retval = DateTime.MinValue;
 
      foreach (FileInfo f in new DirectoryInfo(dir).EnumerateFiles())
      {
          if (f.LastWriteTimeUtc > retval)
          {
              retval = f.LastWriteTimeUtc;
          }
      }
      return retval;
  }


Last Updated 27 Aug 2009
Article Copyright 2009 by wilsone8