A Faster Directory Enumerator

By wilsone8, 27 Aug 2009
 

Introduction

The .NET Framework's Directory class includes methods for querying the list of files in a directory. For each file, you can then query its attributes, such as size and creation date. However, when querying files on a remote PC, this can be very inefficient: a potentially expensive network round-trip is needed to retrieve each file's attributes. This article describes a much more efficient implementation that, in the tests below, was roughly 8x faster on a local drive and hundreds of times faster over a network share.

Background

Let's assume you are writing an application that needs to find the most recently modified file in a directory. To implement this, you might have a function similar to the following:

DateTime GetLastFileModifiedSlow(string dir)
{
    DateTime retval = DateTime.MinValue;
    
    string [] files = Directory.GetFiles(dir);
    for (int i=0; i<files.Length; i++)
    {
        DateTime lastWriteTime = File.GetLastWriteTime(files[i]);
        if (lastWriteTime > retval)
        {
            retval = lastWriteTime;
        }
    }
    
    return retval;
}

That function certainly works, but it suffers from some very poor performance characteristics:

  1. GetFiles must allocate a potentially very large array.
  2. GetFiles must wait for the entire directory's entries to be returned before returning.
  3. For each file, a potentially expensive query is sent to the file system. No attempt is made to perform any sort of batch query.

You might think that converting to DirectoryInfo.GetFiles, which returns FileInfo objects instead of file names, would improve item #3:

DateTime GetLastFileModifiedSlow2(string dir)
{
    DateTime retval = DateTime.MinValue;
    
    DirectoryInfo dirInfo = new DirectoryInfo(dir);

    FileInfo[] files = dirInfo.GetFiles();
    for (int i=0; i<files.Length; i++)
    {
        if (files[i].LastWriteTime > retval)
        {
            retval = files[i].LastWriteTime;
        }
    }
    
    return retval;
}

This doesn't change anything, however: the FileInfo objects returned by GetFiles() are not initialized with any data, and each one queries the file system the first time any of its properties is accessed.

Making it Faster

The attached test application includes the FastDirectoryEnumerator class in FastDirectoryEnumerator.cs. Using the GetFiles method, we can write the equivalent of our first slow method.

DateTime GetLastFileModifiedFast(string dir)
{
    DateTime retval = DateTime.MinValue;
    
    FileData [] files = FastDirectoryEnumerator.GetFiles(dir);
    for (int i=0; i<files.Length; i++)
    {
        if (files[i].LastWriteTime > retval)
        {
            retval = files[i].LastWriteTime;
        }
    }
    
    return retval;
}

The FileData object provides all the standard attributes for a file that the FileInfo class provides.

Making it Even Faster

Use one of the overloads of the EnumerateFiles method to enumerate over all the files in a directory without first building an array. Each item the enumeration yields is a FileData object.

Below is an example of the same method using FastDirectoryEnumerator.EnumerateFiles:

DateTime GetLastFileModifiedFast(string dir)
{
    DateTime retval = DateTime.MinValue;

    foreach (FileData f in FastDirectoryEnumerator.EnumerateFiles(dir))
    {
        if (f.LastWriteTime > retval)
        {
            retval = f.LastWriteTime;
        }
    }

    return retval;
}

Performance

The test application allows you to create a large number of files in a directory, then time how long each of the four methods takes to enumerate them. I used a directory with 3000 files and ran each test three times to give each method the best result possible.

Using a path on my local hard drive resulted in the following times:

  • Directory.GetFiles method: ~225ms
  • DirectoryInfo.GetFiles method: ~230ms
  • FastDirectoryEnumerator.GetFiles method: ~33ms
  • FastDirectoryEnumerator.EnumerateFiles method: ~27ms

That is roughly an 8.5x increase in performance between the fastest and the slowest methods. The difference is even more pronounced when the files are on a UNC path. For this test, I used the same directory as the previous test. The only difference is that I referenced the directory by a UNC share name instead of the local path. At the time of the test, I was connected to my home wireless network.

  • Directory.GetFiles method: ~43,860ms
  • DirectoryInfo.GetFiles method: ~44,000ms
  • FastDirectoryEnumerator.GetFiles method: ~55ms
  • FastDirectoryEnumerator.EnumerateFiles method: ~53ms

That is roughly an 830x increase in performance, more than two orders of magnitude! And the gap only widens as the latency to the PC containing the files increases.
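For readers who want to reproduce this kind of measurement without the attached test application, here is a minimal timing sketch. It assumes a best-of-N policy (the article does not say exactly how the three runs were combined), and the names EnumerationTimer, BestOfMilliseconds and QueryEveryFile are hypothetical helpers, not part of the download. It only times the slow baseline from the Background section.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class EnumerationTimer
{
    // Runs 'action' the given number of times and returns the fastest run, in milliseconds.
    // Assumption: "best of N" is how the runs are combined.
    public static long BestOfMilliseconds(int runs, Action action)
    {
        long best = long.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            action();
            watch.Stop();
            best = Math.Min(best, watch.ElapsedMilliseconds);
        }
        return best;
    }

    // The slow baseline: one file-system query per file.
    static void QueryEveryFile(string dir)
    {
        foreach (string file in Directory.GetFiles(dir))
        {
            File.GetLastWriteTime(file);
        }
    }

    static void Main(string[] args)
    {
        // Pass a local path or a UNC path as the first argument.
        string dir = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        long ms = BestOfMilliseconds(3, () => QueryEveryFile(dir));
        Console.WriteLine("Directory.GetFiles + File.GetLastWriteTime: ~{0}ms", ms);
    }
}
```

Repeated runs over the same directory can benefit from caching, so the first run is often the slowest; that is one reason to report how the runs were combined.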

Why is it Faster?

As mentioned above, Directory.GetFiles and DirectoryInfo.GetFiles have a number of disadvantages. The most significant is that they throw away information and do not efficiently allow you to retrieve information about multiple files at the same time.

Internally, Directory.GetFiles is implemented as a wrapper over the Win32 FindFirstFile/FindNextFile functions. For every file they enumerate, these functions return attribute information (size, timestamps, and so on), which GetFiles() throws away when it returns only the file names. They also retrieve information about multiple files with a single network message.

The FastDirectoryEnumerator keeps this information and returns it in the FileData class. This substantially reduces the number of network round-trips needed to accomplish the same task.
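The idea can be sketched directly against the Win32 API. The following is a minimal, Windows-only illustration and not the code in FastDirectoryEnumerator.cs: the class name FindFileSketch and its helpers are hypothetical, it uses a raw IntPtr handle rather than a SafeHandle, and it performs none of the security checks the real class does. It finds the latest write time using only the data that FindFirstFile/FindNextFile already return.

```csharp
using System;
using System.ComponentModel;
using System.IO;
using System.Runtime.InteropServices;
using FILETIME = System.Runtime.InteropServices.ComTypes.FILETIME;

static class FindFileSketch
{
    // Managed mirror of the Win32 WIN32_FIND_DATA structure.
    [StructLayout(LayoutKind.Sequential, CharSet = CharSet.Unicode)]
    struct WIN32_FIND_DATA
    {
        public FileAttributes dwFileAttributes;
        public FILETIME ftCreationTime;
        public FILETIME ftLastAccessTime;
        public FILETIME ftLastWriteTime;
        public uint nFileSizeHigh;
        public uint nFileSizeLow;
        public uint dwReserved0;
        public uint dwReserved1;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 260)]
        public string cFileName;
        [MarshalAs(UnmanagedType.ByValTStr, SizeConst = 14)]
        public string cAlternateFileName;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern IntPtr FindFirstFile(string lpFileName, out WIN32_FIND_DATA data);

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool FindNextFile(IntPtr hFindFile, out WIN32_FIND_DATA data);

    [DllImport("kernel32.dll", SetLastError = true)]
    [return: MarshalAs(UnmanagedType.Bool)]
    static extern bool FindClose(IntPtr hFindFile);

    static readonly IntPtr InvalidHandle = new IntPtr(-1);

    // Combines the two 32-bit halves of a FILETIME into a local DateTime.
    static DateTime ToDateTime(FILETIME ft)
    {
        long ticks = ((long)ft.dwHighDateTime << 32) | unchecked((uint)ft.dwLowDateTime);
        return DateTime.FromFileTime(ticks);
    }

    // Returns the most recent write time among the files directly inside 'dir'.
    public static DateTime GetLastFileModified(string dir)
    {
        DateTime latest = DateTime.MinValue;
        WIN32_FIND_DATA data;
        IntPtr handle = FindFirstFile(Path.Combine(dir, "*"), out data);
        if (handle == InvalidHandle)
        {
            throw new Win32Exception(Marshal.GetLastWin32Error());
        }

        try
        {
            do
            {
                // Skip sub-directories (including "." and ".."); the timestamp
                // is already in 'data', so no extra query per file is needed.
                bool isDirectory = (data.dwFileAttributes & FileAttributes.Directory) != 0;
                if (!isDirectory)
                {
                    DateTime written = ToDateTime(data.ftLastWriteTime);
                    if (written > latest)
                    {
                        latest = written;
                    }
                }
            } while (FindNextFile(handle, out data));
        }
        finally
        {
            FindClose(handle);
        }
        return latest;
    }
}
```

Calling FindFileSketch.GetLastFileModified(@"\\server\share\folder") (a placeholder path) touches each entry once; the loop never calls back into the file system for per-file attributes, which is where the slow versions spend their time.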

History

  • 8-13-2009: Initial version.
  • 8-14-2009: Added security checks, parameter checking, and the GetFiles method.
  • 8-24-2009: Fixed the AllDirectories search using GetFiles. Removed note about .NET 4.0 including something similar.
  • 9-08-2009: Fixed the AllDirectories search when filter is not * or *.*.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

wilsone8
Software Developer (Senior)
United States
Member
I've been a software engineer since 1999. I tend to focus on C# and .NET technologies when possible.


Comments and Discussions

 
QuestionBest way to sort FileData[] by CreationDatememberJohnJayAjero30 Apr '13 - 10:09 
Thank you very much for this very helpful article. I need to create an index file based on the results of the GetFiles method. Using Directory.GetFiles, I was able to sort the array as such:
 
var jpgFiles = Directory.GetFiles(destPath, "*.jpg").OrderBy(f => new FileInfo(f).CreationTime);
 
How can I sort the FileData similarly?
AnswerRe: Best way to sort FileData[] by CreationDatememberwilsone83 May '13 - 2:59 
var jpgFiles = FastDirectoryEnumerator.GetFiles(destPath, "*.jpg").OrderBy(f => f.CreationTime);
GeneralMy vote of 5memberMatt Watson (Stackify)15 Apr '13 - 17:24 
Works awesome!
QuestionFile EnumeratormemberOscargt6 Feb '13 - 16:01 
I am trying to get the folder (and subfolder) size instead of the modified date used in your example. I currently have this code (newbie, I know):
 
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
       Dim PDRFolder As New DirectoryInfo("C:\InputMedia")
       Dim filesInfo9() As FileInfo = PDRFolder.GetFiles("*.*", SearchOption.AllDirectories)
       Dim fileSizePDR As Long = 0
       Dim fileSize9 As Long = 0
 
       'start storage calculation for Playout PDR Folder

       For Each fileInfo9 As FileInfo In filesInfo9
           fileSize9 += fileInfo9.Length
       Next
 
       fileSizePDR = fileSize9
 
       lblBytes_PDR.Text = fileSizePDR.ToString
       lblSize_PDR.Text = FormatNumber(fileSizePDR / (1024 * 1024 * 1024), 2) + " GB" + "   (" + lblBytes_PDR.Text + " bytes)"
 
       lblFolder_PDR.Text = PDRFolder.ToString
 
       lblHours_PDR.Text = CStr(FormatNumber((fileSizePDR / (1024 * 1024 * 1024)) / 15, 2))
 
       'Convert value in hours and minutes for PDR Folder

       Dim value9 As Decimal = lblHours_PDR.Text
       Dim ts9 As TimeSpan = TimeSpan.FromHours(value9)
       Label5.Text = (String.Format("{0}h {1}m {2}.{3}s", ts9.Hours, ts9.Minutes, ts9.Seconds, ts9.Milliseconds))
 
       hor5 = fileSizePDR
 
       'End storage calculation for PDR Folder

       If Val(lblSize_PDR.Text) > 700 Then  'value in GB
           pbPDR.Visible = True          'show an alarm
           tbLog.Text &= System.DateTime.Now.ToString & vbCrLf
           tbLog.Text &= "L&#237There is too much storage in PDR" & vbCrLf
           tbLog.SelectionStart = tbLog.TextLength
           tbLog.ScrollToCaret()
       Else
           pbPDR.Visible = False
       End If
   End Sub
 
But it performs poorly on large folders. How can I improve it and make it faster? Your demo shows the difference in speed, but only for the modified date. What is the property for "file size" instead of the last write time?
QuestionDepth first vs. Breadth firstmemberachanlon27 Nov '12 - 5:33 
This is an amazingly useful class. That said, the current implementation uses a depth-first traversal through the sub-directories. This, in my opinion, is the incorrect method. Most would agree that when listing or searching through a directory structure, you want to see or find the nearest files first. Luckily, by simply switching the Stack<> classes to Queue<> classes (and the push/pop calls to enqueue/dequeue), you get a great breadth-first search. Thanks again.
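The Stack-versus-Queue swap described in this comment can be illustrated with a small sketch. It uses only the built-in Directory.GetDirectories rather than the article's class, and the TraversalOrder name and its methods are hypothetical; the point is just that the pending-work container alone decides the visiting order.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TraversalOrder
{
    // Depth-first: a Stack hands back the most recently discovered directory,
    // so the walk dives into one branch before returning to its siblings.
    public static List<string> DepthFirst(string root)
    {
        List<string> visited = new List<string>();
        Stack<string> pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            visited.Add(dir);
            foreach (string sub in Directory.GetDirectories(dir))
            {
                pending.Push(sub);
            }
        }
        return visited;
    }

    // Breadth-first: a Queue hands back the oldest discovered directory,
    // so every directory at one depth is visited before any deeper one.
    public static List<string> BreadthFirst(string root)
    {
        List<string> visited = new List<string>();
        Queue<string> pending = new Queue<string>();
        pending.Enqueue(root);
        while (pending.Count > 0)
        {
            string dir = pending.Dequeue();
            visited.Add(dir);
            foreach (string sub in Directory.GetDirectories(dir))
            {
                pending.Enqueue(sub);
            }
        }
        return visited;
    }
}
```

Neither order is faster in total work; breadth-first simply surfaces shallow results earlier, at the cost of holding a whole level of pending directories in memory at once.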
GeneralMy vote of 5memberdesenvolvedor821 Nov '12 - 3:27 
Good job man.
 
http://cavas.com.br
GeneralMy Vote of 5memberJuan R. Huertas10 Oct '12 - 1:32 
Great article. Already tested and it works great.
 
Thanks,
GeneralMy vote of 5memberpdoxtader17 Jul '12 - 4:21 
Nice. Thanks for sharing this!
GeneralMy vote of 5memberxyzabc1233215 Feb '12 - 15:31 
My vote of 5 !
QuestionCOM interop exceptionsmemberlwdaddio1 Feb '12 - 18:41 
Thanks for posting the code. I tried to use it in a Task in .NET 4.0 and got intermittent COM interop exceptions. I also can't use it for getting all subdir files since it throws an exception on access denied files.
QuestionPermissions [modified]memberMember 438704825 Oct '11 - 2:55 
Is there a way to make the application permission aware? By the way, thank you for posting this code.

modified 25 Oct '11 - 9:39.

GeneralThank you for posting this codememberMember 776647422 Mar '11 - 2:54 
I compiled it as a .dll library and used it in my .vb code for an app similar to Windows Explorer that I created to gather file system info. Access time to a network-mapped folder with around 1200 files went from 30-60 seconds down to 1 second! Amazing!
GeneralThanks for posting...memberAshi200230 Nov '10 - 0:54 
I had been looking for this for a very long time... Thank you so much for posting this article. It really helped me a lot. Thanks again. :)
GeneralRe: Thanks for posting...memberferrydebruin6 Apr '11 - 3:53 
I would also like to thank you for posting the article. Solved a big problem!
GeneralMy vote of 5memberAshi200230 Nov '10 - 0:53 
I had been looking for this for a very long time... Thank you so much for posting this article. It really helped me a lot. Thanks again.
GeneralDisposing resources [modified]memberMember 54073122 Oct '10 - 9:08 
The class deals with unmanaged resources but I could not find how they get freed. There is FileEnumerator.Dispose but it never gets called. What should be the pattern of using this class without leaking unmanaged resources? especially if something unexpected occurs (i.e. an exception during the enumeration process)?
 
If I try this:
IEnumerable<FileData> enumer = FastDirectoryEnumerator.EnumerateFiles(dir);
using (enumer)
{
foreach (FileData f in enumer)
{ ... }
}
 
I get compilation error as FileEnumerable does not implement IDisposable. FileEnumerator at least has Dispose but I can't call it directly as FileEnumerator is not available directly to the application (FileEnumerable.GetEnumerator is called internally within foreach).
 
Of course, I could get rid of foreach and implement it manually with MoveNext but it would seem a little weird way of doing things.
 
EDIT: Well, I see Dispose is actually called when I set a breakpoint there. It seems Dispose is called automatically for the IEnumerator within a foreach loop.

modified on Friday, October 22, 2010 3:32 PM
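The behavior noted in the EDIT can be checked in isolation. When the enumerator type implements IDisposable (as every IEnumerator<T> does), the C# compiler expands foreach into a try/finally that calls Dispose, even if the loop body throws. Below is a minimal sketch with hypothetical TrackingEnumerator and TrackingEnumerable types standing in for the article's FileEnumerator and FileEnumerable.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that records whether Dispose was called.
sealed class TrackingEnumerator : IEnumerator<int>
{
    private int _current;
    public bool Disposed { get; private set; }

    public int Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    // Yields 1, 2, 3 and then stops.
    public bool MoveNext() { return ++_current <= 3; }
    public void Reset() { _current = 0; }
    public void Dispose() { Disposed = true; }
}

// Hypothetical enumerable that remembers the last enumerator it handed out.
sealed class TrackingEnumerable : IEnumerable<int>
{
    public TrackingEnumerator Last { get; private set; }

    public IEnumerator<int> GetEnumerator()
    {
        Last = new TrackingEnumerator();
        return Last;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

static class ForeachDisposeDemo
{
    // Returns true if foreach disposed the enumerator although the body threw.
    public static bool DisposedAfterException()
    {
        TrackingEnumerable source = new TrackingEnumerable();
        try
        {
            foreach (int item in source)
            {
                throw new InvalidOperationException("simulated failure on item " + item);
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose: only the disposal state matters here.
        }
        return source.Last.Disposed;
    }
}
```

So the using block in the question is unnecessary for plain foreach loops; manual cleanup only becomes a concern when calling GetEnumerator() and MoveNext() by hand.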

GeneralAccess to the path - deniedmemberdanijel16 Oct '10 - 11:06 
Hi,
 
I am trying to get a list of files via "FastDirectoryEnumerator" class.
When I select one of my drives, for example drive D:\ (a hard drive),
it produces the error message "Access to the path D:\System Volume Information is denied."
 
is there a way to fix this?
 
thanks
GeneralRe: Access to the path - deniedmemberdamage3119 Mar '11 - 13:58 
Wrap your code in a try/catch. You will want to check out this link: http://msdn.microsoft.com/en-us/library/dd383571.aspx[^]
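A single try/catch around a recursive enumeration stops at the first protected folder, so one common pattern is to walk the tree one directory at a time and catch the exception per directory. The sketch below assumes that approach and uses only the built-in Directory methods, not FastDirectoryEnumerator; SafeWalker and GetFilesSkippingDenied are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class SafeWalker
{
    // Walks 'root' and its sub-directories, skipping any directory that
    // cannot be read instead of aborting the whole walk.
    public static List<string> GetFilesSkippingDenied(string root)
    {
        List<string> results = new List<string>();
        Stack<string> pending = new Stack<string>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            string dir = pending.Pop();
            try
            {
                results.AddRange(Directory.GetFiles(dir));
                foreach (string sub in Directory.GetDirectories(dir))
                {
                    pending.Push(sub);
                }
            }
            catch (UnauthorizedAccessException)
            {
                // For example "System Volume Information": skip it and continue.
            }
        }
        return results;
    }
}
```

The same per-directory structure should carry over to the article's class by enumerating each directory with the top-directory-only option, though that variation is untested here.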
GeneralNice Work!memberMember 38053784 Aug '10 - 14:08 
Thanks wilsone8. This will definitely help me.
GeneralUnhandled Stackoverflow Exception on Windows 7 64bitmemberInsomniac Geek4 May '10 - 11:47 
Hi.
 
I get a stackoverflow exception on this line :
retval = FindNextFile(m_hndFindFile, m_win_find_data)
When it iterates the c:\windows directory, the m_path when it crashes is "c:\windows\winsxs".
 
It seems to only happen in Debug mode, and not in Release.
 
Please advise.
 
Thanks,
/M

GeneralRe: Unhandled Stackoverflow Exception on Windows 7 64bitmembervdhaeyere28 Feb '11 - 4:52 
Hi,
 
Did you ever get to the bottom of this ? I'm having the same exception and would appreciate you sharing your fix if possible.
 
Kind regards,
 
Vincent
GeneralRe: Unhandled Stackoverflow Exception on Windows 7 64bitmemberkamran pervaiz12 Mar '12 - 5:47 
hi,
 
I got the same exception on Win 7 64-bit. Did you find any solution?
 

thanks
Kamran
 
kami
GeneralNice article, but keep an eye on .net 4.0memberSteve Solomon10 Sep '09 - 22:38 
This is a nice article, but you might like to know that there are a number of changes in the .NET Framework 4.0 that may make your code obsolete. There have been a number of changes with respect to file IO which will hugely increase the performance of GetFiles etc. You can read all about it in September's MSDN Magazine online.
GeneralRe: Nice article, but keep an eye on .net 4.0memberKD7LRJ1 Oct '09 - 2:06 
This code in .NET 4.0 beta is as fast as GetLastFileModifiedFast2, but not as fast as GetLastFileModifiedFast:
 
  DateTime GetLastFileModifiedFast3(string dir, string searchPattern, SearchOption searchOption)
  {
      DateTime retval = DateTime.MinValue;
 
      foreach (FileInfo f in new DirectoryInfo(dir).EnumerateFiles())
      {
          if (f.LastWriteTimeUtc > retval)
          {
              retval = f.LastWriteTimeUtc;
          }
      }
      return retval;
  }

GeneralRe: Nice article, but keep an eye on .net 4.0memberMember 1853421 Oct '10 - 19:57 
Explanation in MSDN Magazine, sept 09 :
"To address the second issue, DirectoryInfo now makes use of data that the operating system already provides from the file system during enumeration. The underlying Win32 functions that DirectoryInfo calls to get the contents of the file system during enumeration actually include data about each file, such as the length and creation time. We now use this data when initializing the FileInfo and DirectoryInfo instances returned from both the older array-based and new IEnumerable-based methods on DirectoryInfo. This means that in the preceding code, there are no additional underlying calls to the file system to retrieve the length of the file when file.Length is called, since this data has already been initialized."
GeneralNicememberXmen W.K.9 Sep '09 - 16:50 
have a 5
 



GeneralFindFirstFile/FindNextFile as Directory.GetFilesmembersoo2loo28 Aug '09 - 11:20 
I did similar coding several months ago using the Windows API FindFirstFile/FindNextFile and Directory.GetFiles, and compared the results of the two methods on a large number of files over a network. On the first run, the two methods did not differ much. But when I ran the program again, FindFirstFile/FindNextFile was faster. You can simulate this using the author's code: click "FastDirectoryEnumerator.EnumerateFiles" several times and compare the results; or exit, run the program again, click "FastDirectoryEnumerator.EnumerateFiles", and then compare the results.
GeneralEnumerateFilesmemberTaylorMichaelL26 Aug '09 - 4:06 
Just to be clear the new v4 EnumerateFiles method is not in any way faster than using GetFiles. It uses the exact same process of calling FindFirstFile/FindNextFile as GetFiles. Therefore you have a roundtrip for each file. What makes it better is the perceived speed.
 
With GetFiles you have to wait for the framework to enumerate all the files before you get any results back. For large #s of files this can be really slow. EnumerateFiles uses an iterator, so each time you request the next file it makes the roundtrip to fetch the next file. Therefore the performance of each iteration is (theoretically) consistent regardless of the # of files. Of course the overhead of the iterator means that it will actually take longer overall but (like threading) you won't have the hefty delay.
 
This actually has some implications to how you code. Before if you tried to enumerate a directory of files and one of the files had security that prevented you from reading it then you'd get an exception and lose all files. Now you'll get an exception during the iteration. Another place where things behave differently is in the results. If you use GetFiles then you'll get the list of files available while the method runs. Now you'll potentially (read: depending upon the FindNextFile impl) get the files that were added after the initial call but before the iterator gets to the file.
GeneralEncounters System.IO.PathTooLongExceptionmemberQBUI25 Aug '09 - 13:29 
I gave this code a try and I got this exception.
 
Any ideas how I can work around this exception?
Thanks,
Quan
 
======= Unit Tests =======
[Test]
public void TestGetAllFilesFromCDrive()
{
foreach (FileData file in FastDirectoryEnumerator.GetFiles(@"C:\", "*", SearchOption.AllDirectories))
{
Console.WriteLine("Name: {0}, Size: {1}", file.Name, file.Size);
}
}
 
======= Exception =======
FileEnumerationTests.TestGetAllDirectoryFilesFromCDrive : Failed
System.IO.PathTooLongException: The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.
at System.IO.Path.SafeSetStackPointerValue(Char* buffer, Int32 index, Char value)
at System.IO.Path.NormalizePathFast(String path, Boolean fullCheck)
at System.IO.Path.NormalizePath(String path, Boolean fullCheck)
at System.IO.Path.GetFullPathInternal(String path)
at System.Security.Util.StringExpressionSet.CanonicalizePath(String path, Boolean needFullPath)
at System.Security.Util.StringExpressionSet.CreateListFromExpressions(String[] str, Boolean needFullPath)
at System.Security.Permissions.FileIOPermission.AddPathList(FileIOPermissionAccess access, AccessControlActions control, String[] pathListOrig, Boolean checkForDuplicates, Boolean needFullPath, Boolean copyPathList)
at System.Security.Permissions.FileIOPermission..ctor(FileIOPermissionAccess access, String path)
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 473
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 528
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at CodeProject.FastDirectoryEnumerator.GetFiles(String path, String searchPattern, SearchOption searchOption) in FastDirectoryEnumerator.cs: line 251
at testFastDirectoryEnumeration.FileEnumerationTests.TestGetAllDirectoryFilesFromCDrive() in FileEnumerationTests.cs: line 39
GeneralRe: Encounters System.IO.PathTooLongExceptionmemberwilsone826 Aug '09 - 2:38 
Somewhere on your drive is a path + file name that is greater than 260 characters.
 
You can try prepending '\\?\' to your path string like this to enable very long file names (up to 32k):
 
[Test]
public void TestGetAllFilesFromCDrive()
{ 
   foreach (FileData file in FastDirectoryEnumerator.GetFiles(@"\\?\C:\", "*", SearchOption.AllDirectories))
   {
      Console.WriteLine("Name: {0}, Size: {1}", file.Name, file.Size);
   }
}

GeneralRe: Encounters System.IO.PathTooLongExceptionmemberQBUI26 Aug '09 - 6:58 
Got this exception if I do that
 
System.ArgumentException: Illegal characters in path.
at System.Security.Permissions.FileIOPermission.HasIllegalCharacters(String[] str)
at System.Security.Permissions.FileIOPermission.AddPathList(FileIOPermissionAccess access, AccessControlActions control, String[] pathListOrig, Boolean checkForDuplicates, Boolean needFullPath, Boolean copyPathList)
at System.Security.Permissions.FileIOPermission..ctor(FileIOPermissionAccess access, String[] pathList, Boolean checkForDuplicates, Boolean needFullPath)
at System.IO.Path.GetFullPath(String path)
at CodeProject.FastDirectoryEnumerator.EnumerateFiles(String path, String searchPattern, SearchOption searchOption) in FastDirectoryEnumerator.cs: line 229
at CodeProject.FastDirectoryEnumerator.GetFiles(String path, String searchPattern, SearchOption searchOption) in FastDirectoryEnumerator.cs: line 250
at testFastDirectoryEnumeration.FileEnumerationTests.TestGetAllFilesFromCDrive2() in FileEnumerationTests.cs: line 49
GeneralRe: Encounters System.IO.PathTooLongExceptionmemberwilsone826 Aug '09 - 11:57 
It appears the security check that the code performs does not accept the \\?\ characters. The existing Directory.GetFiles() has this same restriction. I don't know of any easy way to work around this.
 
You can get more information on this at http://blogs.msdn.com/bclteam/archive/2008/06/10/long-paths-in-net-part-3-of-3-kim-hamilton.aspx[^]
GeneralRe: Encounters System.IO.PathTooLongExceptionmemberQBUI27 Aug '09 - 7:19 
Here is my workaround for this issue
 
Modify MoveNext() function (line 472) as following
 
if (m_hndFindFile == null)
{
if (m_path.Length <= 260)
{
new FileIOPermission(FileIOPermissionAccess.PathDiscovery, m_path).Demand();
}
 
string fixPath = @"\\?\" + m_path;
string searchPath = Path.Combine(fixPath, m_filter);
m_hndFindFile = FindFirstFile(searchPath, m_win_find_data);
retval = !m_hndFindFile.IsInvalid;
}

 
Thanks,
Quan
GeneralRe: Encounters System.IO.PathTooLongExceptionmemberwilsone827 Aug '09 - 10:15 
That works, but at the obvious cost that callers will be able to use this class to bypass path discovery security for any path that is longer than 260 characters. If this dll is called from a location that should not have that permission (the web for example, or a network share pre-3.5 SP1), then this could lead to an information leak/security vulnerability. Whether that is important in your application is up to you.
GeneralBug with recursion? [modified]memberCorey McKenzie22 Aug '09 - 23:59 
In the method GetFiles, SearchOption.TopDirectoryOnly is being used instead of searchOption.
        public static FileData[] GetFiles(string path, string searchPattern, SearchOption searchOption)
        {
            IEnumerable<FileData> e = FastDirectoryEnumerator.EnumerateFiles(path, searchPattern, SearchOption.TopDirectoryOnly);
            List<FileData> list = new List<FileData>(e);
 
            FileData[] retval = new FileData[list.Count];
            list.CopyTo(retval);
 
            return retval;
        }
Even when I change the code to use searchOption and pass in SearchOption.AllDirectories it doesn't work as expected. I don't get the same values returned as Directory.GetFiles. I get only the files in the top directory (if any).
 
Any ideas?
 
modified on Sunday, August 23, 2009 9:51 AM

GeneralRe: Bug with recursion? [modified]memberwhizrd23 Aug '09 - 4:29 
Same here. The real performance savings of this routine would be with the AllDirectories option, but it doesn't work. Sure runs fast though!
 
I tried the same fix Corey did, with the same results... no change. I'm running WinXP Pro SP3.
 
Update: OK, it looks like the other line that needs changing is the test for FileAttributes.Directory in MoveNext, so change this:
if ((FileAttributes)m_win_find_data.dwFileAttributes == FileAttributes.Directory)
to this:
if (((FileAttributes)m_win_find_data.dwFileAttributes & FileAttributes.Directory) == FileAttributes.Directory)
and all is well. Still runs like lightning!
 
modified on Sunday, August 23, 2009 10:59 PM
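Why the equality test fails can be shown with a small standalone sketch. FileAttributes is a flags enumeration, so a directory that is also hidden, read-only, or otherwise marked carries more than one bit and never compares equal to FileAttributes.Directory alone. The DirectoryFlagCheck class and its method names are hypothetical.

```csharp
using System;
using System.IO;

static class DirectoryFlagCheck
{
    // Exact equality: true only when Directory is the sole attribute set.
    public static bool IsDirectoryExact(FileAttributes attrs)
    {
        return attrs == FileAttributes.Directory;
    }

    // Bit test: true whenever the Directory bit is set, whatever else is set.
    public static bool IsDirectoryMasked(FileAttributes attrs)
    {
        return (attrs & FileAttributes.Directory) == FileAttributes.Directory;
    }

    static void Main()
    {
        // A hidden directory carries two bits at once.
        FileAttributes attrs = FileAttributes.Directory | FileAttributes.Hidden;

        Console.WriteLine(IsDirectoryExact(attrs));   // prints False
        Console.WriteLine(IsDirectoryMasked(attrs));  // prints True
    }
}
```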

GeneralRe: Bug with recursion? [modified]memberwilsone824 Aug '09 - 3:18 
Good catch. I've submitted new code with this fix, and it should be available shortly. Thanks for the help.
 
Update: New version is now posted.
 
modified on Thursday, August 27, 2009 8:44 AM

GeneralRe: Bug with recursion?memberHeywood27 Aug '09 - 11:47 
Recursion doesn't seem to be working at all now...
GeneralRe: Bug with recursion?memberwilsone828 Aug '09 - 4:16 
How so? I tried it on my machine and it works fine for me. Are you getting an exception or just not seeing all the files? Note that the overload EnumerateFiles with no parameters only enumerates the current directory. You need to explicitly call it with a SearchOption if you want to enumerate sub-directories.
GeneralRe: Bug with recursion?memberHeywood30 Aug '09 - 7:19 
I've done a small amount of debugging. It seems the problem occurs if you specify a search pattern. Child folders aren't searched unless they have the same extension as the search pattern.
 

GeneralRe: Bug with recursion?memberHeywood8 Sep '09 - 3:25 
Any chance you've fixed this?
GeneralRe: Bug with recursion? [modified]memberwilsone88 Sep '09 - 11:09 
Thanks for pointing this out. I've submitted a fix for this issue. It may be a few days before it appears however.
 
EDIT: Update is now available.
 
modified on Thursday, September 10, 2009 10:26 AM

GeneralGood job...memberAndrew Rissing21 Aug '09 - 4:16 
The article looks clean and the code is well documented - got my 5.
 
I was going to recommend implementing additional features to take advantage of predicates/actions much like a List does. Though after scanning your code, it might not be as simple to implement or really buy you much.
 
But either way, good stuff - thanks for sharing.
QuestionFSO alternative?memberaikimark18 Aug '09 - 8:03 
@wilsone8
 
Did you also look at FSO (FileSystemObject) enumeration?
 
I don't expect it to best your best times, but I was curious if you had tested other, out-of-the-box methods.
GeneralGoodmemberPaulo Zemek17 Aug '09 - 2:08 
Good one. I never thought I needed this until I saw it.
I never really looked at the performance differences of using DirectoryInfo.GetFiles and GetDirectories; I always thought they were very small, not this big.
Good article.
Generalmy vote of 5mvpLuc Pattyn13 Aug '09 - 15:55 
well done. Thinking outside the box, or the Framework. I like it.
 
:)
 
Luc Pattyn [Forum Guidelines] [My Articles]

The quality and detail of your question reflects on the effectiveness of the help you are likely to get.
Show formatted code inside PRE tags, and give clear symptoms when describing a problem.

GeneralRe: my vote of 5 [modified]memberPaul Selormey13 Aug '09 - 19:04 
Luc Pattyn wrote:
well done. Thinking outside the box, or the Framework. I like it.

If you close your eyes on some of the technical side of handling files in .NET, then you have a point.
 
>> ..but it suffers from some very poor performance characteristics:
>> 1. GetFiles must allocate a potentially very large array.
True that is why they are including the new method. Directory.EnumerateFiles.
>> 2. GetFiles must wait for the entire directory's entries to be returned before returning.
The same as 1, so same problem.
>> 3. 3.For each file, a potentially expensive query is sent to the file system. No attempt is made to perform any sort of batch query.
If that means; file size, file dates, then true. Other than that there is nothing special here.
 
This article boasts of speed, but ignores one essential part of .NET file systems: security. There is not a single attempt to demand/check file security, which is done extensively in Directory.GetFiles.
 
>> Sadly, it will still only return file names...
Sadly, this is misinformation and lack of understanding...
1. Directory.EnumerateFiles() is designed to return only the file names, which is what most applications require, and will be the replacement for Directory.GetFiles(). This essentially prevents creating useless classes: FileInfo or FileData (if you prefer that).
 
2. DirectoryInfo.EnumerateFiles(), is designed for those wanting more information on the files, and it is the replacement of the DirectoryInfo.GetFiles, and unlike the DirectoryInfo.GetFiles(), this does not use the Directory.GetFiles internally.
 
Basically, this article like many codes out there, including the MSDN version, is to reduce the amount of memory required when dealing with large files, and knowing the environment you are using it.
I have used the MSDN version, removed the FileInfo it returns for just the file name, when creating a tool for Sandcastle.
 
Best regards,
Paul.
 

Jesus Christ is LOVE! Please tell somebody.
modified on Friday, August 14, 2009 6:33 AM

GeneralRe: my vote of 5 [modified]memberwilsone814 Aug '09 - 8:46 
Paul Selormey wrote:
If that means; file size, file dates, then true. Other than that there is nothing special here.

 
Getting additional attributes for each file is the whole point of this code! If all you need is file names, then of course this is not interesting. I'm pointing out that if you need file attributes and not just names, then the .Net Framework's built in methods are not the most efficient way to go.
 
Paul Selormey wrote:
This article boasts of speed, but ignores one essential part of .NET file systems: security. There is not a single attempt to demand/check file security, which is done extensively in Directory.GetFiles.

 
This is true in v1 of this article. I've added the same security checks that the .Net framework does and the performance doesn't change at all.
 
Paul Selormey wrote:
2. DirectoryInfo.EnumerateFiles(), is designed for those wanting more information on the files, and it is the replacement of the DirectoryInfo.GetFiles, and unlike the DirectoryInfo.GetFiles(), this does not use the Directory.GetFiles internally.

 
If it works anything like the current DirectoryInfo.GetFileSystemInfos, then like I said above it will be no faster than creating a bunch of FileInfo objects. It's not returning an enumeration that makes my method faster. In fact, just to prove this I've added a GetFiles method to the FastDirectoryEnumerator that returns an array of FileData objects. The two methods are always within 5% of each other.
 
The problem with DirectoryInfo.GetFileSystemInfos (or any of the methods that return a FileInfo object) is that internally the FileInfo objects only store a file name on construction. All the other information that is returned by FindFirstFile/FindNextFile is thrown away and then re-queried when you request the first attribute of a file. By keeping that data around my code is significantly faster, especially in the face of network latencies. Maybe in .Net 4.0 this is changed (I don't have Beta 1 of 4.0 installed; I don't play around with Beta software), but in the meantime I think this code is useful for some people.
 
modified on Friday, August 14, 2009 2:53 PM

GeneralRe: my vote of 5memberPaul Selormey14 Aug '09 - 11:50 
wilsone8 wrote:
Getting additional attributes for each file is the whole point of this code!...

First of all, I do think this article is useful for many uses, and as I have said, I have used a similar code, which appeared in the MSDN mag.
 
wilsone8 wrote:
This is true in v1 of this article. I've added the same security checks that the .Net framework does and the performance doesn't change at all.

Let's see how it performs; I hope you will post the update soon. Without a security check/demand, and with no word of caution, users could run into problems.
 
wilsone8 wrote:
I don't have Beta 1 of 4.0 installed; I don't play around with Beta software

Then, it was not right making statements about it, especially not making the difference between the two methods provided by the .NET and what they are supposed to do.
I have .NET 4.0 beta 1 installed on a VPC, and play with it, and with Reflector, it was easy to see how it works.
Even now, FileInfo/FileSystemInfo internally uses the Win32 data, what was lacking was the right iterator, and .NET 4.0 provides at least three of them, through a factory. Unlike the .NET 2.x, FileSystemInfo now has internal method, InitializeFrom(Win32Native.WIN32_FIND_DATA findData), so yes there is a difference.
 
wilsone8 wrote:
...but in the mean time I think this code is useful for some people.

It is useful, and I will use it when I have the need, just avoid the extra unverified information.
 
NB: If you are only looking for files use the DirectoryInfo.GetFiles instead of the DirectoryInfo.GetFileSystemInfos to avoid the extra
((files[i].Attributes & FileAttributes.Directory) == 0) checks.
 
Best regards,
Paul.
 

Jesus Christ is LOVE! Please tell somebody.

GeneralRe: my vote of 5memberwilsone827 Aug '09 - 10:16 
New version is posted. This version includes the security checks I mentioned. I've removed references to .NET 4.0.

Last Updated 27 Aug 2009
Article Copyright 2009 by wilsone8
Everything else Copyright © CodeProject, 1999-2013