Comments and Discussions
|
|
 |
|

|
Thank you very much for this very helpful article. I need to create an index file based on the results of the GetFiles method. Using Directory.GetFiles, I was able to sort the array as such:
var jpgFiles = Directory.GetFiles(destPath, "*.jpg").OrderBy(f => new FileInfo(f).CreationTime);
How can I sort the FileData similarly?
|
|
|
|

|
var jpgFiles = FastDirectoryEnumerator.GetFiles(destPath, "*.jpg").OrderBy(f => f.CreationTime);
|
|
|
|
|

|
I am trying to get the folder size (including subfolders) instead of the modified date used in your example. I currently have this code (newbie, I know):
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
    Dim PDRFolder As New DirectoryInfo("C:\InputMedia")
    Dim filesInfo9() As FileInfo = PDRFolder.GetFiles("*.*", SearchOption.AllDirectories)
    Dim fileSizePDR As Long = 0
    Dim fileSize9 As Long = 0
    ' Sum the size of every file under the folder, including subfolders.
    For Each fileInfo9 As FileInfo In filesInfo9
        fileSize9 += fileInfo9.Length
    Next
    fileSizePDR = fileSize9
    lblBytes_PDR.Text = fileSizePDR.ToString
    lblSize_PDR.Text = FormatNumber(fileSizePDR / (1024 * 1024 * 1024), 2) + " GB" + " (" + lblBytes_PDR.Text + " bytes)"
    lblFolder_PDR.Text = PDRFolder.ToString
    lblHours_PDR.Text = CStr(FormatNumber((fileSizePDR / (1024 * 1024 * 1024)) / 15, 2))
    Dim value9 As Decimal = lblHours_PDR.Text
    Dim ts9 As TimeSpan = TimeSpan.FromHours(value9)
    Label5.Text = (String.Format("{0}h {1}m {2}.{3}s", ts9.Hours, ts9.Minutes, ts9.Seconds, ts9.Milliseconds))
    hor5 = fileSizePDR
    If Val(lblSize_PDR.Text) > 700 Then
        pbPDR.Visible = True
        tbLog.Text &= System.DateTime.Now.ToString & vbCrLf
        tbLog.Text &= "There is too much storage in PDR" & vbCrLf
        tbLog.SelectionStart = tbLog.TextLength
        tbLog.ScrollToCaret()
    Else
        pbPDR.Visible = False
    End If
End Sub
But it performs poorly on large folders. How can I improve it and make it faster? Your demo shows the difference in speed, even though it only reads the last-modified date. What is the property for the file size, instead of the last write time?
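One possible approach, sketched under assumptions rather than taken from the article: on .NET 4.0 or later, DirectoryInfo.EnumerateFiles streams FileInfo objects and FileInfo.Length gives the size in bytes. With the article's class, the matching property appears to be FileData.Size, as used in a unit test elsewhere in this thread. The names FolderSize and GetTotalBytes are hypothetical.

```csharp
using System;
using System.IO;

public static class FolderSize
{
    // Hypothetical helper, not part of FastDirectoryEnumerator.
    // Sums the length of every file under root, including subfolders.
    public static long GetTotalBytes(string root)
    {
        long total = 0;
        foreach (FileInfo file in new DirectoryInfo(root).EnumerateFiles("*", SearchOption.AllDirectories))
        {
            total += file.Length;
        }
        return total;
    }
}
```

It would be called as FolderSize.GetTotalBytes(@"C:\InputMedia"). Note that this sketch still throws if any subfolder denies access.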
|
|
|
|

|
This is an amazingly useful class. That said, the current implementation uses a depth first search/traversal through the sub-directories. This, in my opinion, is the incorrect method. Most would agree that if they are listing OR searching through a directory structure that you want to see / find the nearest files first. Luckily, simply switching the Stack<> classes to Queue<> classes (and the push/pop to enqueue/dequeue) you get a great breadth first search. Thanks again.
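A minimal sketch of the breadth-first idea, assuming only framework calls; it does not reproduce the internals of FastDirectoryEnumerator, and the name BreadthFirstWalk is hypothetical. A Queue<string> holds the folders still to visit, so every file at one depth is returned before anything deeper.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class BreadthFirstWalk
{
    // Yields files level by level: the files in root first,
    // then the files in each child folder, and so on.
    public static IEnumerable<string> EnumerateFiles(string root)
    {
        var pending = new Queue<string>();
        pending.Enqueue(root);
        while (pending.Count > 0)
        {
            string current = pending.Dequeue();
            foreach (string file in Directory.GetFiles(current))
            {
                yield return file;
            }
            // Child folders go to the back of the queue, so they are
            // visited only after everything already waiting.
            foreach (string dir in Directory.GetDirectories(current))
            {
                pending.Enqueue(dir);
            }
        }
    }
}
```

Swapping the Queue for a Stack (and Enqueue/Dequeue for Push/Pop) turns the same loop back into a depth-first walk.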
|
|
|
|

|
Good job man.
http://cavas.com.br
|
|
|
|

|
Great article. Already tested and it works great.
Thanks,
|
|
|
|

|
Nice. Thanks for sharing this!
|
|
|
|
|

|
Thanks for posting the code. I tried to use it in a Task in .NET 4.0 and got intermittent COM interop exceptions. I also can't use it to get all files in subdirectories, since it throws an exception on files where access is denied.
|
|
|
|

|
Is there a way to make the application permission aware? By the way, thank you for posting this code.
modified 25-Oct-11 9:39am.
|
|
|
|

|
I compiled it as a .dll library and used it from my VB code in an app, similar to Windows Explorer, that I created to gather file system info. Access time to a network-mapped folder with around 1200 files went from 30-60 seconds down to 1 second! Amazing!
|
|
|
|

|
I have been looking for this for a very long time... Thank you so much for posting this article. It really helped me a lot. Thanks again.
|
|
|
|

|
I would also like to thank you for posting the article. Solved a big problem!
|
|
|
|

|
The class deals with unmanaged resources, but I could not find how they get freed. There is FileEnumerator.Dispose, but it never gets called. What is the right pattern for using this class without leaking unmanaged resources, especially if something unexpected occurs (e.g. an exception during the enumeration process)?
If I try this:
IEnumerable<FileData> enumer = FastDirectoryEnumerator.EnumerateFiles(dir);
using (enumer)
{
foreach (FileData f in enumer)
{ ... }
}
I get a compilation error, because FileEnumerable does not implement IDisposable. FileEnumerator at least has Dispose, but I can't call it because FileEnumerator is not exposed to the application (FileEnumerable.GetEnumerator is called internally by foreach).
Of course, I could get rid of foreach and iterate manually with MoveNext, but that seems a rather odd way of doing things.
EDIT: Well, I see that Dispose is actually called when I set a breakpoint there. It seems Dispose is called automatically on the IEnumerator within a foreach loop.
modified on Friday, October 22, 2010 3:32 PM
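The behaviour described in the EDIT can be checked in isolation. The sketch below uses hypothetical types (TrackingEnumerator, TrackingEnumerable, DisposeDemo) rather than the article's classes; it relies only on the fact that the C# compiler expands foreach over an IEnumerable<T> into a try/finally that disposes the enumerator, even when the loop body throws.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical enumerator that records whether Dispose was called.
public sealed class TrackingEnumerator : IEnumerator<int>
{
    private int _current;
    public bool Disposed { get; private set; }

    public int Current { get { return _current; } }
    object IEnumerator.Current { get { return _current; } }

    public bool MoveNext() { return ++_current <= 3; }
    public void Reset() { _current = 0; }
    public void Dispose() { Disposed = true; }
}

public sealed class TrackingEnumerable : IEnumerable<int>
{
    // Remembers the most recently created enumerator for inspection.
    public TrackingEnumerator Last { get; private set; }

    public IEnumerator<int> GetEnumerator()
    {
        Last = new TrackingEnumerator();
        return Last;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class DisposeDemo
{
    // Returns true if the enumerator was disposed even though
    // the loop body threw on the first item.
    public static bool DisposedAfterException()
    {
        var source = new TrackingEnumerable();
        try
        {
            foreach (int value in source)
            {
                throw new InvalidOperationException("simulated failure");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed on purpose; only the Dispose call matters here.
        }
        return source.Last.Disposed;
    }
}
```

So a plain foreach should be enough to release the find handle, provided the enumerator's Dispose closes it; only manual MoveNext loops need an explicit using or try/finally.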
|
|
|
|

|
Hi,
I am trying to get a list of files via the "FastDirectoryEnumerator" class.
When I select one of my drives, for example drive D:\ (a hard drive),
it produces the error message "Access to the path D:\System Volume Information is denied."
Is there a way to fix this?
Thanks
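One way to work around this, sketched here with framework calls only: walk the tree one folder at a time and catch UnauthorizedAccessException per folder, so a protected folder such as System Volume Information is skipped instead of aborting the whole listing. This does not modify FastDirectoryEnumerator and gives up its speed advantage; the names TolerantWalk and GetAccessibleFiles are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class TolerantWalk
{
    // Lists files under root, skipping any folder that cannot be read.
    // Denied folder paths are added to skipped so the caller can report them.
    public static List<string> GetAccessibleFiles(string root, List<string> skipped)
    {
        var files = new List<string>();
        var pending = new Stack<string>();
        pending.Push(root);
        while (pending.Count > 0)
        {
            string current = pending.Pop();
            try
            {
                files.AddRange(Directory.GetFiles(current));
                foreach (string dir in Directory.GetDirectories(current))
                {
                    pending.Push(dir);
                }
            }
            catch (UnauthorizedAccessException)
            {
                // Access denied for this folder: remember it and move on.
                skipped.Add(current);
            }
        }
        return files;
    }
}
```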
|
|
|
|
|

|
Thanks wilsone8. This will definitely help me.
|
|
|
|

|
Hi.
I get a stackoverflow exception on this line :
retval = FindNextFile(m_hndFindFile, m_win_find_data)
When it iterates the c:\windows directory, the m_path when it crashes is "c:\windows\winsxs".
It seems to only happen in Debug mode, and not in Release.
Please advise.
Thanks,
/M
|
|
|
|

|
Hi,
Did you ever get to the bottom of this? I'm getting the same exception and would appreciate you sharing your fix if possible.
Kind regards,
Vincent
|
|
|
|

|
Hi, I got the same exception on Win 7 64-bit. Did you find any solution?
Thanks, Kamran kami
|
|
|
|

|
This is a nice article, but you might like to know that there are a number of changes in .NET Framework 4.0 that may make your code obsolete. Several of them concern file I/O and should hugely increase the performance of GetFiles etc. You can read all about it in September's MSDN Magazine online.
|
|
|
|

|
This code in .NET 4.0 beta is as fast as GetLastFileModifiedFast2, but not as fast as GetLastFileModifiedFast:
DateTime GetLastFileModifiedFast3(string dir, string searchPattern, SearchOption searchOption)
{
    DateTime retval = DateTime.MinValue;
    // Pass the pattern and search option through so the parameters are honored.
    foreach (FileInfo f in new DirectoryInfo(dir).EnumerateFiles(searchPattern, searchOption))
    {
        if (f.LastWriteTimeUtc > retval)
        {
            retval = f.LastWriteTimeUtc;
        }
    }
    return retval;
}
|
|
|
|

|
Explanation in MSDN Magazine, Sept 09:
"To address the second issue, DirectoryInfo now makes use of data that the operating system already provides from the file system during enumeration. The underlying Win32 functions that DirectoryInfo calls to get the contents of the file system during enumeration actually include data about each file, such as the length and creation time. We now use this data when initializing the FileInfo and DirectoryInfo instances returned from both the older array-based and new IEnumerable-based methods on DirectoryInfo. This means that in the preceding code, there are no additional underlying calls to the file system to retrieve the length of the file when file.Length is called, since this data has already been initialized."
|
|
|
|

|
have a 5
|
|
|
|

|
I did similar coding several months ago using the Windows API FindFirstFile/FindNextFile and Directory.GetFiles, and compared the two methods on a large number of files over a network. On the first run, the two methods did not differ much. But when I ran the program again, FindFirstFile/FindNextFile was faster. You can simulate this with the author's code: click "FastDirectoryEnumerator.EnumerateFiles" several times and compare the results; or exit, run the program again, click "FastDirectoryEnumerator.EnumerateFiles", and then compare the results.
|
|
|
|

|
Just to be clear, the new v4 EnumerateFiles method is not in any way faster than GetFiles. It uses the exact same process of calling FindFirstFile/FindNextFile as GetFiles, so you have a round trip for each file. What makes it better is the perceived speed.
With GetFiles you have to wait for the framework to enumerate all the files before you get any results back. For large numbers of files this can be really slow. EnumerateFiles uses an iterator, so each time you request the next file it makes the round trip to fetch it. Each iteration therefore costs about the same (theoretically) regardless of the number of files. Of course, the overhead of the iterator means that it will actually take longer overall, but (like threading) you won't have the hefty up-front delay.
This actually has some implications for how you code. Before, if you tried to enumerate a directory and one of the files had security that prevented you from reading it, you'd get an exception and lose all the files. Now you'll get the exception during the iteration. Another place where things behave differently is in the results. With GetFiles you get the list of files available while the method runs. Now you'll potentially (read: depending upon the FindNextFile implementation) get files that were added after the initial call but before the iterator reaches them.
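To illustrate the perceived-speed point, here is a small sketch assuming .NET 4.0 or later, where Directory.EnumerateFiles exists. Because the sequence is produced lazily, a caller that needs only the first match can stop right away instead of waiting for the whole array that GetFiles would build. The names FirstMatch and Find are hypothetical.

```csharp
using System;
using System.IO;

public static class FirstMatch
{
    // Returns the first file in dir that matches pattern, or null if none does.
    // The loop exits on the first result, so later entries are never requested.
    public static string Find(string dir, string pattern)
    {
        foreach (string file in Directory.EnumerateFiles(dir, pattern))
        {
            return file;
        }
        return null;
    }
}
```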
|
|
|
|

|
I gave this code a try and I got this exception.
Any ideas how I can work around it?
Thanks,
Quan
======= Unit Tests =======
[Test]
public void TestGetAllFilesFromCDrive()
{
foreach (FileData file in FastDirectoryEnumerator.GetFiles(@"C:\", "*", SearchOption.AllDirectories))
{
Console.WriteLine("Name: {0}, Size: {1}", file.Name, file.Size);
}
}
======= Exception =======
FileEnumerationTests.TestGetAllDirectoryFilesFromCDrive : Failed
System.IO.PathTooLongException: The specified path, file name, or both are too long. The fully qualified file name must be less than 260 characters, and the directory name must be less than 248 characters.
at System.IO.Path.SafeSetStackPointerValue(Char* buffer, Int32 index, Char value)
at System.IO.Path.NormalizePathFast(String path, Boolean fullCheck)
at System.IO.Path.NormalizePath(String path, Boolean fullCheck)
at System.IO.Path.GetFullPathInternal(String path)
at System.Security.Util.StringExpressionSet.CanonicalizePath(String path, Boolean needFullPath)
at System.Security.Util.StringExpressionSet.CreateListFromExpressions(String[] str, Boolean needFullPath)
at System.Security.Permissions.FileIOPermission.AddPathList(FileIOPermissionAccess access, AccessControlActions control, String[] pathListOrig, Boolean checkForDuplicates, Boolean needFullPath, Boolean copyPathList)
at System.Security.Permissions.FileIOPermission..ctor(FileIOPermissionAccess access, String path)
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 473
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 494
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 514
at CodeProject.FastDirectoryEnumerator.FileEnumerator.MoveNext() in FastDirectoryEnumerator.cs: line 528
at System.Collections.Generic.List`1..ctor(IEnumerable`1 collection)
at CodeProject.FastDirectoryEnumerator.GetFiles(String path, String searchPattern, SearchOption searchOption) in FastDirectoryEnumerator.cs: line 251
at testFastDirectoryEnumeration.FileEnumerationTests.TestGetAllDirectoryFilesFromCDrive() in FileEnumerationTests.cs: line 39
|
|
|
|

|
Somewhere on your drive is a path + file name that is greater than 260 characters.
You can try prepending '\\?\' to your path string like this to enable very long file names (up to 32k):
[Test]
public void TestGetAllFilesFromCDrive()
{
foreach (FileData file in FastDirectoryEnumerator.GetFiles(@"\\?\C:\", "*", SearchOption.AllDirectories))
{
Console.WriteLine("Name: {0}, Size: {1}", file.Name, file.Size);
}
}
|
|
|
|

|
I got this exception when I did that:
System.ArgumentException: Illegal characters in path.
at System.Security.Permissions.FileIOPermission.HasIllegalCharacters(String[] str)
at System.Security.Permissions.FileIOPermission.AddPathList(FileIOPermissionAccess access, AccessControlActions control, String[] pathListOrig, Boolean checkForDuplicates, Boolean needFullPath, Boolean copyPathList)
at System.Security.Permissions.FileIOPermission..ctor(FileIOPermissionAccess access, String[] pathList, Boolean checkForDuplicates, Boolean needFullPath)
at System.IO.Path.GetFullPath(String path)
at CodeProject.FastDirectoryEnumerator.EnumerateFiles(String path, String searchPattern, SearchOption searchOption) in FastDirectoryEnumerator.cs: line 229
at CodeProject.FastDirectoryEnumerator.GetFiles(String path, String searchPattern, SearchOption searchOption) in FastDirectoryEnumerator.cs: line 250
at testFastDirectoryEnumeration.FileEnumerationTests.TestGetAllFilesFromCDrive2() in FileEnumerationTests.cs: line 49
|
|
|
|
|

|
Here is my workaround for this issue.
Modify the MoveNext() function (line 472) as follows:
if (m_hndFindFile == null)
{
if (m_path.Length <= 260)
{
new FileIOPermission(FileIOPermissionAccess.PathDiscovery, m_path).Demand();
}
string fixPath = @"\\?\" + m_path;
string searchPath = Path.Combine(fixPath, m_filter);
m_hndFindFile = FindFirstFile(searchPath, m_win_find_data);
retval = !m_hndFindFile.IsInvalid;
}
Thanks,
Quan
|
|
|
|

|
That works, but at the obvious cost that callers will be able to use this class to bypass path discovery security for any path longer than 260 characters. If this dll is called from a location that should not have that permission (the web, for example, or a network share pre-3.5 SP1), then this could lead to an information leak or security vulnerability. Whether that is important in your application is up to you.
|
|
|
|

|
In the method GetFiles, SearchOption.TopDirectoryOnly is being used instead of searchOption.
public static FileData[] GetFiles(string path, string searchPattern, SearchOption searchOption)
{
IEnumerable<FileData> e = FastDirectoryEnumerator.EnumerateFiles(path, searchPattern, SearchOption.TopDirectoryOnly);
List<FileData> list = new List<FileData>(e);
FileData[] retval = new FileData[list.Count];
list.CopyTo(retval);
return retval;
}
Even when I change the code to use searchOption and pass in SearchOption.AllDirectories it doesn't work as expected. I don't get the same values returned as Directory.GetFiles. I get only the files in the top directory (if any).
Any ideas?
modified on Sunday, August 23, 2009 9:51 AM
|
|
|
|

|
Same here. The real performance savings of this routine would be with the AllDirectories option, but it doesn't work. Sure runs fast though!
I tried the same fix Corey did, with the same results... no change. I'm running WinXP Pro SP3.
Update: OK, it looks like the other line that needs changing is the test for FileAttributes.Directory in MoveNext, so change this:
if ((FileAttributes)m_win_find_data.dwFileAttributes == FileAttributes.Directory)
to this:
if (((FileAttributes)m_win_find_data.dwFileAttributes & FileAttributes.Directory) == FileAttributes.Directory)
and all is well. Still runs like lightning!
modified on Sunday, August 23, 2009 10:59 PM
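The difference between the two tests can be shown on its own. FileAttributes is a flags enum, so a folder that is also hidden or read-only carries more than one bit, and a plain equality check misses it. The sketch below uses a hypothetical AttributeCheck class with both variants side by side.

```csharp
using System;
using System.IO;

public static class AttributeCheck
{
    // Equality only matches when Directory is the sole flag set.
    public static bool IsDirectoryByEquality(FileAttributes attributes)
    {
        return attributes == FileAttributes.Directory;
    }

    // Masking matches whenever the Directory bit is set,
    // whatever other flags accompany it.
    public static bool IsDirectoryByMask(FileAttributes attributes)
    {
        return (attributes & FileAttributes.Directory) == FileAttributes.Directory;
    }
}
```

For a value such as FileAttributes.Directory | FileAttributes.Hidden, the equality version returns false and the mask version returns true, which is consistent with subfolders being skipped before the fix.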
|
|
|
|

|
Good catch. I've submitted new code with this fix, and it should be available shortly. Thanks for the help.
Update: New version is now posted.
modified on Thursday, August 27, 2009 8:44 AM
|
|
|
|

|
Recursion doesn't seem to be working at all now...
|
|
|
|

|
How so? I tried it on my machine and it works fine for me. Are you getting an exception or just not seeing all the files? Note that the overload EnumerateFiles with no parameters only enumerates the current directory. You need to explicitly call it with a SearchOption if you want to enumerate sub-directories.
|
|
|
|

|
I've done a small amount of debugging. It seems the problem occurs if you specify a search pattern. Child folders aren't searched unless they have the same extension as the search pattern.
|
|
|
|

|
Any chance you've fixed this?
|
|
|
|

|
Thanks for pointing this out. I've submitted a fix for this issue. It may be a few days before it appears however.
EDIT: Update is now available.
modified on Thursday, September 10, 2009 10:26 AM
|
|
|
|

|
The article looks clean and the code is well documented - got my 5.
I was going to recommend implementing additional features to take advantage of predicates/actions much like a List does. Though after scanning your code, it might not be as simple to implement or really buy you much.
But either way, good stuff - thanks for sharing.
|
|
|
|

|
@wilsone8
Did you also look at FSO (FileSystemObject) enumeration?
I don't expect it to best your best times, but I was curious if you had tested other, out-of-the-box methods.
|
|
|
|

|
Good one. I never thought I needed this until I saw it.
I never really looked at the performance differences when using DirectoryInfo.GetFiles and GetDirectories; I always thought they were very small, not this big.
Good article.
|
|
|
|

|
well done. Thinking outside the box, or the Framework. I like it.
Luc Pattyn
|
|
|
|

|
Luc Pattyn wrote: well done. Thinking outside the box, or the Framework. I like it.
If you close your eyes on some of the technical side of handling files in .NET, then you have a point.
>> ..but it suffers from some very poor performance characteristics:
>> 1. GetFiles must allocate a potentially very large array.
True, that is why they are adding the new method, Directory.EnumerateFiles.
>> 2. GetFiles must wait for the entire directory's entries to be returned before returning.
The same as 1, so same problem.
>> 3. For each file, a potentially expensive query is sent to the file system. No attempt is made to perform any sort of batch query.
If that means; file size, file dates, then true. Other than that there is nothing special here.
This article boasts of speed, but ignores one essential part of .NET file systems: security. There is not a single attempt to demand/check file security, which is done extensively in Directory.GetFiles.
>> Sadly, it will still only return file names...
Sadly, this is misinformation and lack of understanding...
1. Directory.EnumerateFiles() is designed to return only the file names, which is what most applications require, and will be the replacement for Directory.GetFiles(). This essentially avoids creating unneeded FileInfo objects (or FileData, if you prefer that).
2. DirectoryInfo.EnumerateFiles(), is designed for those wanting more information on the files, and it is the replacement of the DirectoryInfo.GetFiles, and unlike the DirectoryInfo.GetFiles(), this does not use the Directory.GetFiles internally.
Basically, this article, like much of the code out there, including the MSDN version, aims to reduce the amount of memory required when dealing with large files, provided you know the environment you are using it in.
I have used the MSDN version, changed to return just the file name instead of a FileInfo, when creating a tool for Sandcastle.
Best regards,
Paul.
Jesus Christ is LOVE! Please tell somebody.
modified on Friday, August 14, 2009 6:33 AM
|
|
|
|

|
Paul Selormey wrote: If that means; file size, file dates, then true. Other than that there is nothing special here.
Getting additional attributes for each file is the whole point of this code! If all you need is file names, then of course this is not interesting. I'm pointing out that if you need file attributes and not just names, then the .Net Framework's built in methods are not the most efficient way to go.
Paul Selormey wrote: This article boasts of speed, but ignores one essential part of .NET file systems: security. There is not a single attempt to demand/check file security, which is done extensively in Directory.GetFiles.
This is true in v1 of this article. I've added the same security checks that the .Net framework does and the performance doesn't change at all.
Paul Selormey wrote: 2. DirectoryInfo.EnumerateFiles(), is designed for those wanting more information on the files, and it is the replacement of the DirectoryInfo.GetFiles, and unlike the DirectoryInfo.GetFiles(), this does not use the Directory.GetFiles internally.
If it works anything like the current DirectoryInfo.GetFileSystemInfos, then, like I said above, it will be no faster than creating a bunch of FileInfo objects. It's not returning an enumeration that makes my method faster. In fact, just to prove this, I've added a GetFiles method to the FastDirectoryEnumerator that returns an array of FileData objects. The two methods are always within 5% of each other.
The problem with DirectoryInfo.GetFileSystemInfos (or any of the methods that return a FileInfo object) is that internally the FileInfo objects only store a file name on construction. All the other information that is returned by FindFirstFile/FindNextFile is thrown away and then re-queried when you request the first attribute of a file. By keeping that data around, my code is significantly faster, especially in the face of network latencies. Maybe this has changed in .NET 4.0 (I don't have Beta 1 of 4.0 installed; I don't play around with Beta software), but in the meantime I think this code is useful for some people.
modified on Friday, August 14, 2009 2:53 PM
|
|
|
|

|
wilsone8 wrote: Getting additional attributes for each file is the whole point of this code!...
First of all, I do think this article is useful in many cases, and as I have said, I have used similar code, which appeared in MSDN Magazine.
wilsone8 wrote: This is true in v1 of this article. I've added the same security checks that the .Net framework does and the performance doesn't change at all.
Let's see how it performs; I hope you will post the update soon. Without a security check/demand and with no word of caution, users could run into problems.
wilsone8 wrote: I don't have Beta 1 of 4.0 installed; I don't play around with Beta software
Then it was not right to make statements about it, especially without distinguishing between the two methods provided by .NET and what they are supposed to do.
I have .NET 4.0 beta 1 installed on a VPC and play with it; with Reflector, it was easy to see how it works.
Even now, FileInfo/FileSystemInfo internally uses the Win32 data. What was lacking was the right iterator, and .NET 4.0 provides at least three of them, through a factory. Unlike in .NET 2.x, FileSystemInfo now has an internal method, InitializeFrom(Win32Native.WIN32_FIND_DATA findData), so yes, there is a difference.
wilsone8 wrote: ...but in the mean time I think this code is useful for some people.
It is useful, and I will use it when I have the need; just avoid the extra unverified information.
NB: If you are only looking for files use the DirectoryInfo.GetFiles instead of the DirectoryInfo.GetFileSystemInfos to avoid the extra
((files[i].Attributes & FileAttributes.Directory) == 0) checks.
Best regards,
Paul.
Jesus Christ is LOVE! Please tell somebody.
|
|
|
|
Describes how to create a significantly faster enumerator for the attributes of all the files in a directory.
| Type | Article |
| Licence | CPOL |
| First Posted | 13 Aug 2009 |
| Views | 62,221 |
| Downloads | 4,641 |
| Bookmarked | 120 times |