This article describes how to use CFileFindEx, a class based on the MFC class CFileFind that allows you to specify file filters that control which files are returned. The class plugs in wherever CFileFind was used previously and is easy to understand. Include and exclude filters can be written either as DOS wildcards or as ATL regular expressions.
My company, Rimage, manufactures high-end CD and DVD production equipment. The software I work on is called QuickDisc. It is the interface that helps desktop users collect files using drag and drop and select the options they want when creating discs. Our equipment automatically loads the disc drives and prints on each disc using a printer that we developed. One of the problems with QuickDisc was that it didn't make a very good backup program, because it always selected all the files from folders that were dropped on the UI. The class included in this article was created for an upcoming version to let us filter the files based on user input. The class seemed useful and general enough that I thought others might have a use for it.
Using the code
The project download includes the source code for the class and a program that demonstrates its features. The class is simple to use. The programmer can define one or more include or exclude filters in a format similar to the following example:
*.cpp|*.h|*.rc
Note that each filter is separated by a vertical bar '|' character. This character was chosen because it is not valid in DOS file names. If the above is specified as an "Include" filter, only files with the extensions .cpp, .h, or .rc will be returned. Similarly, if it is specified as an "Exclude" filter, all files except those of type .cpp, .h, or .rc will be returned.
Note: Folders are always returned regardless of their name or extension.
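The filter rules above can be sketched in portable C++. This is only an illustration, not the code from the download: the names WildcardMatch, SplitFilters, MatchesAnyFilter, and ShouldReturn are hypothetical, and the rule for combining the two lists (a file must match an include filter, if any is given, and must not match any exclude filter) is an assumption. The sketch also ignores DOS quirks such as *.* matching names that have no dot.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case one character safely (std::tolower needs an unsigned char value).
static char Lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Case-insensitive match of a name against one pattern containing '*' and '?'.
bool WildcardMatch(const std::string& pattern, const std::string& name) {
    std::size_t p = 0, n = 0;
    std::size_t starP = std::string::npos;  // position of the last '*' seen
    std::size_t starN = 0;                  // name position that '*' is covering
    while (n < name.size()) {
        if (p < pattern.size() && pattern[p] == '*') {
            starP = p++;
            starN = n;
        } else if (p < pattern.size() &&
                   (pattern[p] == '?' || Lower(pattern[p]) == Lower(name[n]))) {
            ++p;
            ++n;
        } else if (starP != std::string::npos) {
            // Backtrack: let the last '*' absorb one more character.
            p = starP + 1;
            n = ++starN;
        } else {
            return false;
        }
    }
    while (p < pattern.size() && pattern[p] == '*') ++p;  // trailing stars match nothing
    return p == pattern.size();
}

// Split "a|b|c" into {"a", "b", "c"}, skipping empty entries.
std::vector<std::string> SplitFilters(const std::string& filterList) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (start <= filterList.size()) {
        std::size_t bar = filterList.find('|', start);
        if (bar == std::string::npos) bar = filterList.size();
        if (bar > start) parts.push_back(filterList.substr(start, bar - start));
        start = bar + 1;
    }
    return parts;
}

// True if the name matches at least one filter in the '|' separated list.
bool MatchesAnyFilter(const std::string& filterList, const std::string& name) {
    for (const std::string& filter : SplitFilters(filterList)) {
        if (WildcardMatch(filter, name)) return true;
    }
    return false;
}

// Assumed combination rule: folders always pass; an empty include list accepts everything.
bool ShouldReturn(const std::string& name, bool isFolder,
                  const std::string& includeList, const std::string& excludeList) {
    if (isFolder) return true;
    if (!includeList.empty() && !MatchesAnyFilter(includeList, name)) return false;
    return !MatchesAnyFilter(excludeList, name);
}
```

With these helpers, ShouldReturn("Main.CPP", false, "*.cpp|*.h|*.rc", "") accepts the file, while the same call with "readme.txt" rejects it.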
Alternatively, Include and Exclude filters may be specified using a subset of the typical regular expression syntax as defined in the ATL class CAtlRegExp. The following syntax is allowed:
Metacharacter   Meaning
.               Matches any single character.
[ ]             Indicates a character class. Matches any character inside the brackets (for example, [abc] matches "a", "b", and "c").
^               If this metacharacter occurs at the start of a character class, it negates the character class. A negated character class matches any character except those inside the brackets (for example, [^abc] matches all characters except "a", "b", and "c"). If ^ is at the beginning of the regular expression, it matches the beginning of the input (for example, ^[abc] will only match input that begins with "a", "b", or "c").
-               In a character class, indicates a range of characters (for example, [0-9] matches any of the digits "0" through "9").
?               Indicates that the preceding expression is optional: it matches once or not at all (for example, [0-9][0-9]? matches "2" and "12").
+               Indicates that the preceding expression matches one or more times (for example, [0-9]+ matches "1", "13", "666", and so on).
*               Indicates that the preceding expression matches zero or more times.
??, +?, *?      Non-greedy versions of ?, +, and *. These match as little as possible, unlike the greedy versions, which match as much as possible. Example: given the input "<abc><def>", <.*?> matches "<abc>" while <.*> matches "<abc><def>".
\               Escape character: interpret the next character literally (for example, [0-9]+ matches one or more digits, but [0-9]\+ matches a digit followed by a plus character). Also used for abbreviations (such as \a for any alphanumeric character; see table below).
$               At the end of a regular expression, this character matches the end of the input. Example: [0-9]$ matches a digit at the end of the input.
|               Alternation operator: separates two expressions, exactly one of which matches (for example, T|the matches "The" or "the").
!               Negation operator: the expression following ! does not match the input. Example: a!b matches "a" not followed by "b".

Abbreviation    Matches
\a              Any alphanumeric character: ([a-zA-Z0-9])
\b              White space (blank): ([ \\t])
\c              Any alphabetic character: ([a-zA-Z])
\d              Any decimal digit: ([0-9])
\h              Any hexadecimal digit: ([0-9a-fA-F])
\q              A quoted string: (\"[^\"]*\")|(\'[^\']*\')
\w              A simple word: ([a-zA-Z]+)
\z              An integer: ([0-9]+)
For example, the previous filter list could be entered as:
Regular expressions are harder to write, and they are available only as a by-product of the way the class does its matching, but they make some useful filters possible. For example, a single expression can include (or exclude) only files that start with a, b, c, or d and have the extension .cpp. Matches done with this class are always case-insensitive.
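To make the idea concrete, here is a small sketch of such a filter. It uses std::regex from the C++ standard library as a stand-in, because CAtlRegExp is only available with ATL. The two grammars are not the same (std::regex has no ! operator and none of the CAtlRegExp abbreviations), so treat the pattern as an illustration rather than as the exact string the class expects. The function name MatchesRegexFilter is hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true if fileName matches the pattern, ignoring case.
// Note: this is std::regex (ECMAScript grammar), not CAtlRegExp.
bool MatchesRegexFilter(const std::string& pattern, const std::string& fileName) {
    const std::regex re(pattern, std::regex::ECMAScript | std::regex::icase);
    return std::regex_search(fileName, re);
}

// A pattern for "starts with a, b, c, or d and has the extension .cpp".
// The backslash is doubled because this is a C++ string literal.
const char* const kStartsWithAToDCpp = "^[a-d].*\\.cpp$";
```

Calling MatchesRegexFilter(kStartsWithAToDCpp, "dialog.cpp") succeeds, while "main.cpp" is rejected because it starts with "m".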
The class is used as in the following example. Don't forget to include filefindex.h.
#include "filefindex.h"

CString csFilePath = _T("C:\\TestFiles");
// Include all .doc, .xls, or .ppt files.
CString csIncludeFilter = _T("*.doc|*.xls|*.ppt");
// Exclude .doc files whose names start with Tom.
CString csExcludeFilter = _T("Tom*.doc");
// Check for files based on the criteria the user filled out.
// Assumption: the exclude filter is passed as the third argument.
CFileFindEx fileInfo;
BOOL bWorked = fileInfo.FindFile(csFilePath, csIncludeFilter, csExcludeFilter);
while (bWorked)
{
    // As with CFileFind, call FindNextFile() before using the current file.
    bWorked = fileInfo.FindNextFile();
    // Do something with the files...
}
Points of interest
Anyone who has used CFileFind knows that it is annoying to have to call FindNextFile() before you can use even the first file found by FindFile(). In trying to make this class work as much like the original as possible, it was tricky to duplicate this illogical logic. However, I think I figured out a way to do it, and in my tests it runs no slower than the original. I've only just started using this class, so it is pretty new, but I've tested it quite a bit using the included sample program to make sure that my version is just as annoying as the original, since we've all gotten kind of used to it.
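The difficulty is that FindNextFile() has to report whether another matching file exists after the current one, so a filtering version must look ahead past the files it rejects. The sketch below shows one way to get that behavior. It is not the code from the download: it walks an in-memory list of names instead of a real directory, and the class name FilteredFinder and its members are hypothetical.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

class FilteredFinder {
public:
    using Predicate = std::function<bool(const std::string&)>;

    FilteredFinder(std::vector<std::string> names, Predicate accept)
        : names_(std::move(names)), accept_(std::move(accept)) {}

    // Like CFileFind::FindFile(): starts the search and reports whether
    // at least one accepted name exists. Nothing is "current" yet.
    bool FindFile() {
        next_ = 0;
        SkipRejected();
        return next_ < names_.size();
    }

    // Like CFileFind::FindNextFile(): makes the pending match current and
    // returns whether another accepted name remains after it.
    bool FindNextFile() {
        if (next_ >= names_.size()) return false;
        current_ = names_[next_];
        ++next_;
        SkipRejected();
        return next_ < names_.size();
    }

    const std::string& GetFileName() const { return current_; }

private:
    // Move next_ forward to the next accepted name, or to the end.
    void SkipRejected() {
        while (next_ < names_.size() && !accept_(names_[next_])) ++next_;
    }

    std::vector<std::string> names_;
    Predicate accept_;
    std::size_t next_ = 0;
    std::string current_;
};
```

A caller uses it with the same loop shape as CFileFind: call FindFile() once, then call FindNextFile() at the top of each iteration and read GetFileName() afterwards.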
History
- September 13th, 2005 - Version 1.0.
- September 22nd, 2005 - Version 1.1.
  - Fixed a problem with calling FindFile() a second (or subsequent) time.
  - Added Unicode support to the class and demo.