
Converting Wildcards to Regexes

By Rei Miyasaka, 15 Sep 2005
 

Introduction

Ever wondered how to do wildcards in .NET? It's not hard: all you have to do is use regular expressions. The details aren't obvious, though, and I had to dig around for a while to work out how to do it properly.

Even though regexes are a lot more powerful, wildcards are still useful in situations where you can't expect the user to know or learn the cryptic syntax of regexes. The most obvious example is the file search functionality of practically every OS; very few don't accept wildcards. I personally need wildcards to handle the httpHandlers section in web.config files.

Note: This method is good enough for most uses, but if you need every ounce of performance with wildcards, here is a good place to start.

Using the Code

There are three steps to converting a wildcard to a regex:

  1. Escape the pattern to make it regex-safe. Wildcards use only * and ?, so the rest of the text has to be converted to literals.
  2. Once escaped, * becomes \* and ? becomes \?, so we have to convert \* and \? to their respective regex equivalents: .* (any run of characters) and . (exactly one character).
  3. Prepend ^ and append $ to specify the beginning and end of the pattern.

So, here's the golden function:

public static string WildcardToRegex(string pattern)
{
  return "^" + Regex.Escape(pattern).
    Replace("\\*", ".*").
    Replace("\\?", ".") + "$";
}
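
To make the three steps concrete, here is a small sketch that runs each step separately and prints the intermediate strings. The class name WildcardSteps and its method names are made up for this illustration and are not part of the article's code; only Regex.Escape and string.Replace come from .NET.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper class, used only to show the intermediate results.
public static class WildcardSteps
{
    // Step 1: escape everything, including the wildcards themselves.
    public static string Escape(string pattern)
    {
        return Regex.Escape(pattern);
    }

    // Step 2: turn the escaped wildcards back into regex operators.
    public static string Expand(string escaped)
    {
        return escaped.Replace("\\*", ".*").Replace("\\?", ".");
    }

    // Step 3: anchor the pattern at both ends.
    public static string Anchor(string body)
    {
        return "^" + body + "$";
    }

    public static void Main()
    {
        foreach (string pattern in new string[] { "*.txt", "file?.doc" })
        {
            string escaped = Escape(pattern);
            string expanded = Expand(escaped);
            Console.WriteLine("{0} -> {1} -> {2} -> {3}",
                pattern, escaped, expanded, Anchor(expanded));
        }
        // Prints:
        // *.txt -> \*\.txt -> .*\.txt -> ^.*\.txt$
        // file?.doc -> file\?\.doc -> file.\.doc -> ^file.\.doc$
    }
}
```

Note that the dot in each pattern stays escaped as \. in the result, so it only matches a literal dot.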

Just to make it look cool, I wrapped it in a Wildcard class that inherits Regex.

/// <summary>
/// Represents a wildcard running on the
/// <see cref="System.Text.RegularExpressions"/> engine.
/// </summary>
public class Wildcard : Regex
{
 /// <summary>
 /// Initializes a wildcard with the given search pattern.
 /// </summary>
 /// <param name="pattern">The wildcard pattern to match.</param>
 public Wildcard(string pattern)
  : base(WildcardToRegex(pattern))
 {
 }
 
 /// <summary>
 /// Initializes a wildcard with the given search pattern and options.
 /// </summary>
 /// <param name="pattern">The wildcard pattern to match.</param>
 /// <param name="options">A combination of one or more
 /// <see cref="System.Text.RegularExpressions.RegexOptions"/> values.</param>
 public Wildcard(string pattern, RegexOptions options)
  : base(WildcardToRegex(pattern), options)
 {
 }
 
 /// <summary>
 /// Converts a wildcard to a regex.
 /// </summary>
 /// <param name="pattern">The wildcard pattern to convert.</param>
 /// <returns>A regex equivalent of the given wildcard.</returns>
 public static string WildcardToRegex(string pattern)
 {
  return "^" + Regex.Escape(pattern).
   Replace("\\*", ".*").
   Replace("\\?", ".") + "$";
 }
}

You can use it like any other Regex -- case-(in)sensitivity, string replacement, matching and all.

// Get a list of files in the My Documents folder
string[] files = System.IO.Directory.GetFiles(
 System.Environment.GetFolderPath(
 Environment.SpecialFolder.Personal));

// Create a new wildcard to search for all
// .txt files, regardless of case
Wildcard wildcard = new Wildcard("*.txt", RegexOptions.IgnoreCase);

// Print all matching files
foreach(string file in files)
 if(wildcard.IsMatch(file))
  Console.WriteLine(file);
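
One thing to keep in mind: Directory.GetFiles returns full paths, and the generated regex is anchored at both ends. "*.txt" still works because the leading * absorbs the directory part, but a pattern such as "report*.txt" would not match a full path. A possible way around this, sketched below, is to match against the file name only. FileNameFilter and MatchFileNames are hypothetical names chosen for this example; the conversion function is the one from the article.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's Wildcard class.
public static class FileNameFilter
{
    // Same conversion as WildcardToRegex in the article.
    public static string WildcardToRegex(string pattern)
    {
        return "^" + Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
    }

    // Returns the paths whose file name (not the full path) matches the wildcard.
    public static List<string> MatchFileNames(IEnumerable<string> paths, string wildcard)
    {
        Regex regex = new Regex(WildcardToRegex(wildcard), RegexOptions.IgnoreCase);
        List<string> matches = new List<string>();
        foreach (string path in paths)
        {
            if (regex.IsMatch(Path.GetFileName(path)))
                matches.Add(path);
        }
        return matches;
    }

    public static void Main()
    {
        string[] paths = { "/docs/report1.txt", "/docs/summary.txt", "/docs/Report2.TXT" };
        foreach (string path in MatchFileNames(paths, "report*.txt"))
            Console.WriteLine(path);
        // Prints /docs/report1.txt and /docs/Report2.TXT
    }
}
```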

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Rei Miyasaka
Canada
The cows are here to take me home now...


Comments and Discussions

 
Question: Thanks - just in time (member snoopy001, 16 May '13 - 4:31)
From 2005 heh - oldies but goodies.
Thanks for sharing this.

General: My vote of 5 (member Christopher Drake, 2 Jan '13 - 9:05)
This is exactly what I needed. Doing some work with Win32 file functions, this helped with the pattern matching.

General: My vote of 5 (member Member 8926788, 23 Sep '12 - 23:33)
Excellent work... thank you very much

Question: Does not work with *.* (member guest23231, 13 Sep '12 - 3:23)
Usually, the wildcard *.* means "everything" and not "everything that contains a dot", at least under Windows when wildcards are used with file systems.
 
So if you want some kind of "windows-file-system-wildcard-analogue", you should add somewhere:
if (wildcard == "*.*") wildcard = "*";
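
As a rough sketch of where that check could go, the hypothetical helper below applies it before the conversion. WindowsStyleWildcard and ToRegex are names invented for this example, and only the exact pattern "*.*" is special-cased, as suggested above; other Windows file-matching quirks are not covered.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the article's Wildcard class.
public static class WindowsStyleWildcard
{
    public static string ToRegex(string pattern)
    {
        // Treat "*.*" as "match everything", as Windows file dialogs do.
        if (pattern == "*.*")
            pattern = "*";

        return "^" + Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
    }

    public static void Main()
    {
        // A name without a dot now matches "*.*".
        Console.WriteLine(Regex.IsMatch("README", ToRegex("*.*")));   // True
    }
}
```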

General: My vote of 5 (member Kuthuparakkal, 12 Sep '12 - 17:03)
great stuff

Question: Word wild card matching behavior a little different from this [modified] (member weciii, 2 Mar '12 - 5:21)
Test string:
"she sells sea shells by the sea shore"
 
Wild card find pattern to type in:
"s*a"
 
To find in Word:
Open Word 2010 and paste the text.
On the home tab in the far right, select the find drop down and select advanced find. In the resulting dialog type "s*a". Click the more button and check use wildcards. Click find next. It will find "she sells sea" as the first match.
 
The regex pattern generated for s*a is "^s.*a$"
 
If you test that regex pattern, it comes back with 0 matches.
 
The current regex pattern looks like it will only get a match when 's' is at the beginning of the string or line and 'a' is at the end of the string or line.
I'm not too good with regex and could use a solution that would find the pattern anywhere in the string. I've tried a few modifications to the existing regex pattern, but desired result not reached yet.
 
****Update****
Found what I was looking for.
Changing the code to the following did the trick:
        
        public static string WildcardToRegex(string pattern)
        {
            // No leading "^" or trailing "$", so a match can occur anywhere in the input.
            return Regex.Escape(pattern).
                Replace("\\*", ".*?").   // lazy instead of greedy
                Replace("\\?", ".");
        }
 
I removed '^' and '$', which require the match to start at the beginning and stop at the end of the string or line, and changed ".*" to ".*?", turning the greedy quantifier into a lazy one. Even so, it still comes back with only 2 matches, because successive regex matches do not overlap. To compensate for that, you can search the string multiple times, bumping the start point of the search each time, like below:
            Wildcard w = new Wildcard("s*a", RegexOptions.CultureInvariant);

            string s = "she sells sea shells by the sea shore";
            StringBuilder sb = new StringBuilder();   // needs using System.Text;

            Match m = w.Match(s);
            while (m.Success)
            {
                sb.AppendLine(m.Value);

                // Restart one character past the start of this match,
                // so that overlapping matches are found as well.
                int next = m.Index + 1;
                if (next >= s.Length)
                    break;
                m = w.Match(s, next);
            }
which gives 7 results like Word does:
she sells sea
sells sea
s sea
sea
shells by the sea
s by the sea
sea

modified 5 Mar '12 - 9:30.
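
An alternative to restarting the search by hand, shown here as a sketch rather than as the commenter's method, is to wrap the unanchored, lazy pattern in a lookahead with a capturing group. The lookahead itself consumes no characters, so Regex.Matches tries every start position and the overlapping text can be read from group 1. OverlappingWildcard, ToOverlappingRegex and FindAll are hypothetical names for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper illustrating overlapping matches via a lookahead.
public static class OverlappingWildcard
{
    // Unanchored, lazy conversion wrapped as (?=(...)).
    public static string ToOverlappingRegex(string pattern)
    {
        string body = Regex.Escape(pattern)
            .Replace("\\*", ".*?")
            .Replace("\\?", ".");
        return "(?=(" + body + "))";
    }

    public static List<string> FindAll(string input, string pattern)
    {
        List<string> results = new List<string>();
        foreach (Match m in Regex.Matches(input, ToOverlappingRegex(pattern)))
            results.Add(m.Groups[1].Value);   // the text captured inside the lookahead
        return results;
    }

    public static void Main()
    {
        foreach (string hit in FindAll("she sells sea shells by the sea shore", "s*a"))
            Console.WriteLine(hit);
        // Prints the same 7 results as listed above, in the same order.
    }
}
```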

General: My vote of 5 (member vet0n, 23 Feb '12 - 21:46)
Great article!
Would be nice to see some tricky unit-tests for this as well.
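
As a starting point, here is a sketch of a few edge-case checks written as a plain console program rather than with any particular test framework. WildcardChecks, Matches and Check are hypothetical names; the conversion is copied from the article. The last two checks document behaviour of the .NET regex engine that may be surprising: without RegexOptions.Singleline the dot does not match a newline, and $ also matches just before a final newline.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical edge-case checks for the article's conversion.
public static class WildcardChecks
{
    // Same conversion as WildcardToRegex in the article.
    public static string WildcardToRegex(string pattern)
    {
        return "^" + Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
    }

    public static bool Matches(string wildcard, string input)
    {
        return Regex.IsMatch(input, WildcardToRegex(wildcard));
    }

    static void Check(bool condition, string description)
    {
        if (!condition)
            throw new Exception("Failed: " + description);
    }

    public static void Main()
    {
        Check(Matches("a.c", "a.c"), "literal dot matches a dot");
        Check(!Matches("a.c", "abc"), "literal dot does not match other characters");
        Check(Matches("(1)+[x].txt", "(1)+[x].txt"), "regex metacharacters are literal");
        Check(Matches("file?.txt", "file1.txt"), "? matches one character");
        Check(!Matches("file?.txt", "file.txt"), "? does not match zero characters");
        Check(!Matches("file?.txt", "file12.txt"), "? does not match two characters");
        Check(Matches("", ""), "empty pattern matches empty input");
        Check(!Matches("", "x"), "empty pattern does not match non-empty input");
        Check(!Matches("a*b", "a\nb"), "* does not cross a newline by default");
        Check(Matches("*.txt", "a.txt\n"), "$ matches before a final newline");
        Console.WriteLine("All checks passed.");
    }
}
```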

Suggestion: Adding More Static Methods (member Sina Iravanian, 14 Dec '11 - 2:21)
Hi,
 
Thanks for the code. I liked it, especially the fact that it was derived from Regex. I use Regex static methods a lot, and hence added these methods to your Wildcard class so that its interface more closely matches .NET's Regex class. Here are these methods:
 
public static Match Match(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.Match(input, wildcardPattern, options);
}
 
public static Match Match(string input, string pattern)
{
    return Match(input, pattern, RegexOptions.None);
}
 
public static bool IsMatch(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.IsMatch(input, wildcardPattern, options);
}
 
public static bool IsMatch(string input, string pattern)
{
    return IsMatch(input, pattern, RegexOptions.None);
}
 
public static MatchCollection Matches(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.Matches(input, wildcardPattern, options);
}
 
public static MatchCollection Matches(string input, string pattern)
{
    // Delegate to the three-parameter overload so the wildcard is converted first.
    return Matches(input, pattern, RegexOptions.None);
}
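
A short usage sketch for these static methods follows. Because Regex already declares static methods with the same signatures, the C# compiler reports warning CS0108 when a derived class redeclares them; adding the new modifier, as done here, states that the hiding is intentional. Only IsMatch is shown, and the trimmed-down Wildcard class below is an assumption made to keep the example self-contained.

```csharp
using System;
using System.Text.RegularExpressions;

// Trimmed-down version of the article's class, for illustration only.
public class Wildcard : Regex
{
    public Wildcard(string pattern)
        : base(WildcardToRegex(pattern))
    {
    }

    public static string WildcardToRegex(string pattern)
    {
        return "^" + Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
    }

    // 'new' marks the intentional hiding of Regex.IsMatch(string, string, RegexOptions).
    public new static bool IsMatch(string input, string pattern, RegexOptions options)
    {
        return Regex.IsMatch(input, WildcardToRegex(pattern), options);
    }

    // 'new' marks the intentional hiding of Regex.IsMatch(string, string).
    public new static bool IsMatch(string input, string pattern)
    {
        return IsMatch(input, pattern, RegexOptions.None);
    }
}

public static class StaticWildcardDemo
{
    public static void Main()
    {
        // Static call: the second argument is a wildcard, not a regex.
        Console.WriteLine(Wildcard.IsMatch("report.txt", "*.txt"));      // True

        // The inherited instance method still works as before.
        Console.WriteLine(new Wildcard("*.txt").IsMatch("report.doc"));  // False
    }
}
```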

Question: Re: Adding More Static Methods (member DontSailBackwards, 6 May '13 - 14:47)
Are the two-parameter methods for matching wildcards missing the wildcardPattern? Wouldn't they just do a RegEx match?
 
string wildcardPattern = WildcardToRegex(pattern);
return Regex.BlahBlah(input, wildcardPattern, options);

General: My vote of 5 (member Nagy Vilmos, 8 Nov '10 - 2:24)
The method WildcardToRegex has just saved me from having to write a very messy search class. TVM.

Last Updated 15 Sep 2005
Article Copyright 2005 by Rei Miyasaka