Click here to Skip to main content
Click here to Skip to main content

Wildcard string compare (globbing)

By , 15 Feb 2005
 

Usage:

This is a fast, lightweight, and simple pattern matching function.

if (wildcmp("bl?h.*", "blah.jpg")) {
  //we have a match!
} else {
  //no match =(
}

Function:

int wildcmp(const char *wild, const char *string) {
  // Written by Jack Handy - <A href="mailto:jakkhandy@hotmail.com">jakkhandy@hotmail.com</A>
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here

About the Author

Jack Handy
Web Developer
United States United States
Member
No Biography provided

Sign Up to vote   Poor Excellent
Add a reason or comment to your vote: x
Votes of 3 or less require a comment

Comments and Discussions

 
You must Sign In to use this message board.
Search this forum  
    Spacing  Noise  Layout  Per page   
Questionhelp required for wilcard matching * and #memberSaimaAsif23 Feb '12 - 23:56 
I read this article " ,its really amazing. I appreciate your efforts. I am student, I need help in defining the same kind of function according to my requirements. I hope, I 'll get good response.
 
Words are strings which are separated by dots. Two additional characters are also valid i.e:The *, which matches 1 word and the #, which matches 0..N words Example: *.stock.# matches the routing keys usd.stock and eur.stock.dsf but not stock.nasdaq.
 

Your help would be highly appreciated.
Sam

GeneralMy vote of 5memberPlamen Petrov13 Dec '11 - 21:37 
A very useful function!
SuggestionModification with '#' as wildcard joker for digits [modified]memberThomas Haase25 Sep '11 - 23:16 
First of all I like this code, it is small and fully stand-alone.
I have modified it, because I need an additional wildcard joker that represents digits. Finally the modified function accepts '*', '?' and '#' as joker characters.
 
int wildcmp_ex(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;
 
  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    } else if (((*wild == *string) && (*wild != '#')) || (*wild == '?') || ((*wild == '#') && isdigit(*string))) {
      wild++;
      string++;
    } else {
      if (mp)
      {
        wild = mp;
        string = cp++;
      }
      else
      {
        return 0;
      }
    }
  }
 
  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
Thomas Haase


modified 29 Sep '11 - 8:26.

QuestionLicence Questionmemberrandommark23 Nov '10 - 0:33 
Hi Jack Handy,
 
Is there a licence attached to this code?
 
Thanks, Mark
AnswerAnother C# version, with a twistmembertomlev29 Jun '10 - 14:50 
Just for fun... a C# version with almost the same syntax as the original C version Smile | :)
 
public static bool wildcmp(string pattern, string text) {
 
  var wild = new StringScanner(pattern);
  var @string = new StringScanner(text);
 
  var mp = wild;
  var cp = @string;
 
  while (@string && wild != '*') {
    if (wild != @string && wild != '?') {
      return false;
    }
    wild++;
    @string++;
  }
 
  while (@string) {
    if (@wild == '*') {
      if (!++wild) {
        return true;
      }
      mp = wild;
      cp = @string + 1;
    } else if (wild == @string || wild == '?') {
      wild++;
      @string++;
    } else {
      wild = mp;
      @string = cp++;
    }
  }
 
  while (wild == '*') {
    wild++;
  }
  return !wild;
}
 
public struct StringScanner
{
    private string _string;
    private int _position;
    
    public StringScanner(string s)
    {
        _string = s;
        _position = 0;
    }
    
    public string String
    {
        get { return _string; }
    }
    
    public int Position
    {
        get { return _position; }
    }
        
    public bool Finished
    {
        get { return _position == _string.Length;}
    }
    
    public char Current
    {
        get { return Finished ? '\0' : _string[_position]; }
    }
    
    public bool MoveNext()
    {
        if (Finished)
            return false;
        _position++;
        return true;
    }
    
    public static StringScanner operator ++(StringScanner scanner)
    {
        scanner.MoveNext();
        return scanner;
    }
    
    public static StringScanner operator +(StringScanner scanner, int n)
    {
        return new StringScanner(scanner.String)
        {
            _position = Math.Min(scanner.Position + n, scanner.String.Length)
        };
    }
    
    public static implicit operator bool(StringScanner scanner)
    {
        return !scanner.Finished;
    }
    
    public static implicit operator char(StringScanner scanner)
    {
        return scanner.Current;
    }
    
    public static bool operator ==(StringScanner scanner1, StringScanner scanner2)
    {
        return scanner1.Current == scanner2.Current;
    }
    
    public static bool operator !=(StringScanner scanner1, StringScanner scanner2)
    {
        return scanner1.Current != scanner2.Current;
    }
}
My blog : in English - in French

GeneralObscuritymemberChuck O'Toole25 Apr '10 - 18:18 
I've been using this for years, just don't show it to your instructor.
 
// String match with wildcards.  Obtained from the Internet somewhere.  Case insensitive.

BOOL wm(const char *s, const char *t)
{
	return *t-'*' ? *s ? (*t=='?') | (toupper(*s)==toupper(*t)) && wm(s+1,t+1) : !*t : wm(s,t+1) || *s && wm(s+1,t);
}
 
If you want case sensitive, remove the toupper() calls.
AnswerMy C# contribution - recursive, of course!memberRenniePet26 Mar '10 - 5:21 
This strikes me as an obvious place to use recursion. So here goes...
 
   public class MString
   {
      /// <summary>
      /// Function to compare two strings, where strA may contain wildcard characters '*' and 
      /// '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <param name="ignoreCase">true = ignore upper/lower case, false = don't ignore case</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB, bool ignoreCase)
      {
         if (ignoreCase)
            return CompareWWc(strA.ToLower(), strB.ToLower());
         else 
            return CompareWWc(strA, strB);
      }
 

      /// <summary>
      /// Recursive function to compare two strings, where strA may contain wildcard characters 
      /// '*' and '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB)
      {
         // Top of loop to scan across strA (and strB)
         for (int i = 0; i < strA.Length; i++)
         {
            // Special processing when we hit a '*' in strA
            if (strA[i] == '*')
            {
               // If the '*' is at the end of strA then result = true irrespective of strB
               if (i == strA.Length - 1)
                  return true;  
 
               // Do recursive calls to try to find a match somewhere to the right in strB
               strA = strA.Substring(i + 1);  // The part of strA beyond the '*'
               for (int j = i; j < strB.Length; j++)
                  if (CompareWWc(strA, strB.Substring(j)))
                     return true;
               return false;
            }
 
            // Normal processing for non-'*' characters in strA
            if (i >= strB.Length || (strA[i] != strB[i] && strA[i] != '?'))
               return false;
         }
 
         // We've reached the end of strA and the last character is not '*'
         return strA.Length == strB.Length;
      }
 
   }
 
And here's a little test sequence:
 
         if (!MString.CompareWWc("", ""))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("something", "something"))
            Console.WriteLine("Something wrong!");
 
         if (MString.CompareWWc("something", "zomething"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("something", "some"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("something", "something else"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("s?m?th???", "something"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("s?m?th???", "somethin"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("*", ""))
            Console.WriteLine("Something wrong!");
         
         if (!MString.CompareWWc("*", "nonsense"))
            Console.WriteLine("Something wrong!");
         
         if (!MString.CompareWWc("non*", "nonsense"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("*nonsense", "nonsense"))
            Console.WriteLine("Something wrong!");
 
         if (!MString.CompareWWc("non*nse", "nonsense"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("non*nse", "nonsenze"))
            Console.WriteLine("Something wrong!");
         
         if (!MString.CompareWWc("non*n?e", "nonsense"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("n*on*nse", "nonsense"))
            Console.WriteLine("Something wrong!");
 
         if (!MString.CompareWWc("n*n*nse", "nonsense"))
            Console.WriteLine("Something wrong!");
 
         if (MString.CompareWWc("*non*nse", "nonsenze"))
            Console.WriteLine("Something wrong!");
 
         if (!MString.CompareWWc("n*n*n?e", "nonsense"))
            Console.WriteLine("Something wrong!");
      }
 
By the way, the name CompareWWc means Compare With Wildcards.
GeneralRe: My C# contribution - recursive, of course!memberErwin de GRoot29 Mar '10 - 1:58 
Actually, the recursive function together with substring will make this slow.
I'm using this at the moment:
    public static class StringExtensions
    {
        public static bool WildcardMatch(this string str, string compare, bool ignoreCase) 
        { 
            if (ignoreCase)
                return str.ToLower().WildcardMatch(compare.ToLower()); 
            else
                return str.WildcardMatch(compare); 
        }
 
        public static bool WildcardMatch(this string str, string compare)
        {
            if (string.IsNullOrEmpty(compare))
                return str.Length == 0;
            int pS = 0;
            int pW = 0;
            int lS = str.Length;
            int lW = compare.Length;
            
            while (pS < lS && pW < lW && compare[pW] != '*')
            {
                char wild = compare[pW];
                if (wild != '?' && wild != str[pS])
                    return false;
                pW++;
                pS++;
            }
 
            int pSm = 0;
            int pWm = 0;
            while (pS < lS && pW < lW)
            {
                char wild = compare[pW];
                if (wild == '*')
                {
                    pW++;
                    if (pW == lW)
                        return true;
                    pWm = pW;
                    pSm = pS + 1;
                }
                else if (wild == '?' || wild == str[pS])
                {
                    pW++;
                    pS++;
                }
                else
                {
                    pW = pWm;
                    pS = pSm;
                    pSm++;
                }
            }
            while (pW < lW && compare[pW] == '*')
                pW++;
            return pW == lW && pS == lS; 
        }
    }

GeneralDepends on whether you need to optimize the last few nanoseconds out of it...memberRenniePet29 Mar '10 - 7:45 
Hi Erwin,
 
Thanks for your posting. It did make me decide to investigate the situation.
 
I still really think this is a situation that begs for recursion. But maybe you were right that substring is not a good idea. So I made this version:
 
   public class MString2
   {
      /// <summary>
      /// Function to compare two strings, where strA may contain wildcard characters '*' and 
      /// '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <param name="ignoreCase">true = ignore upper/lower case, false = don't ignore case</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB, bool ignoreCase)
      {
         if (ignoreCase)
            return CompareWWc(strA.ToLower(), 0, strB.ToLower(), 0);
         else
            return CompareWWc(strA, 0, strB, 0);
      }
 

      /// <summary>
      /// Function to compare two strings, where strA may contain wildcard characters '*' and 
      /// '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB)
      {
         // Just call the private recursive version of this function
         return CompareWWc(strA, 0, strB, 0);
      }
 

      /// <summary>
      /// Private recursive function used by the above two public functions.
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="indexA">index into strA marking start of the string for processing purposes</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <param name="indexB">index into strB marking start of the string for processing purposes</param>
      /// <returns>true = match, false = non-match</returns>
      private static bool CompareWWc(string strA, int indexA, string strB, int indexB)
      {
         // Top of loop to scan across strA (and strB)
         for (int i = 0; indexA + i < strA.Length; i++)
         {
            // Special processing when we hit a '*' in strA
            if (strA[indexA + i] == '*')
            {
               // If the '*' is at the end of strA then result = true irrespective of strB
               if (indexA + i == strA.Length - 1)
                  return true;
 
               // Do recursive calls to try to find a match somewhere to the right in strB
               for (int j = indexB + i; j < strB.Length; j++)
                  if (CompareWWc(strA, indexA + i + 1, strB, j))
                     return true;
               return false;
            }
 
            // Normal processing for non-'*' characters in strA
            if (indexB + i >= strB.Length || (strA[indexA + i] != strB[indexB + i] && strA[indexA + i] != '?'))
               return false;
         }
 
         // We've reached the end of strA and there is no '*' in strA
         return strA.Length - indexA == strB.Length - indexB;
      }
      
   }
 
Then I ran some timing tests, using System.Diagnostics.Stopwatch. I put my test case with 19 calls to the function in a loop and executed it 10,000 times. I did this for my original version, your version, and my new version. I compiled the programs in Release mode.
 
Assuming I haven't made a mistake somewhere, here are my results for a single function call:
 
My original version:  342 nonoseconds
Your version:         237 nanoseconds
My second version:    279 nanoseconds
Now to tell you the truth, I find it very difficult to get excited about saving 100 nanoseconds at the expense of having two and a half times as many lines of code. Especially since my expected use of this function in my application will probably never exceed a couple hundred calls per day. Smile | :)
 
Anyway, thanks for getting me to think things over again and make the tests. Personally, at least in this particular case, I prefer programmer understandability to execution efficiency. I've decided to stick with my original version, since I think my second version is more difficult to understand, and the improved efficiency not worth that disadvantage.
GeneralSorry - revised numbersmemberRenniePet29 Mar '10 - 8:35 
Hi Erwin,
 
Sorry - my previous numbers are not correct. I was running the programs under the Visual Studio debugger, and that was apparently not good for timing tests.
 
Here's what I get now:
 
My original version:  243 nonoseconds
Your version:          76 nanoseconds
My second version:    111 nanoseconds
Assuming these timings are valid, your version is three times faster than my original version, and that is pretty significant, at least in a situation were the function may be used millions times a day.
 
Sorry for the incorrect timings in my previous posting.

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Permalink | Advertise | Privacy | Mobile
Web01 | 2.6.130523.1 | Last Updated 15 Feb 2005
Article Copyright 2001 by Jack Handy
Everything else Copyright © CodeProject, 1999-2013
Terms of Use
Layout: fixed | fluid