
Wildcard string compare (globbing)

By Jack Handy, 15 Feb 2005
 

Usage:

This is a fast, lightweight, and simple pattern matching function.

if (wildcmp("bl?h.*", "blah.jpg")) {
  //we have a match!
} else {
  //no match =(
}

Function:

int wildcmp(const char *wild, const char *string) {
  // Written by Jack Handy - jakkhandy@hotmail.com
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
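
The function can be exercised with a small table of patterns. The sketch below is not part of the original article: `wild_case`, `wild_cases` and `wildcmp_selftest` are illustrative names, and the listing repeats `wildcmp` only so that the block compiles on its own. The cases cover the usage example plus a few edge cases (a pattern longer than the text, repeated `*`, a `*` that has to backtrack).

```c
#include <stddef.h>

/* Copy of the article's wildcmp, repeated so this block is self-contained. */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Hypothetical test table: pattern, text, expected result (1 = match). */
struct wild_case {
  const char *wild;
  const char *text;
  int expected;
};

static const struct wild_case wild_cases[] = {
  { "bl?h.*",   "blah.jpg", 1 },  /* usage example from the article      */
  { "",         "",         1 },  /* empty pattern matches empty text    */
  { "",         "abc",      0 },  /* empty pattern, non-empty text       */
  { "*",        "",         1 },  /* '*' may match nothing               */
  { "????????", "ABC",      0 },  /* pattern longer than the text        */
  { "y*n",      "yessir",   0 },  /* text must end with 'n'              */
  { "a*bc",     "abbc",     1 },  /* '*' has to back up once             */
  { "a*b",      "a",        0 },  /* text ends before the pattern does   */
  { "a*?b",     "axb",      1 },  /* '?' directly after '*'              */
  { "a**b",     "axb",      1 },  /* repeated '*' behaves like one       */
  { "*a",       "babbba",   1 }   /* last segment anchored at the end    */
};

/* Returns the number of cases whose result differs from the expectation. */
int wildcmp_selftest(void) {
  size_t i;
  int failures = 0;

  for (i = 0; i < sizeof wild_cases / sizeof wild_cases[0]; i++) {
    if (wildcmp(wild_cases[i].wild, wild_cases[i].text) != wild_cases[i].expected) {
      failures++;
    }
  }
  return failures;
}
```

Calling `wildcmp_selftest()` from a test program and checking that it returns 0 is one way to guard against regressions when porting or modifying the function.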

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Jack Handy
Web Developer
United States
No Biography provided


Comments and Discussions

 
General: My vote of 5 (Franc Morales, 29-May-13 15:47)
Thanks for sharing, my friend.
Question: help required for wildcard matching * and # (SaimaAsif, 23-Feb-12 23:56)
I read this article; it's really amazing, and I appreciate your efforts. I am a student and I need help defining the same kind of function for my requirements. I hope I'll get a good response.

Words are strings separated by dots. Two additional characters are also valid: the *, which matches exactly 1 word, and the #, which matches 0..N words. Example: *.stock.# matches the routing keys usd.stock and eur.stock.dsf but not stock.nasdaq.
 

Your help would be highly appreciated.
Sam
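
One possible way to approach the word-based rules described above is to recurse over dot-separated words instead of single characters. This is only a sketch under stated assumptions: `topic_match`, `match_words`, `word_len` and `next_word` are hypothetical names, an empty string is treated as "zero words", and a pattern with many `#` words can take exponential time in the worst case.

```c
#include <stddef.h>
#include <string.h>

/* Length of the word starting at s (up to the next '.' or the end). */
static size_t word_len(const char *s) {
  return strcspn(s, ".");
}

/* Start of the word after the one at s, or NULL when s is the last word. */
static const char *next_word(const char *s) {
  s += word_len(s);
  return (*s == '.') ? s + 1 : NULL;
}

/* pat and key point at the current word; NULL means "no words left". */
static int match_words(const char *pat, const char *key) {
  size_t plen, klen;

  if (pat == NULL) {
    return key == NULL;            /* pattern used up: key must be too */
  }
  plen = word_len(pat);

  if (plen == 1 && pat[0] == '#') {
    /* '#' matches 0..N words: try every possible remainder of the key. */
    const char *rest = next_word(pat);
    const char *k = key;
    for (;;) {
      if (match_words(rest, k)) {
        return 1;
      }
      if (k == NULL) {
        return 0;
      }
      k = next_word(k);
    }
  }

  if (key == NULL) {
    return 0;                      /* pattern wants a word, key has none */
  }
  klen = word_len(key);

  if (!(plen == 1 && pat[0] == '*')) {
    /* Literal word: must be identical. '*' accepts any single word. */
    if (plen != klen || memcmp(pat, key, plen) != 0) {
      return 0;
    }
  }
  return match_words(next_word(pat), next_word(key));
}

/* Returns 1 if the routing key matches the pattern, 0 otherwise. */
int topic_match(const char *pattern, const char *key) {
  return match_words(*pattern ? pattern : NULL, *key ? key : NULL);
}
```

With the example from the question, `topic_match("*.stock.#", "usd.stock")` and `topic_match("*.stock.#", "eur.stock.dsf")` return 1, while `topic_match("*.stock.#", "stock.nasdaq")` returns 0.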

General: My vote of 5 (Plamen Petrov, 13-Dec-11 21:37)
A very useful function!
Suggestion: Modification with '#' as wildcard joker for digits [modified] (Thomas Haase, 25-Sep-11 23:16)
First of all I like this code, it is small and fully stand-alone.
I have modified it, because I need an additional wildcard joker that represents digits. Finally the modified function accepts '*', '?' and '#' as joker characters.
 
#include <ctype.h>  /* isdigit */

int wildcmp_ex(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    /* The cast keeps isdigit well-defined for characters with negative char values */
    } else if (((*wild == *string) && (*wild != '#')) || (*wild == '?') || ((*wild == '#') && isdigit((unsigned char)*string))) {
      wild++;
      string++;
    } else {
      if (mp)
      {
        wild = mp;
        string = cp++;
      }
      else
      {
        return 0;
      }
    }
  }
 
  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
Thomas Haase


modified 29-Sep-11 8:26am.

Question: Licence Question (randommark, 23-Nov-10 0:33)
Hi Jack Handy,
 
Is there a licence attached to this code?
 
Thanks, Mark
Answer: Another C# version, with a twist (tomlev, 29-Jun-10 14:50)
Just for fun... a C# version with almost the same syntax as the original C version :)
 
public static bool wildcmp(string pattern, string text) {
 
  var wild = new StringScanner(pattern);
  var @string = new StringScanner(text);
 
  var mp = wild;
  var cp = @string;
 
  while (@string && wild != '*') {
    if (wild != @string && wild != '?') {
      return false;
    }
    wild++;
    @string++;
  }
 
  while (@string) {
    if (wild == '*') {
      if (!++wild) {
        return true;
      }
      mp = wild;
      cp = @string + 1;
    } else if (wild == @string || wild == '?') {
      wild++;
      @string++;
    } else {
      wild = mp;
      @string = cp++;
    }
  }
 
  while (wild == '*') {
    wild++;
  }
  return !wild;
}
 
public struct StringScanner
{
    private string _string;
    private int _position;
    
    public StringScanner(string s)
    {
        _string = s;
        _position = 0;
    }
    
    public string String
    {
        get { return _string; }
    }
    
    public int Position
    {
        get { return _position; }
    }
        
    public bool Finished
    {
        get { return _position == _string.Length;}
    }
    
    public char Current
    {
        get { return Finished ? '\0' : _string[_position]; }
    }
    
    public bool MoveNext()
    {
        if (Finished)
            return false;
        _position++;
        return true;
    }
    
    public static StringScanner operator ++(StringScanner scanner)
    {
        scanner.MoveNext();
        return scanner;
    }
    
    public static StringScanner operator +(StringScanner scanner, int n)
    {
        return new StringScanner(scanner.String)
        {
            _position = Math.Min(scanner.Position + n, scanner.String.Length)
        };
    }
    
    public static implicit operator bool(StringScanner scanner)
    {
        return !scanner.Finished;
    }
    
    public static implicit operator char(StringScanner scanner)
    {
        return scanner.Current;
    }
    
    public static bool operator ==(StringScanner scanner1, StringScanner scanner2)
    {
        return scanner1.Current == scanner2.Current;
    }
    
    public static bool operator !=(StringScanner scanner1, StringScanner scanner2)
    {
        return scanner1.Current != scanner2.Current;
    }
}

General: Obscurity (Chuck O'Toole, 25-Apr-10 18:18)
I've been using this for years, just don't show it to your instructor.
 
// String match with wildcards.  Obtained from the Internet somewhere.  Case insensitive.

BOOL wm(const char *s, const char *t)
{
	return *t-'*' ? *s ? (*t=='?') | (toupper(*s)==toupper(*t)) && wm(s+1,t+1) : !*t : wm(s,t+1) || *s && wm(s+1,t);
}
 
If you want case sensitive, remove the toupper() calls.
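
For readers who want to follow what the one-liner does, here is an expanded sketch of the same recursion. `wm_readable` is a name made up for this illustration; the argument order is kept as in `wm` (text first, pattern second), and the `unsigned char` casts are an addition so that `toupper` is well-defined for all `char` values.

```c
#include <ctype.h>

/* s is the text, t is the pattern, as in wm() above. Case insensitive. */
int wm_readable(const char *s, const char *t) {
  if (*t == '*') {
    /* Either the '*' matches nothing (skip it in the pattern), or it
       swallows one more character of the text and we try again. */
    return wm_readable(s, t + 1) || (*s && wm_readable(s + 1, t));
  }

  if (*s == '\0') {
    /* Text is used up: match only if the pattern is used up as well. */
    return *t == '\0';
  }

  /* Ordinary character or '?': compare one position, then recurse. */
  if (*t != '?' &&
      toupper((unsigned char)*s) != toupper((unsigned char)*t)) {
    return 0;
  }
  return wm_readable(s + 1, t + 1);
}
```

Like the original, this tries both branches at every `*`, so patterns with several `*` can take exponential time on long non-matching text; the iterative `wildcmp` in the article avoids that by remembering only the most recent `*`.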
Answer: My C# contribution - recursive, of course! (RenniePet, 26-Mar-10 5:21)
This strikes me as an obvious place to use recursion. So here goes...
 
   public class MString
   {
      /// <summary>
      /// Function to compare two strings, where strA may contain wildcard characters '*' and 
      /// '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <param name="ignoreCase">true = ignore upper/lower case, false = don't ignore case</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB, bool ignoreCase)
      {
         if (ignoreCase)
            return CompareWWc(strA.ToLower(), strB.ToLower());
         else 
            return CompareWWc(strA, strB);
      }
 

      /// <summary>
      /// Recursive function to compare two strings, where strA may contain wildcard characters 
      /// '*' and '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB)
      {
         // Top of loop to scan across strA (and strB)
         for (int i = 0; i < strA.Length; i++)
         {
            // Special processing when we hit a '*' in strA
            if (strA[i] == '*')
            {
               // If the '*' is at the end of strA then result = true irrespective of strB
               if (i == strA.Length - 1)
                  return true;  
 
               // Do recursive calls to try to find a match somewhere to the right in strB.
               // j may equal strB.Length (empty remainder) so that a pattern such as "a**" still matches "a".
               strA = strA.Substring(i + 1);  // The part of strA beyond the '*'
               for (int j = i; j <= strB.Length; j++)
                  if (CompareWWc(strA, strB.Substring(j)))
                     return true;
               return false;
            }
 
            // Normal processing for non-'*' characters in strA
            if (i >= strB.Length || (strA[i] != strB[i] && strA[i] != '?'))
               return false;
         }
 
         // We've reached the end of strA and the last character is not '*'
         return strA.Length == strB.Length;
      }
 
   }
 
And here's a little test sequence:
 
         if (!MString.CompareWWc("", ""))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("something", "something"))
            Console.WriteLine("Something wrong!");
 
         if (MString.CompareWWc("something", "zomething"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("something", "some"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("something", "something else"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("s?m?th???", "something"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("s?m?th???", "somethin"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("*", ""))
            Console.WriteLine("Something wrong!");
         
         if (!MString.CompareWWc("*", "nonsense"))
            Console.WriteLine("Something wrong!");
         
         if (!MString.CompareWWc("non*", "nonsense"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("*nonsense", "nonsense"))
            Console.WriteLine("Something wrong!");
 
         if (!MString.CompareWWc("non*nse", "nonsense"))
            Console.WriteLine("Something wrong!");
         
         if (MString.CompareWWc("non*nse", "nonsenze"))
            Console.WriteLine("Something wrong!");
         
         if (!MString.CompareWWc("non*n?e", "nonsense"))
            Console.WriteLine("Something wrong!");
 

         if (!MString.CompareWWc("n*on*nse", "nonsense"))
            Console.WriteLine("Something wrong!");
 
         if (!MString.CompareWWc("n*n*nse", "nonsense"))
            Console.WriteLine("Something wrong!");
 
         if (MString.CompareWWc("*non*nse", "nonsenze"))
            Console.WriteLine("Something wrong!");
 
         if (!MString.CompareWWc("n*n*n?e", "nonsense"))
            Console.WriteLine("Something wrong!");
      }
 
By the way, the name CompareWWc means Compare With Wildcards.
General: Re: My C# contribution - recursive, of course! (Erwin de GRoot, 29-Mar-10 1:58)
Actually, the recursive function together with substring will make this slow.
I'm using this at the moment:
    public static class StringExtensions
    {
        public static bool WildcardMatch(this string str, string compare, bool ignoreCase) 
        { 
            if (ignoreCase)
                return str.ToLower().WildcardMatch(compare.ToLower()); 
            else
                return str.WildcardMatch(compare); 
        }
 
        public static bool WildcardMatch(this string str, string compare)
        {
            if (string.IsNullOrEmpty(compare))
                return str.Length == 0;
            int pS = 0;
            int pW = 0;
            int lS = str.Length;
            int lW = compare.Length;
            
            while (pS < lS && pW < lW && compare[pW] != '*')
            {
                char wild = compare[pW];
                if (wild != '?' && wild != str[pS])
                    return false;
                pW++;
                pS++;
            }
 
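            // Pattern ran out before the text and no '*' was seen: no match
            if (pS < lS && pW == lW)
                return false;

            int pSm = 0;
            int pWm = 0;
            while (pS < lS)
            {
                // At the end of the pattern, fall through to the backtrack branch
                char wild = pW < lW ? compare[pW] : '\0';
                if (pW < lW && wild == '*')
                {
                    pW++;
                    if (pW == lW)
                        return true;
                    pWm = pW;
                    pSm = pS + 1;
                }
                else if (pW < lW && (wild == '?' || wild == str[pS]))
                {
                    pW++;
                    pS++;
                }
                else
                {
                    // Mismatch, or pattern used up with text left (e.g. "*a" against "aba"):
                    // go back to the last '*' and let it swallow one more character
                    pW = pWm;
                    pS = pSm;
                    pSm++;
                }
            }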
            while (pW < lW && compare[pW] == '*')
                pW++;
            return pW == lW && pS == lS; 
        }
    }

General: Depends on whether you need to optimize the last few nanoseconds out of it... (RenniePet, 29-Mar-10 7:45)
Hi Erwin,
 
Thanks for your posting. It did make me decide to investigate the situation.
 
I still really think this is a situation that begs for recursion. But maybe you were right that substring is not a good idea. So I made this version:
 
   public class MString2
   {
      /// <summary>
      /// Function to compare two strings, where strA may contain wildcard characters '*' and 
      /// '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <param name="ignoreCase">true = ignore upper/lower case, false = don't ignore case</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB, bool ignoreCase)
      {
         if (ignoreCase)
            return CompareWWc(strA.ToLower(), 0, strB.ToLower(), 0);
         else
            return CompareWWc(strA, 0, strB, 0);
      }
 

      /// <summary>
      /// Function to compare two strings, where strA may contain wildcard characters '*' and 
      /// '?'. http://en.wikipedia.org/wiki/Wildcard_character
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB)
      {
         // Just call the private recursive version of this function
         return CompareWWc(strA, 0, strB, 0);
      }
 

      /// <summary>
      /// Private recursive function used by the above two public functions.
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, must not be null</param>
      /// <param name="indexA">index into strA marking start of the string for processing purposes</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, must not be null</param>
      /// <param name="indexB">index into strB marking start of the string for processing purposes</param>
      /// <returns>true = match, false = non-match</returns>
      private static bool CompareWWc(string strA, int indexA, string strB, int indexB)
      {
         // Top of loop to scan across strA (and strB)
         for (int i = 0; indexA + i < strA.Length; i++)
         {
            // Special processing when we hit a '*' in strA
            if (strA[indexA + i] == '*')
            {
               // If the '*' is at the end of strA then result = true irrespective of strB
               if (indexA + i == strA.Length - 1)
                  return true;
 
               // Do recursive calls to try to find a match somewhere to the right in strB.
               // j may equal strB.Length (empty remainder) so that a pattern such as "a**" still matches "a".
               for (int j = indexB + i; j <= strB.Length; j++)
                  if (CompareWWc(strA, indexA + i + 1, strB, j))
                     return true;
               return false;
            }
 
            // Normal processing for non-'*' characters in strA
            if (indexB + i >= strB.Length || (strA[indexA + i] != strB[indexB + i] && strA[indexA + i] != '?'))
               return false;
         }
 
         // We've reached the end of strA and there is no '*' in strA
         return strA.Length - indexA == strB.Length - indexB;
      }
      
   }
 
Then I ran some timing tests, using System.Diagnostics.Stopwatch. I put my test case with 19 calls to the function in a loop and executed it 10,000 times. I did this for my original version, your version, and my new version. I compiled the programs in Release mode.
 
Assuming I haven't made a mistake somewhere, here are my results for a single function call:
 
My original version:  342 nanoseconds
Your version:         237 nanoseconds
My second version:    279 nanoseconds

Now to tell you the truth, I find it very difficult to get excited about saving 100 nanoseconds at the expense of having two and a half times as many lines of code. Especially since my expected use of this function in my application will probably never exceed a couple hundred calls per day. :)
 
Anyway, thanks for getting me to think things over again and make the tests. Personally, at least in this particular case, I prefer programmer understandability to execution efficiency. I've decided to stick with my original version, since I think my second version is more difficult to understand, and the improved efficiency not worth that disadvantage.
General: Sorry - revised numbers (RenniePet, 29-Mar-10 8:35)
Hi Erwin,
 
Sorry - my previous numbers are not correct. I was running the programs under the Visual Studio debugger, and that was apparently not good for timing tests.
 
Here's what I get now:
 
My original version:  243 nanoseconds
Your version:          76 nanoseconds
My second version:    111 nanoseconds

Assuming these timings are valid, your version is three times faster than my original version, and that is pretty significant, at least in a situation where the function may be used millions of times a day.
 
Sorry for the incorrect timings in my previous posting.
General: Re: Depends on whether you need to optimize the last few nanoseconds out of it... (Erwin de GRoot, 29-Mar-10 8:37)
Yes, the recursive function makes it more understandable for sure. In my case I actually call it several thousands of times after certain user actions, so I'm even considering using unsafe code :) I also thought of a special case where your function will get a performance hit: SearchString = "--ABC-----ABC-----ABC-----lots of text (without 'at') goes here", wildcardString = "*ABC*@". In this case my function (based on Jack's) will search for the '@' character once starting from position 5 (but won't find it, because it's not there). With your function it would search for the '@' character 3 times (once starting from position 5 until the end, once from 13 and once from 21). The longer the text at the end or the more occurrences of 'ABC' at the start, the greater the performance hit.
General: Yet another version - 25% faster, I think [modified] (RenniePet, 1-Apr-10 8:24)
If at first you don't succeed...

Here's my third version, where I say to hell with minimizing lines of code and try to optimize the speed. No "unsafe" code though, unless you consider "goto" to be unsafe coding. :)
 
   public class MString
   {
      /// <summary>
      /// Compare two strings, where strA may contain wildcard characters '*' and '?'. 
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, 
      ///                    must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, 
      ///                    must not be null</param>
      /// <param name="ignoreCase">true = ignore upper/lower case, false = observe case</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB, bool ignoreCase)
      {
         if (ignoreCase)
            return CompareWWc(strA.ToLower(), strB.ToLower());
         else 
            return CompareWWc(strA, strB);
      }
 
      
      /// <summary>
      /// Compare two strings, where strA may contain wildcard characters '*' and '?'. 
      /// 
      /// In the comments, the word 'segment' is used to talk about the portions of strA that
      /// fall between two '*' characters, or between the start of the string and the first '*'
      /// or between the last '*' and the end of the string.
      /// </summary>
      /// <param name="strA">string which may contain wildcards, may be empty, 
      ///                    must not be null</param>
      /// <param name="strB">string to compare to, no wildcard processing, may be empty, 
      ///                    must not be null</param>
      /// <returns>true = match, false = non-match</returns>
      public static bool CompareWWc(string strA, string strB)
      {
         int starPtr = 0;  // Points at the '*' in strA

         // This part of the code handles the first segment in strA, or the case where strA
         //  does not contain any '*' character at all. The first segment is fairly simple to
         //  handle because it must match from the start of strB - no need to have a sliding 
         //  match loop.

         // Check strB long enough so we don't need to test for hitting its end while scanning
         if (strB.Length >= strA.Length)
         {
            // Simple optimized scan of first segment of strA and comparison with strB
            for (;; starPtr++)
            {
               if (starPtr == strA.Length)
                  return strA.Length == strB.Length;  // No '*' in strA and no mismatch
               if (strA[starPtr] == '*')
                  goto firstSegmentMatches;
               if (strA[starPtr] != strB[starPtr] && strA[starPtr] != '?')
                  return false;  // Mismatch
            }
         }
         else
         {
            // When strB is shorter than strA a match is not likely. But if strA contains 
            //  enough '*' characters it is possible, so we have to give it a try.
            for (;; starPtr++)
            {
               if (strA[starPtr] == '*')
                  goto firstSegmentMatches;
               if (starPtr == strB.Length)
                  return false;  // No '*' in strA before end of strB encountered
               if (strA[starPtr] != strB[starPtr] && strA[starPtr] != '?')
                  return false;  // Mismatch
            }
         }
 
         // The rest of the code handles the case where strA does contain one or more '*' 
         //  characters, and the first segment does match the start of strB.

      firstSegmentMatches:
 
         int indexA;  // Start of segment in strA
         int indexB = starPtr;  // Sliding match location in strB
         
         // Loop to process the segments in strA
         while (true)
         {
            // Test if next segment is last and empty
            indexA = ++starPtr;  // Point past '*'
            if (indexA == strA.Length)
               return true;  // Last segment empty - matches irrespective of strB content

            // Scan over the next segment in strA
            for (;; starPtr++)
               if (starPtr == strA.Length || strA[starPtr] == '*')
                  break;
 
            // Try to find match for this segment somewhere in strB
            for (;; indexB++)
            {
               if (starPtr - indexA > strB.Length - indexB)
                  return false;  // Mismatch if not enough characters left in strB

               for (int i = indexA, j = indexB; i < starPtr; i++, j++)
                  if (strA[i] != strB[j] && strA[i] != '?')
                     goto tryStringBAgain;
               
               goto findNextSegment;  // Match found for this segment in strB 

            tryStringBAgain:
               continue;
            }
 
            // Was that last segment? Return if so, loop if not.
         findNextSegment:
            indexB += starPtr - indexA;  // Point past matching portion of strB
            if (starPtr == strA.Length)
               return indexB == strB.Length;  // Return if that was last segment
         }
      }
 
   }
 
And here are my timing results (which I'm not totally sure of, I'm not used to timing code):
 
My original version:  243 nanoseconds    17 lines of code
Erwin's version:       76 nanoseconds    42 lines of code
My second version:    111 nanoseconds    16 lines of code
My third version:      56 nanoseconds    52 lines of code
 
I'd appreciate it if someone would check this out and let me know if they find any bugs or anything.
General: Re: Yet another version - 25% faster, I think (aleks1k, 21-Sep-11 2:47)
I found a small bug: when comparing "*a" and "babbba", the function returns false.
Question: I used this function but how can I catch variables from the * ??? (moh.hijjawi, 20-Oct-09 1:55)
Dear Jack,
Dear all,

I used this function to compare two strings: the first is the pattern (* KK *) and the second is the text (TT KK ZZ), and the function reports a match. That's brilliant, but my question is how I can modify the function to capture the characters matched by each * and save them in variables. For example:

X = TT
Y = ZZ

to deal with them later on in my system.

I have tried many times, but it is not working well so far.

So if anyone has an idea how to do that, please let me know; it will be appreciated.
 
Best Regards.
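
One way this could be approached without regular expressions is to record, for every `*`, where in the text it started and how far it had to stretch. The sketch below is an assumption-laden illustration, not the article author's code: `wild_span` and `wildcmp_capture` are hypothetical names, the loop follows the variant that returns 0 when no `*` has been seen yet, and each `*` ends up with the shortest stretch that this left-to-right algorithm settles on.

```c
#include <stddef.h>
#include <string.h>

typedef struct {
  const char *start;  /* first text character covered by this '*' */
  size_t len;         /* number of characters it covers           */
} wild_span;

/* Returns 1 on match, 0 otherwise. On a match, spans[0..*count-1] describe
   what each '*' matched, in pattern order. At most max_spans are stored.
   The span contents are only meaningful when the function returns 1. */
int wildcmp_capture(const char *wild, const char *string,
                    wild_span *spans, size_t max_spans, size_t *count) {
  const char *cp = NULL, *mp = NULL;
  size_t n = 0;  /* number of '*' seen so far */

  while (*string) {
    if (*wild == '*') {
      /* A new '*' starts here with an empty match. */
      if (n < max_spans) {
        spans[n].start = string;
        spans[n].len = 0;
      }
      n++;
      if (!*++wild) {
        /* Trailing '*': it covers the rest of the text. */
        if (n <= max_spans) {
          spans[n - 1].len = strlen(string);
        }
        *count = (n < max_spans) ? n : max_spans;
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else if (mp) {
      /* Mismatch: the most recent '*' swallows one more character. */
      wild = mp;
      string = cp++;
      if (n <= max_spans) {
        spans[n - 1].len = (size_t)(string - spans[n - 1].start);
      }
    } else {
      return 0;  /* mismatch before any '*' */
    }
  }

  /* Any '*' left in the pattern matches the empty string at the end. */
  while (*wild == '*') {
    if (n < max_spans) {
      spans[n].start = string;
      spans[n].len = 0;
    }
    n++;
    wild++;
  }
  *count = (n < max_spans) ? n : max_spans;
  return !*wild;
}
```

For the pattern `* KK *` and the text `TT KK ZZ`, the first span covers `TT` and the second covers `ZZ`. The spans point into the original text and are not null-terminated, so copy `len` characters if separate strings are needed.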
Answer: Re: I used this function but how can I catch variables from the * ??? (RenniePet, 1-Apr-10 11:27)
It would be easiest if you use regular expressions instead of this function.
http://msdn.microsoft.com/en-us/library/system.text.regularexpressions.matchcollection.aspx
Question: any updates? (alhambra-eidos, 2-Jul-09 5:12)
Is there a C# version of this code?
 
AE

General: Improved matching with end-of-text (Anders Heie, 11-May-09 15:20)
Great code, but when trying this I realized that the following pattern is a match:
 
Search: ????????
Text to search: ABC
 
The problem is that the pattern can be LONGER than the text searched, in which case it should return a not found, but instead returns found.
 
Also, this example succeeds:
 
Search: y*n
Text to search: yessir
 
But of course should fail, since I'm looking for a text that ends with n
 
So I re-wrote your program to this, to correctly handle this situation.
 
bool StrWildCmp(char* wildstring, char *matchstring){
 
	
	char stopstring[1];
	*stopstring = 0;
 
	while(*matchstring) {
		if (*wildstring == '*') {
		  if (!*++wildstring) {
			return true;
		  } else {
			  *stopstring = *wildstring;
		  }
		}
 
		if(*stopstring) {
			if(*stopstring == *matchstring ) {
				wildstring++;
				matchstring++;
				*stopstring = 0;
			} else {
				matchstring++;
			}
		} else if((*wildstring == *matchstring) || (*wildstring == '?')) {
				wildstring++;
				matchstring++;
		} else {
			return false;
		}
 
		if(!*matchstring && *wildstring && *wildstring != '*') {
			// matchstring too short
			return false;
		}
	}
 
  return true;
}
 
Thanks again for the inspiration.
General: Re: Improved matching with end-of-text: some cases don't work properly! (roadrunner31412-Aug-09 3:35)
some cases don't work properly:
 
wildstring = "a*bc"
matchstring = "abbc"
should be true, but it returns false
 
wildstring = "a*b"
matchstring = "a"
should be false, but it returns true
 
wildstring = "a*?b"
matchstring = "axb"
should be true, but it returns false
 
wildstring = "a**b"
matchstring = "axb"
should be true, but it returns false (ok, the two ** aren't useful, but they should work)
 
I solved the last 3 bugs, but the first one is a bit tricky...
bool StrWildCmp(char* wildstring, char *matchstring){
   char stopstring[1];
   *stopstring = '\0';
 
   while(*matchstring != '\0')
   {
      if (*wildstring == '*') 
      {
         do
         {         
            wildstring++;            
         } while (*wildstring == '*');  // if a dork entered two or more * in a row 
                                        // ignore them and go ahead
         
         if (*wildstring == '\0')   // if * was the last char, the strings are equal
         {
            return TRUE;
         }
         else
         {
            *stopstring = *wildstring; // the next char to check after the *
         }
      }
 
      if(*stopstring != '\0')
      {
         if((*stopstring == *matchstring) || (*stopstring == '?') ) 
         {
            wildstring++;
            *stopstring = '\0';
         }
         matchstring++;
      }
      else
         if((*wildstring == *matchstring) || (*wildstring == '?'))
         {
            wildstring++;
            matchstring++;
         }
         else
         {
            return FALSE;
         }
 
      if( (*matchstring == '\0') && (*wildstring != '\0') )
      {
         // matchstring seems to be too short. Check if wildstring has any more chars except '*'
         while (*wildstring == '*') // ignore following '*'
            wildstring++;
         
         if (*wildstring == '\0') // if wildstring ended after '*', strings are equal
            return TRUE;
         else
            return FALSE;
      }
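   }

   // Text fully consumed: it is a match only if nothing but '*' is left in the pattern
   while (*wildstring == '*')
      wildstring++;
   return (*wildstring == '\0');
}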

Question: PathMatchSpec instead? (kintz, 25-Mar-09 8:55)
If you are able to use Windows-specific code, you can use PathMatchSpec:

http://msdn.microsoft.com/en-us/library/bb773727(VS.85).aspx
Answer: Re: PathMatchSpec instead? (MandatoryDefault, 31-Aug-09 10:39)
I recommend against PathMatchSpec(). I used that function in my own code and it just bit me. Its wildcard behavior is broken for all but the simplest cases. For example, these two commands incorrectly return false:
 
::PathMatchSpec("C:\\Windows", "C:\\Windows.*");
 
::PathMatchSpec("C:\\Windows", "C:\\Windows.");
Question: wchar_t version? (rmorales8729-Nov-08 20:16)
Has anyone tried converting this to use wchar_t* (essentially Unicode) instead of char*?
Answer: Re: wchar_t version? (razvar, 31-Mar-11 21:49)
This is great and got my 5 because it is simple, fast and useful!
 
Here is the wchar_t version:
 
int wildcmp(const wchar_t *wild, const wchar_t *string)
  {
  const wchar_t *cp = NULL, *mp = NULL;
 
  while ((*string) && (*wild != L'*')) {
    if ((towlower(*wild) != towlower(*string)) && (*wild != L'?')) {
      return 0;
    }
    wild++;
    string++;
  }
 
  while (*string) {
    if (*wild == L'*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    } else if ((towlower(*wild) == towlower(*string)) || (*wild == L'?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }
 
  while (*wild == L'*') {
    wild++;
  }
  return !*wild;
}
 
Example:
 
if (wildcmp(L"*bl?h.*", L"asblah.plm")) {
  //we have a match!
   MessageBox(0,"we have a match!","wildcmp wide",MB_TOPMOST);
} else {
  //no match =(
      MessageBox(0,"no match!","wildcmp wide",MB_TOPMOST);
}

General: wildcmp in XBLite (CodeGibbon, 27-Nov-08 13:56)
This is the version of the wildcmp function in XBLite programming language:
 
FUNCTION SBYTE wildcmp( wildcard$, search$)
  ' wildcmp(const char *wild, const char *string)
  ' Written by Jack Handy - jakkhandy@hotmail.com
 
  ULONG cp
  ULONG mp
  
  STRING s_txt$
  ULONG  sp 
  
  STRING w_txt$
  ULONG  wp
 
  IFZ search$   THEN RETURN $$FALSE  
  IFZ wildcard$ THEN RETURN $$FALSE
  
  w_txt$ = wildcard$ + "\0\0"   ' Just to be sure
  s_txt$ = search$   + "\0\0"
  
  DO WHILE (s_txt${sp}) && (w_txt${wp} != '*') 
    IF (w_txt${wp} != s_txt${sp} )  && (w_txt${wp} != '?') THEN RETURN $$FALSE
    
    INC wp
    INC sp
  LOOP
    
  DO WHILE (s_txt${sp})
    IF ( w_txt${wp} == '*' ) THEN    
      INC wp
      IF !(w_txt${wp}) THEN RETURN $$TRUE      
      
      mp = wp     
      cp = sp + 1 
    ELSE 
      IF (w_txt${wp} == s_txt${sp} )  || (w_txt${wp} == '?') THEN    
        INC wp
        INC sp
      ELSE
        wp = mp
      
        sp = cp                   
        IF s_txt${sp} THEN INC cp 
       
      ENDIF
    ENDIF  
  LOOP
    
  DO WHILE (w_txt${wp} == '*' )    
    INC wp
  LOOP
  
  RETURN !w_txt${wp} 
 
END FUNCTION
 

GeneralWildcard string compare in C#memberhaiquang10-Nov-08 22:15 
I converted wildcmp to C#. It makes wildcard string matching very easy, thanks so much.
 

// ValueAt and ValueIsNullOrEmpty are helper methods that are not shown in this post.
bool WildCompare(string strWild, string strEmail)
{
    int cp = 0;
    int mp = 0;

    int wildIndex = 0;
    int emailIndex = 0;

    while ((!ValueIsNullOrEmpty(strEmail, emailIndex)) && (ValueAt(strWild, wildIndex) != '*'))
    {
        if ((ValueAt(strWild, wildIndex) != ValueAt(strEmail, emailIndex)) && (ValueAt(strWild, wildIndex) != '?'))
        {
            return false;
        }
        wildIndex++;
        emailIndex++;
    }

    while (!ValueIsNullOrEmpty(strEmail, emailIndex))
    {
        if (ValueAt(strWild, wildIndex) == '*')
        {
            wildIndex++;
            if (ValueIsNullOrEmpty(strWild, wildIndex))
            {
                return true;
            }
            mp = wildIndex;
            cp = emailIndex + 1;
        }
        else if ((ValueAt(strWild, wildIndex).Equals(ValueAt(strEmail, emailIndex)) || (ValueAt(strWild, wildIndex) == '?')))
        {
            wildIndex++;
            emailIndex++;
        }
        else
        {
            wildIndex = mp;
            emailIndex = cp++;
        }
    }

    while (ValueAt(strWild, wildIndex) == '*')
    {
        wildIndex++;
    }
    return ValueIsNullOrEmpty(strWild, wildIndex);
}
 

:(( :^) :( :~
GeneralRe: Wildcard string compare in C#memberhaiquang3-Aug-09 22:22 
Is the conversion correct?
 
Take SharePoint to new height

GeneralC# Direct Portmemberhempels23-Sep-08 15:10 
Well, as direct as I could come up with anyway. Makes use of unsafe to enable pointer arithmetic. Unfortunately, because fixed is required to prevent the GC from moving the pointers, I had to change it to use increment indexers instead of directly manipulating the pointers. Alternatively, you could use stackalloc to instantiate two native char[]'s and copy the values, but that seems contrary to this function's low-memory footprint, high performance goals.
 
Has been tested against every test case presented in the comments section as well as some additional cases I threw in.
 
public unsafe static bool GlobCompare( string glob, string path )
{
      fixed ( char* pGlob = glob, pPath = path )
      {
            int pGlobInc = 0;
            int pPathInc = 0;
 
            int mp = 0;
            int cp = 0;
 
            while ( ( *( pPath + pPathInc ) != 0 ) && ( *( pGlob + pGlobInc ) != '*' ) )
            {
                  if ( ( *( pGlob + pGlobInc ) != *( pPath + pPathInc ) ) && ( *( pGlob + pGlobInc ) != '?' ) )
                  {
                        return false;
                  }
                  pGlobInc++;
                  pPathInc++;
            }
 
            while ( *( pPath + pPathInc ) != 0 )
            {
                  if ( *( pGlob + pGlobInc ) == '*' )
                  {
                        if ( 0 == *( pGlob + ++pGlobInc ) )
                        {
                              return true;
                        }
                        mp = pGlobInc;
                        cp = pPathInc + 1;
                  }
                  else if ( ( *( pGlob + pGlobInc ) == *( pPath + pPathInc ) ) || ( *( pGlob + pGlobInc ) == '?' ) )
                  {
                        pGlobInc++;
                        pPathInc++;
                  }
                  else
                  {
                        pGlobInc = mp;
                        pPathInc = cp++;
                  }
            }
 
            while ( *( pGlob + pGlobInc ) == '*' )
            {
                  pGlobInc++;
            }
            return ( 0 == *( pGlob + pGlobInc ) );
      }
}
General...and yet another C# port [modified]memberDVF27-Aug-10 16:59 
public static bool WildcardMatch(string strCompare, string strWild, bool bIgnoreCase)
{
    if (bIgnoreCase)
    {
        strWild = strWild.ToUpper();
        strCompare = strCompare.ToUpper();
    }
 
    // Lengths of strings
    int iWildLen = strWild.Length;
    int iCompareLen = strCompare.Length;
 
    // Used to save position when '*' found in strWild
    // Initialized to invalid values
    int iWildMatched = iWildLen;
    int iCompareBase = iCompareLen;
 
    int iWild = 0;
    int iCompare = 0;
 
    // Match until first wildcard '*'
    while (iCompare < iCompareLen && (iWild >= iWildLen || strWild[iWild] != '*'))
    {
        if (iWild >= iWildLen || (strWild[iWild] != strCompare[iCompare] && strWild[iWild] != '?'))
            return false;
 
        iWild++;
        iCompare++;
    }
 
    // Process wildcard
    while (iCompare < iCompareLen)
    {
        if (iWild < iWildLen)
        {
            if (strWild[iWild] == '*')
            {
                iWild++;
 
                if (iWild == iWildLen)
                    return true;
 
                iWildMatched = iWild;
                iCompareBase = iCompare + 1;
 
                continue;
            }
 
            if (strWild[iWild] == strCompare[iCompare] || strWild[iWild] == '?')
            {
                iWild++;
                iCompare++;
 
                continue;
            }
        }
 
        iWild = iWildMatched;
        iCompare = iCompareBase++;
    }
 
    while (iWild < iWildLen && strWild[iWild] == '*')
        iWild++;
 
    if (iWild < iWildLen)
        return false;
 
    return true;
}

modified on Saturday, August 28, 2010 10:10 PM

GeneralRe: ...and yet another C# portmemberVUnreal21-Sep-10 11:22 
Works quite well.
GeneralUsing in Artistic Stylememberjimp023-Apr-08 4:43 
I am using this in Artistic Style, a popular multi-platform code formatter available at SourceForge.
 
http://astyle.sourceforge.net/
 
Release 1.22 added directory recursion to the project. Wildcard processing was made internal to the program. Linux has a glob function but Windows doesn't. I just used this for both of them. It let me process both platforms in a similar manner.
 
A minor change was made for Windows to make the comparison case insensitive. Linux was left case sensitive.
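
A minimal sketch of how such a platform-dependent comparison could look. This is not Artistic Style's actual code: the names `same_char`, `wildcmp_ci`, `wildcmp_platform`, and the `ignore_case` parameter are assumptions made for illustration, built on the article's function.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: compare two characters, optionally ignoring case. */
static int same_char(char a, char b, int ignore_case) {
  if (ignore_case) {
    return tolower((unsigned char)a) == tolower((unsigned char)b);
  }
  return a == b;
}

/* The article's wildcmp with an extra (assumed) ignore_case flag. */
int wildcmp_ci(const char *wild, const char *string, int ignore_case) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if (!same_char(*wild, *string, ignore_case) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if (same_char(*wild, *string, ignore_case) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Case-insensitive on Windows, case-sensitive everywhere else. */
int wildcmp_platform(const char *wild, const char *string) {
#ifdef _WIN32
  return wildcmp_ci(wild, string, 1);
#else
  return wildcmp_ci(wild, string, 0);
#endif
}

/* Small self-check showing the effect of the flag. */
void check_case_handling(void) {
  assert(wildcmp_ci("*.CPP", "main.cpp", 1) == 1); /* case ignored */
  assert(wildcmp_ci("*.CPP", "main.cpp", 0) == 0); /* case respected */
}
```

Keeping the flag as a parameter, with the `#ifdef` in one thin wrapper, means the matching loop itself stays identical on both platforms.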
 
Thanks for making it available. Using this was a lot easier than writing my own. I doubt that mine would have been this sophisticated.
GeneralGeez...memberlarryfr5-Mar-08 9:39 
:doh: Boy, do I feel stupid. I worked on an algorithm like this for days and never got it quite right. Then I see wonderfully simple work like this, and it reminds me that sometimes we are all guilty of 'over-engineering'...
 
Thanks Mr. Handy!
QuestionConvert to a replace?memberwilliaps20-Mar-07 8:31 
How can this code be converted to do a replace? I need to provide a find/replace dialog in an application and I don't want to jump through the hoops of the Boost library. Can anyone help?
 
Patrick
GeneralC# RexExp versionmemberspinsane4-Nov-06 6:30 
Here's a RegExp version (it can easily be ported to C++).
Pros: more readable; relies on the proven RegExp engine.
Cons: may be slower; only '.' is escaped, so a pattern containing other RegExp metacharacters (for example '+' or '(') may give unexpected results.
 

public static bool Match(string eval, string pattern, bool caseSensitive)
{
    bool match = false;

    // Make input parameters lower-case if case is not an issue
    if (!caseSensitive)
    {
        eval = eval.ToLower();
        pattern = pattern.ToLower();
    }

    // Escape regexp special character in pattern
    pattern = pattern.Replace(".", @"\.");

    // Replace valid wildcards with regexp equivalents
    pattern = pattern.Replace('?', '.').Replace("*", ".*");

    // Add boundaries to pattern
    pattern = @"\A" + pattern + @"\z";

    // Search for a match
    try
    {
        match = Regex.IsMatch(eval, pattern);
    }
    catch /* (ArgumentException ex) */
    {
        // Syntax error in the regular expression
    }

    // Return result
    return match;
}

GeneralKudosmemberquantumred14-Oct-06 4:37 
This is tight and clever. Thanks for sharing it.
GeneralRe: Kudosmembermilkplus24-Feb-10 11:19 
I agree. This is excellent.
Generalwildcmp("*<*>", "<field1><field2>") not working [modified]memberDaniel B.6-Sep-06 13:14 
Hi,
 
wildcmp("*<*>", "<field1><field2>") returns 1, while I think it should return 0 (I may be wrong, so please tell me).
 
If someone knows how to fix it, I would appreciate it.
 
Regards

GeneralRe: wildcmp("*<*>", "<field1><field2>") not workingmemberradboudp16-Feb-07 0:35 
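Sure it matches. For example, the first '*' can match the empty string, and '<*>' then matches '<field1><field2>' (with the inner '*' matching 'field1><field2').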
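
This can be checked with a short C sketch, assuming the unmodified wildcmp from the article (reproduced here so the snippet compiles on its own). The helper name `check_angle_brackets` and the second, non-matching input are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* wildcmp as published in the article. */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

void check_angle_brackets(void) {
  /* '*' may match nothing at all, so "<*>" can span the whole input. */
  assert(wildcmp("*<*>", "<field1><field2>") == 1);
  /* Without a closing '>' at the end there is nothing for the final '>' to match. */
  assert(wildcmp("*<*>", "<field1><field2") == 0);
}
```

If the intent is to accept exactly one bracketed field, '*' is too permissive for that on its own, since it also matches '<' and '>'.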
 
Regards,
Radboud
Generalreturn value typememberwdx048-Jan-06 15:49 
I think it would be better for the function to return a bool value. After all, many string comparison functions (strcmp, for example) return 0 when the strings are equal, so an int result is easy to misread.
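
A minimal sketch of such a wrapper, assuming a C99 compiler with `<stdbool.h>`. The name `wild_match` is hypothetical; the article's wildcmp is reproduced unchanged so the snippet is self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* wildcmp as published in the article: nonzero means "match". */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Hypothetical wrapper that makes the "true means match" convention explicit. */
bool wild_match(const char *wild, const char *string) {
  return wildcmp(wild, string) != 0;
}

void check_return_convention(void) {
  assert(wild_match("bl?h.*", "blah.jpg"));   /* true: the pattern matches */
  assert(!wild_match("bl?h.*", "blah"));      /* false: no ".ext" part */
  assert(strcmp("blah", "blah") == 0);        /* strcmp: 0 means equal */
}
```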
General*? case matchmembertalimu3-Nov-05 23:42 
If wild = "*?.abc" and str = "abc.abc",
wildcmp(wild, str) does not work,
 
but if wild = "?*.abc" and str = "abc.abc",
wildcmp(wild, str) does work.
 
Does anyone have any idea about this case?
GeneralRe: *? case matchmemberkuhnm15-Sep-06 2:18 
I'm having similar problems with "*Hallo 200? ueberalles*.ddd".
It doesn't work. I think that once the first * is finished, it does not expect another wildcard to follow in the pattern.
GeneralRe: *? case matchmemberkuhnm18-Sep-06 4:48 
Ignore my last message.
As usual, the problem sits in front of the screen.
(I mixed a project built with multibyte chars with this code, which handles only plain chars. And of course I used an umlaut instead of 'ue' in my tests, so no wonder it crashed after the '?'.)
I'm very sorry!
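
For reference, a small check of the two patterns from the start of this thread, assuming the unmodified plain-char wildcmp from the article (reproduced so the snippet compiles on its own). The helper name `check_star_question` is made up. With this version, both orderings of '*' and '?' appear to match.

```c
#include <assert.h>
#include <stddef.h>

/* wildcmp as published in the article. */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

void check_star_question(void) {
  /* '*' followed by '?': the '?' is retried at each position after a mismatch. */
  assert(wildcmp("*?.abc", "abc.abc") == 1);
  /* '?' followed by '*': the '?' is consumed by the first loop. */
  assert(wildcmp("?*.abc", "abc.abc") == 1);
}
```

If a build gives a different result for the first pattern, the character type of the inputs (plain char versus multibyte or wide) is worth checking first.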
GeneralGets my 5memberFranc Morales18-Oct-05 17:05 
Simple, fast, useful, AND fun to figure out.
 
Well done.
Generalmp and cpmembertwopieman15-Mar-05 11:59 
I got the overall flow of the program, but I didn't completely get the logic of the second loop. I understand that the second loop checks whether there is nothing after the *; if so, it is a match, but if there is something, it stores the positions in the two pointers and then goes on.
Also, the final else goes like this:
else
{
wild = mp;
string = cp++;
}
I'm sorry, but I'm not getting the logic entirely.
Can someone please explain?
GeneralRe: mp and cpmemberradboudp16-Feb-07 1:14 
Suppose you are matching something like the following:
 
"*.abc" to "ab.de.abc"
 
After the asterisk, the second loop looks for the first character in the string that equals the next pattern character. At first it matches "*" against "ab"; mp = ".abc" during this. Now wild = ".abc" and string = ".de.abc". These are obviously not a full match, but on the next pass the first characters do match (both are '.'), so wild becomes "abc" and string becomes "de.abc". On the pass after that there is no match, and it falls through to the else. Here it resets wild to the last mp (mask pattern?) and string to the last cp (character pattern?) WITHOUT THE FIRST CHARACTER. (It actually advances cp by one position.)
 
Why does it do this? After matching the * against part of the string and finding a possible position at which to match the remainder of the pattern, it continued comparing characters from both strings. This failed. Since there was a * right before the position of mp, it is still allowed to add characters to the part matched by that *. Basically, it goes back to that position and decides that the character that occurs in both strings is not the next character in the pattern but part of the '*' wildcard.
 
In the end it has matched '*' with 'ab.de'.
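
The backtracking described above can be watched with an instrumented copy of the function. This is only a sketch: `wildcmp_trace`, its `backtracks` counter, and the printed message format are additions made for illustration, while the matching logic is the article's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* The article's wildcmp, plus an assumed counter and a trace line for every
   fallback to the saved '*' position. */
int wildcmp_trace(const char *wild, const char *string, int *backtracks) {
  const char *cp = NULL, *mp = NULL;
  *backtracks = 0;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;       /* pattern position right after the '*' */
      cp = string + 1; /* where to resume if this attempt fails */
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      /* Mismatch: let the '*' absorb one more character and retry. */
      printf("backtrack: retry \"%s\" against \"%s\"\n", mp, cp);
      wild = mp;
      string = cp++;
      (*backtracks)++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

void show_trace(void) {
  int backtracks = 0;
  int matched = wildcmp_trace("*.abc", "ab.de.abc", &backtracks);
  assert(matched == 1);
  printf("matched=%d after %d backtracks\n", matched, backtracks);
}
```

Each trace line corresponds to one pass through the final else: wild is reset to mp, and string resumes one character further along than the previous attempt.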
GeneralOK, but ...memberSam Levy16-Feb-05 4:48 
what was changed?
QuestionWhy make 3 loop ?memberDarkYoda Mickael2-Feb-05 22:22 
Hello,
 
I think this post is very interesting because the code is very simple and works very well!
 
BUT!
 
I don't understand why you use 3 loops to do it.
 
I think I am not seeing all the cases, because to me it looks like only 2 loops do all the work.
 
I'm trying to understand the whole process so that I can add an optional char with the ^ escape sequence, for example: ^-* matches -12 or 12 ;)
 
Thanks
AnswerRe: Why make 3 loop ?memberJack Handy13-Feb-05 10:02 
DarkYoda Mickael wrote:
I don't understand why you use 3 loops to do it.
 
I think I am not seeing all the cases, because to me it looks like only 2 loops do all the work.

 
The third loop:
 
while (*wild == '*') {
    wild++;
}

 
is there to take care of trailing *'s. Since * means 0 or more chars, "test*" should match "test" just fine. That loop takes care of this case.
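
A small sketch of that case. The `skip_trailing_stars` parameter is an assumption added purely so the behaviour with and without the third loop can be compared; `wildcmp_opt` is otherwise the article's function.

```c
#include <assert.h>
#include <stddef.h>

/* The article's wildcmp, with the third loop made optional (assumed flag). */
int wildcmp_opt(const char *wild, const char *string, int skip_trailing_stars) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  /* The third loop: the string is used up, so any '*' left in the pattern
     can only match zero characters and may be skipped. */
  if (skip_trailing_stars) {
    while (*wild == '*') {
      wild++;
    }
  }
  return !*wild;
}

void check_trailing_stars(void) {
  assert(wildcmp_opt("test*", "test", 1) == 1);   /* with the third loop */
  assert(wildcmp_opt("test***", "test", 1) == 1); /* several trailing stars */
  assert(wildcmp_opt("test*", "test", 0) == 0);   /* without it: wrongly rejected */
}
```

In the "test*" case the first loop consumes the whole string, the second loop never runs, and only the third loop gets a chance to step over the remaining '*'.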
 
-Jack
 

There are 10 types of people in this world, those that understand binary and those who don't.


GeneralC# versionmemberSancy26-Oct-04 6:23 
Hi, I have a stupid question: could someone give me the C# version? :)
thanks in advance
GeneralRe: C# versionsussPsyk6621-Dec-04 3:39 
	private bool wildcmp(string wild, string str) 
	{
		int cp=0, mp=0;
	
		int i=0;
		int j=0;
		while ((i<str.Length) && (j>=wild.Length || wild[j] != '*')) 
		{
			// Guard j: the pattern may be shorter than the string
			if (j>=wild.Length || ((wild[j] != str[i]) && (wild[j] != '?'))) 
			{
				return false;
			}
			i++;
			j++;
		}
		
		while (i<str.Length) 
		{
			if (j<wild.Length && wild[j] == '*') 
			{
				// Pre-increment, so a trailing '*' returns immediately
				if ((++j)>=wild.Length) 
				{
					return true;
				}
				mp = j;
				cp = i+1;
			} 
			else if (j<wild.Length && (wild[j] == str[i] || wild[j] == '?')) 
			{
				j++;
				i++;
			} 
			else 
			{
				j = mp;
				i = cp++;
			}
		}
		
		while (j<wild.Length && wild[j] == '*') 
		{
			j++;
		}
		return j>=wild.Length;
	}
 
This C# version works. I'm sure there are loads of improvements to be made though. Don't flame me for such bad code, I only started C# yesterday;)

Web04 | 2.6.130617.1 | Last Updated 15 Feb 2005
Article Copyright 2001 by Jack Handy
Everything else Copyright © CodeProject, 1999-2013