
Wildcard string compare (globbing)

By Jack Handy, 15 Feb 2005
 

Usage:

This is a fast, lightweight, and simple pattern matching function.

if (wildcmp("bl?h.*", "blah.jpg")) {
  //we have a match!
} else {
  //no match =(
}

Function:

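#include <stddef.h>  // for NULL

int wildcmp(const char *wild, const char *string) {
  // Written by Jack Handy - jakkhandy@hotmail.com
  // Returns 1 if string matches the pattern wild, 0 otherwise.
  // '?' matches any single character; '*' matches any run of characters, including none.
  const char *cp = NULL, *mp = NULL;

  // Match the part of the pattern before the first '*'; a mismatch here is final.
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;  // a trailing '*' matches the rest of string
      }
      mp = wild;      // pattern position just after the '*'
      cp = string+1;  // text position to resume from if this attempt fails
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      // Mismatch: rewind the pattern and let the '*' absorb one more character.
      wild = mp;
      string = cp++;
    }
  }

  // The text is used up; only trailing '*'s may remain in the pattern.
  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}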

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Jack Handy
Web Developer
United States
No Biography provided


Comments and Discussions

 

Question: I used this function, but how can I catch variables from the *? (moh.hijjawi, 20-Oct-09 1:55)
Dear Jack,
Dear all,

I used this function to compare two strings: the pattern (* KK *) and the text (TT KK ZZ). The function reports a match, which is brilliant. My question is how I can edit the function so that it captures the characters matched by each * and saves them in variables. For example:

X = TT
Y = ZZ

so that I can deal with them later on in my system.

I have tried many times, but it is not working well so far.

If anyone has an idea how to do this, please let me know; it will be appreciated.

Best Regards.
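
One possible approach to the question above, offered as a sketch rather than a tested extension of the article's function: a small recursive matcher that records, for each *, where its match starts and how long it is. The names star_span, MAX_STARS, capture_from and wildcmp_capture are invented for this illustration. It tries the shortest match for each * first, and it can be slow on long inputs with many stars.

```c
#include <stddef.h>

#define MAX_STARS 8  /* hypothetical limit chosen for this sketch */

typedef struct {
    const char *start;  /* first character matched by the '*' */
    size_t len;         /* number of characters it matched    */
} star_span;

/* Match wild against s, recording the span of each '*' from index n on. */
static int capture_from(const char *wild, const char *s,
                        star_span *spans, size_t n, size_t *count) {
    for (; *wild; wild++, s++) {
        if (*wild == '*') {
            const char *end;
            if (n >= MAX_STARS) return 0;  /* too many stars: report no match */
            /* Try the shortest span first, then grow it one character at a time. */
            for (end = s; ; end++) {
                spans[n].start = s;
                spans[n].len = (size_t)(end - s);
                if (capture_from(wild + 1, end, spans, n + 1, count)) return 1;
                if (!*end) return 0;       /* the '*' cannot grow any further */
            }
        }
        if (!*s || (*wild != *s && *wild != '?')) return 0;
    }
    if (*s) return 0;  /* pattern used up but text remains */
    *count = n;
    return 1;
}

/* spans must have room for MAX_STARS entries; *count receives the number used. */
int wildcmp_capture(const char *wild, const char *string,
                    star_span *spans, size_t *count) {
    *count = 0;
    return capture_from(wild, string, spans, 0, count);
}
```

With the pattern "* KK *" and the text "TT KK ZZ", the first span should cover "TT" and the second "ZZ". The spans point into the original text, so copy them out if the text buffer is going to change.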
Answer: Re: I used this function, but how can I catch variables from the *? (RenniePet, 1-Apr-10 11:27)
Question: any updates? (alhambra-eidos, 2-Jul-09 5:12)
Is there a version of this code in C#?
 
AE

General: Improved matching with end-of-text (Anders Heie, 11-May-09 15:20)
Great code, but when trying this I found that the following pattern is reported as a match:

Search: ????????
Text to search: ABC

The problem is that the pattern can be LONGER than the text searched, in which case the function should report no match, but instead it reports a match.

Also, this example succeeds:

Search: y*n
Text to search: yessir

But of course it should fail, since I'm looking for text that ends with n.

So I rewrote your function as follows to handle these situations correctly.
 
bool StrWildCmp(char* wildstring, char *matchstring){

	// After a '*', holds the next pattern character to look for (0 = not searching)
	char stopstring[1];
	*stopstring = 0;

	while(*matchstring) {
		if (*wildstring == '*') {
			if (!*++wildstring) {
				// A trailing '*' matches the rest of the text
				return true;
			} else {
				*stopstring = *wildstring;
			}
		}

		if(*stopstring) {
			// Skip text until the stop character appears; there is no backtracking after that
			if(*stopstring == *matchstring ) {
				wildstring++;
				matchstring++;
				*stopstring = 0;
			} else {
				matchstring++;
			}
		} else if((*wildstring == *matchstring) || (*wildstring == '?')) {
			wildstring++;
			matchstring++;
		} else {
			return false;
		}

		if(!*matchstring && *wildstring && *wildstring != '*') {
			// matchstring too short
			return false;
		}
	}

	return true;
}
 
Thanks again for the inspiration.
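
For reference, tracing the wildcmp listing as printed in the article by hand suggests that both cases reported in this post already return 0 with it, so the poster may have tested a different revision. The sketch below is one way to check reports like this. The glob_case table and the count_failures helper are names made up for this illustration, and the expected values assume ordinary glob semantics.

```c
#include <stddef.h>

/* The article's wildcmp, repeated here so the check is self-contained. */
int wildcmp(const char *wild, const char *string) {
    const char *cp = NULL, *mp = NULL;

    while (*string && *wild != '*') {
        if (*wild != *string && *wild != '?') return 0;
        wild++;
        string++;
    }
    while (*string) {
        if (*wild == '*') {
            if (!*++wild) return 1;
            mp = wild;
            cp = string + 1;
        } else if (*wild == *string || *wild == '?') {
            wild++;
            string++;
        } else {
            wild = mp;
            string = cp++;
        }
    }
    while (*wild == '*') wild++;
    return !*wild;
}

/* Hypothetical test table: pattern, text, expected result. */
struct glob_case {
    const char *wild;
    const char *text;
    int expected;
};

static const struct glob_case reported_cases[] = {
    { "????????", "ABC",    0 },  /* pattern longer than the text */
    { "y*n",      "yessir", 0 },  /* text does not end with 'n'   */
    { "y*r",      "yessir", 1 },
    { "*ab",      "aab",    1 },  /* only matches after a retry   */
};

/* Returns how many cases disagree with the expected result. */
int count_failures(void) {
    size_t i, n = sizeof reported_cases / sizeof reported_cases[0];
    int failures = 0;
    for (i = 0; i < n; i++) {
        const struct glob_case *c = &reported_cases[i];
        if (wildcmp(c->wild, c->text) != c->expected) failures++;
    }
    return failures;
}
```

The last case is worth keeping in any such table: "*ab" against "aab" only matches if the matcher can go back and let the * absorb one more character. A hand trace of the StrWildCmp rewrite in this post appears to return false for it, since it stops searching once the first 'a' is found.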
General: Re: Improved matching with end-of-text: some cases don't work properly! (roadrunner31412-Aug-09 3:35)
Question: PathMatchSpec instead? (kintz, 25-Mar-09 8:55)
If you are able to use Windows APIs, you can use PathMatchSpec:

http://msdn.microsoft.com/en-us/library/bb773727(VS.85).aspx
Answer: Re: PathMatchSpec instead? (MandatoryDefault, 31-Aug-09 10:39)
Question: wchar_t version? (rmorales8729-Nov-08 20:16)
Has anyone tried converting this to use wchar_t* (essentially Unicode) instead of char*?
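
A wide-character variant can be sketched by changing the types and the character literals and leaving the control flow alone. This is an untested-in-production sketch, and the name wildcmp_w is an assumption, not something from the article.

```c
#include <stddef.h>
#include <wchar.h>

/* Same logic as the article's wildcmp, operating on wchar_t strings. */
int wildcmp_w(const wchar_t *wild, const wchar_t *string) {
    const wchar_t *cp = NULL, *mp = NULL;

    /* Part of the pattern before the first '*'. */
    while (*string && *wild != L'*') {
        if (*wild != *string && *wild != L'?') return 0;
        wild++;
        string++;
    }

    while (*string) {
        if (*wild == L'*') {
            if (!*++wild) return 1;  /* trailing '*' matches the rest */
            mp = wild;
            cp = string + 1;
        } else if (*wild == *string || *wild == L'?') {
            wild++;
            string++;
        } else {
            wild = mp;      /* retry with the '*' one character longer */
            string = cp++;
        }
    }

    while (*wild == L'*') wild++;
    return !*wild;
}
```

One caveat: where wchar_t is 16 bits wide (as on Windows), a ? matches one UTF-16 code unit, so a character outside the Basic Multilingual Plane would need two ?s.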
Answer: Re: wchar_t version? (razvar, 31-Mar-11 21:49)
General: wildcmp in XBLite (CodeGibbon, 27-Nov-08 13:56)
Here is a version of the wildcmp function in the XBLite programming language:
 
FUNCTION SBYTE wildcmp( wildcard$, search$)
  ' wildcmp(const char *wild, const char *string)
  ' Written by Jack Handy - jakkhandy@hotmail.com
 
  ULONG cp
  ULONG mp
  
  STRING s_txt$
  ULONG  sp 
  
  STRING w_txt$
  ULONG  wp
 
  IFZ search$   THEN RETURN $$FALSE  
  IFZ wildcard$ THEN RETURN $$FALSE
  
  w_txt$ = wildcard$ + "\0\0"   ' Just to be sure
  s_txt$ = search$   + "\0\0"
  
  DO WHILE (s_txt${sp}) && (w_txt${wp} != '*') 
    IF (w_txt${wp} != s_txt${sp} )  && (w_txt${wp} != '?') THEN RETURN $$FALSE
    
    INC wp
    INC sp
  LOOP
    
  DO WHILE (s_txt${sp})
    IF ( w_txt${wp} == '*' ) THEN    
      INC wp
      IF !(w_txt${wp}) THEN RETURN $$TRUE      
      
      mp = wp     
      cp = sp + 1 
    ELSE 
      IF (w_txt${wp} == s_txt${sp} )  || (w_txt${wp} == '?') THEN    
        INC wp
        INC sp
      ELSE
        wp = mp
      
        sp = cp                   
        IF s_txt${sp} THEN INC cp 
       
      ENDIF
    ENDIF  
  LOOP
    
  DO WHILE (w_txt${wp} == '*' )    
    INC wp
  LOOP
  
  RETURN !w_txt${wp} 
 
END FUNCTION
 

General: Wildcard string compare in C# (haiquang, 10-Nov-08 22:15)
I converted wildcmp to C#; it makes wildcard string matching very easy. Thanks so much.
 

// ValueAt and ValueIsNullOrEmpty are helper methods that were not included in this post.
bool WildCompare(string strWild, string strEmail)
{
    int cp = 0;
    int mp = 0;

    int wildIndex = 0;
    int emailIndex = 0;

    while ((!ValueIsNullOrEmpty(strEmail, emailIndex)) && (ValueAt(strWild, wildIndex) != '*'))
    {
        if ((ValueAt(strWild, wildIndex) != ValueAt(strEmail, emailIndex)) && (ValueAt(strWild, wildIndex) != '?'))
        {
            return false;
        }
        wildIndex++;
        emailIndex++;
    }

    while (!ValueIsNullOrEmpty(strEmail, emailIndex))
    {
        if (ValueAt(strWild, wildIndex) == '*')
        {
            wildIndex++;
            if (ValueIsNullOrEmpty(strWild, wildIndex))
            {
                return true;
            }
            mp = wildIndex;
            cp = emailIndex + 1;
        }
        else if ((ValueAt(strWild, wildIndex).Equals(ValueAt(strEmail, emailIndex)) || (ValueAt(strWild, wildIndex) == '?')))
        {
            wildIndex++;
            emailIndex++;
        }
        else
        {
            wildIndex = mp;
            emailIndex = cp++;
        }
    }

    while (ValueAt(strWild, wildIndex) == '*')
    {
        wildIndex++;
    }
    return ValueIsNullOrEmpty(strWild, wildIndex);
}
 

:(( :^) :( :~
General: Re: Wildcard string compare in C# (haiquang, 3-Aug-09 22:22)
General: C# Direct Port (hempels, 23-Sep-08 15:10)
Well, as direct as I could come up with anyway. Makes use of unsafe to enable pointer arithmetic. Unfortunately, because fixed is required to prevent the GC from moving the pointers, I had to change it to use increment indexers instead of directly manipulating the pointers. Alternatively, you could use stackalloc to instantiate two native char[]'s and copy the values, but that seems contrary to this function's low-memory footprint, high performance goals.
 
Has been tested against every test case presented in the comments section as well as some additional cases I threw in.
 
public unsafe static bool GlobCompare( string glob, string path )
{
      fixed ( char* pGlob = glob, pPath = path )
      {
            int pGlobInc = 0;
            int pPathInc = 0;
 
            int mp = 0;
            int cp = 0;
 
            while ( ( *( pPath + pPathInc ) != 0 ) && ( *( pGlob + pGlobInc ) != '*' ) )
            {
                  if ( ( *( pGlob + pGlobInc ) != *( pPath + pPathInc ) ) && ( *( pGlob + pGlobInc ) != '?' ) )
                  {
                        return false;
                  }
                  pGlobInc++;
                  pPathInc++;
            }
 
            while ( *( pPath + pPathInc ) != 0 )
            {
                  if ( *( pGlob + pGlobInc ) == '*' )
                  {
                        if ( 0 == *( pGlob + ++pGlobInc ) )
                        {
                              return true;
                        }
                        mp = pGlobInc;
                        cp = pPathInc + 1;
                  }
                  else if ( ( *( pGlob + pGlobInc ) == *( pPath + pPathInc ) ) || ( *( pGlob + pGlobInc ) == '?' ) )
                  {
                        pGlobInc++;
                        pPathInc++;
                  }
                  else
                  {
                        pGlobInc = mp;
                        pPathInc = cp++;
                  }
            }
 
            while ( *( pGlob + pGlobInc ) == '*' )
            {
                  pGlobInc++;
            }
            return ( 0 == *( pGlob + pGlobInc ) );
      }
}
General: ...and yet another C# port [modified] (DVF, 27-Aug-10 16:59)
public static bool WildcardMatch(string strCompare, string strWild, bool bIgnoreCase)
{
    if (bIgnoreCase)
    {
        strWild = strWild.ToUpper();
        strCompare = strCompare.ToUpper();
    }
 
    // Lengths of strings
    int iWildLen = strWild.Length;
    int iCompareLen = strCompare.Length;
 
    // Used to save position when '*' found in strWild
    // Initialized to invalid values
    int iWildMatched = iWildLen;
    int iCompareBase = iCompareLen;
 
    int iWild = 0;
    int iCompare = 0;
 
    // Match until first wildcard '*'
    while (iCompare < iCompareLen && (iWild >= iWildLen || strWild[iWild] != '*'))
    {
        if (iWild >= iWildLen || (strWild[iWild] != strCompare[iCompare] && strWild[iWild] != '?'))
            return false;
 
        iWild++;
        iCompare++;
    }
 
    // Process wildcard
    while (iCompare < iCompareLen)
    {
        if (iWild < iWildLen)
        {
            if (strWild[iWild] == '*')
            {
                iWild++;
 
                if (iWild == iWildLen)
                    return true;
 
                iWildMatched = iWild;
                iCompareBase = iCompare + 1;
 
                continue;
            }
 
            if (strWild[iWild] == strCompare[iCompare] || strWild[iWild] == '?')
            {
                iWild++;
                iCompare++;
 
                continue;
            }
        }
 
        iWild = iWildMatched;
        iCompare = iCompareBase++;
    }
 
    while (iWild < iWildLen && strWild[iWild] == '*')
        iWild++;
 
    if (iWild < iWildLen)
        return false;
 
    return true;
}

modified on Saturday, August 28, 2010 10:10 PM

General: Re: ...and yet another C# port (VUnreal, 21-Sep-10 11:22)
General: Using in Artistic Style (jimp023-Apr-08 4:43)
I am using this in Artistic Style, a popular multi-platform code formatter available at SourceForge.
 
http://astyle.sourceforge.net/
 
Release 1.22 added directory recursion to the project. Wildcard processing was made internal to the program. Linux has a glob function but Windows doesn't. I just used this for both of them. It let me process both platforms in a similar manner.
 
A minor change was made for Windows to make the comparison case insensitive. Linux was left case sensitive.
 
Thanks for making it available. Using this was a lot easier than writing my own. I doubt that mine would have been this sophisticated.
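
The post does not show the change that was made for Windows. One way a case-insensitive variant could look, as a sketch under that caveat: compare characters through tolower and keep everything else as in the article. The names same_char_nocase and wildcmp_nocase are assumptions made for this example, and the folding follows the current C locale, so it only covers single-byte characters.

```c
#include <ctype.h>
#include <stddef.h>

/* Compare two characters ignoring case (single-byte, current C locale). */
static int same_char_nocase(char a, char b) {
    return tolower((unsigned char)a) == tolower((unsigned char)b);
}

/* The article's wildcmp with case-insensitive character comparison. */
int wildcmp_nocase(const char *wild, const char *string) {
    const char *cp = NULL, *mp = NULL;

    while (*string && *wild != '*') {
        if (!same_char_nocase(*wild, *string) && *wild != '?') return 0;
        wild++;
        string++;
    }

    while (*string) {
        if (*wild == '*') {
            if (!*++wild) return 1;
            mp = wild;
            cp = string + 1;
        } else if (same_char_nocase(*wild, *string) || *wild == '?') {
            wild++;
            string++;
        } else {
            wild = mp;
            string = cp++;
        }
    }

    while (*wild == '*') wild++;
    return !*wild;
}
```

A caller could pick wildcmp_nocase or the original wildcmp per platform, which matches the behaviour described in the post (case-insensitive on Windows, case-sensitive on Linux).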
General: Geez... (larryfr, 5-Mar-08 9:39)
D'Oh! Boy do I feel stupid. I worked on an algorithm like this for days and never got it quite right. Then I see wonderfully simple work like this, and it reminds me that sometimes we are all guilty of 'over-engineering'...
 
Thanks Mr. Handy!
Question: Convert to a replace? (williaps, 20-Mar-07 8:31)
How can this code be converted to do a replace? I need to provide a find/replace dialog in an application and I don't want to jump through the hoops of the Boost library. Can anyone help?
 
Patrick
General: C# RegExp version (spinsane, 4-Nov-06 6:30)
Here's a RegExp version (it may be easily ported to C++).
Pros: more readable; relies on proven RegExp.
Cons: maybe slower; only '.' is escaped, so if the pattern contains other RegExp metacharacters the result may be unexpected.


using System.Text.RegularExpressions;

public static bool Match(string eval, string pattern, bool caseSensitive)
{
    bool match = false;

    // Make input parameters lower-case if case is not an issue
    if (!caseSensitive)
    {
        eval = eval.ToLower();
        pattern = pattern.ToLower();
    }

    // Escape regexp special character in pattern (only '.' is handled here)
    pattern = pattern.Replace(".", @"\.");

    // Replace valid wildcards with regexp equivalents
    pattern = pattern.Replace('?', '.').Replace("*", ".*");

    // Add boundaries to pattern
    pattern = @"\A" + pattern + @"\z";

    // Search for a match
    try
    {
        match = Regex.IsMatch(eval, pattern);
    }
    catch /* (ArgumentException ex) */
    {
        // Syntax error in the regular expression
    }

    // Return result
    return match;
}

General: Kudos (quantumred, 14-Oct-06 4:37)
This is tight and clever. Thanks for sharing it.
General: Re: Kudos (milkplus, 24-Feb-10 11:19)
General: wildcmp("*<*>", "<field1><field2>") not working [modified] (Daniel B., 6-Sep-06 13:14)
Hi,
 
wildcmp("*<*>", "<field1><field2>") returns 1, while I think it should return 0 (I may be wrong, so please tell me).

If someone knows how to fix it, I would appreciate it.
 
Regards
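
One way to reason about a case like this is to compare against a direct recursive definition of glob matching, where * may match nothing at all or any run of characters, including '>' and '<'. The sketch below is such a reference matcher; glob_ref is a name chosen for this illustration, and it is far slower than the article's function on hard inputs. Under that definition the result 1 looks correct: the first * matches the empty string, '<' matches the first character, and the second * matches everything up to the final '>'.

```c
/* Reference matcher: a direct recursive definition of '*' and '?'. */
int glob_ref(const char *wild, const char *s) {
    if (*wild == '\0') {
        return *s == '\0';  /* an empty pattern matches only empty text */
    }
    if (*wild == '*') {
        /* '*' matches nothing, or one character followed by '*' again. */
        return glob_ref(wild + 1, s) || (*s != '\0' && glob_ref(wild, s + 1));
    }
    if (*s != '\0' && (*wild == '?' || *wild == *s)) {
        return glob_ref(wild + 1, s + 1);
    }
    return 0;
}
```

If the intent was to reject text containing more than one bracketed field, that would need a rule that * cannot express, such as a check that no '>' appears before the last character.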

General: Re: wildcmp("*<*>", "<field1><field2>") not working (radboudp, 16-Feb-07 0:35)
General: return value type (wdx04, 8-Jan-06 15:49)
I think it would be better for the function to return a bool value. After all, many string comparison functions return 0 when the strings are equal.
General: *? case match (talimu, 3-Nov-05 23:42)
If wild = "*?.abc" and str = "abc.abc",
wildcmp(wild, str) does not work,

but if wild = "?*.abc" and str = "abc.abc",
wildcmp(wild, str) does work.

Does anyone have any idea about this case?
General: Re: *? case match (kuhnm, 15-Sep-06 2:18)
General: Re: *? case match (kuhnm, 18-Sep-06 4:48)
General: Gets my 5 (Franc Morales, 18-Oct-05 17:05)
Simple, fast, useful, AND fun to figure out.
 
Well done.
General: mp and cp (twopieman, 15-Mar-05 11:59)
I got the overall flow of the program, but I didn't completely get the logic of the second loop. I understand that the second loop checks whether there is nothing after the *, in which case it is a match; if there is something, it stores the positions in the two pointers and then goes on.
Also, the final else goes like this:
else
{
wild = mp;
string = cp++;
}
I'm sorry, but I'm not getting the logic totally.
Can someone please explain?
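
One reading of that else branch, based on tracing the code rather than on the author's own explanation: mp remembers the pattern position just after the most recent *, and cp remembers where in the text the next attempt should start. When a character fails to match, the pattern goes back to mp, the text goes back to cp, and cp moves forward by one so that the * ends up absorbing one more character on each failed attempt. The sketch below is the article's function with a retry counter added to make this visible; wildcmp_counted and the retries parameter are additions for illustration.

```c
#include <stddef.h>

/* Same control flow as the article's wildcmp, plus a retry counter. */
int wildcmp_counted(const char *wild, const char *string, int *retries) {
    const char *cp = NULL, *mp = NULL;
    *retries = 0;

    while (*string && *wild != '*') {
        if (*wild != *string && *wild != '?') return 0;
        wild++;
        string++;
    }

    while (*string) {
        if (*wild == '*') {
            if (!*++wild) return 1;
            mp = wild;        /* pattern position right after the '*' */
            cp = string + 1;  /* text position for the next attempt   */
        } else if (*wild == *string || *wild == '?') {
            wild++;
            string++;
        } else {
            /* Failed attempt: the '*' must swallow one more character. */
            (*retries)++;
            wild = mp;        /* restart the pattern after the '*'     */
            string = cp++;    /* restart the text one character later  */
        }
    }

    while (*wild == '*') wild++;
    return !*wild;
}
```

For "*ab" against "aab" this reports a match after one retry: the first attempt lines "ab" up with the start of the text and fails on the second character, and the second attempt starts one character later and succeeds.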
General: Re: mp and cp (radboudp, 16-Feb-07 1:14)
General: OK, but ... (Sam Levy, 16-Feb-05 4:48)
What was changed?
Question: Why make 3 loops? (DarkYoda Mickael, 2-Feb-05 22:22)
Hello,

I think this post is very interesting because it is very simple and does very cool work!

BUT!

I don't understand why you use 3 loops to do it.

I think I am not seeing all the cases, because to me the second loop alone seems to do all the work.

I'm trying to understand the whole process so that I can add an optional character with a ^ escape sequence, for example: ^-* matches -12 or 12 ;)

Thanks
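
The author's reply is not included here, so the following is only one reading. The first loop handles the part of the pattern before the first *, where a mismatch is final and mp and cp have not been set yet; the third loop skips trailing *s once the text is used up. The first loop can be folded into the second if the retry branch checks that a * has actually been seen. A sketch of that variant, with wildcmp_two_loops as an invented name:

```c
#include <stddef.h>

/* Variant with the first loop merged into the second. */
int wildcmp_two_loops(const char *wild, const char *string) {
    const char *cp = NULL, *mp = NULL;

    while (*string) {
        if (*wild == '*') {
            if (!*++wild) return 1;
            mp = wild;
            cp = string + 1;
        } else if (*wild == *string || *wild == '?') {
            wild++;
            string++;
        } else if (mp) {
            wild = mp;      /* a '*' was seen earlier: retry from it */
            string = cp++;
        } else {
            return 0;       /* mismatch before any '*': nothing to retry */
        }
    }

    /* Still needed: only trailing '*'s may remain in the pattern. */
    while (*wild == '*') wild++;
    return !*wild;
}
```

The separate first loop in the article avoids testing mp on every mismatch, which may be why it was written that way, but that is a guess.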
Answer: Re: Why make 3 loops? (Jack Handy, 13-Feb-05 10:02)
General: C# version (Sancy, 26-Oct-04 6:23)
Hi, I have a stupid question: could someone give me the C# version? :)
Thanks in advance.
General: Re: C# version (Psyk6621-Dec-04 3:39)
General: Re: C# version (Ionut FIlip, 22-Feb-05 6:15)
General: Re: C# version (robagar, 3-Apr-06 16:58)
General: Re: C# version (Sancy, 5-Jun-06 16:01)
General: Convert to java base on C# version [modified, better look :~ ] (quangtin321-Mar-08 21:13)
General: Re: C# version - an error! (Mark T., 4-Jul-08 14:37)
General: Re: C# version (williamhix, 17-Oct-08 22:28)
General: Many thanks, with 1 small gripe .. (David Patrick, 29-Sep-04 8:41)

Most C compare functions return zero when the values are equal, but this function returns non-zero.

Personally, I find the non-zero to be more intuitive .. but after years of forcing myself to check for zero I find it a bit counter-intuitive.

I think I'll just rename the function when I add it to my library :)

But that certainly won't stop me from using this wonderful routine.
 
Many sincere thanks ...
 

General: Re: Many thanks, with 1 small gripe .. (Jack Handy, 6-Oct-04 8:13)
General: Re: Many thanks, with 1 small gripe .. (Vic Mackey, 16-Oct-04 19:33)
General: Re: Many thanks, with 1 small gripe .. (Voja Intermajstor, 24-Nov-04 23:26)
General: Nice code... (voja2125-Aug-04 2:30)

This is really nice and useful code. I once wrote something similar, but your example is simpler and shorter.
Because it lacks comments, I spent some time understanding it (before I saw the comment from Targys :| a real tutorial ;) ), and it is clear now. Thanks to both of you!

To the 'wise' guys, flamers, and other people who have nothing to do but argue:
- If the code has a bug, report it, but don't pretend you are a genius or a guru. If you can do it better, submit an article.
If you don't like the code, don't use it!

And about NULL pointers:
Idiot-proofing should be implemented at the level where data (function arguments) is acquired and prepared, not in such a low-level function.
Besides that, I tested several functions from string.h with NULL parameters and every single one threw an exception. No further comments...
 
Regards, Voja
General: Slight efficiency improvement (Bill Buklis, 9-Jul-04 6:53)

Great piece of code, but I have one minor improvement. It appears to me that the variable "cp" doesn't do anything and serves no purpose.

If I'm correct, then you can safely remove the line:
cp = string+1;

and also remove:
string = cp;

and replace:
cp = string++;

with:
++string;


I believe the results would be identical.
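
A hand trace suggests that cp does matter. Without it, a failed attempt resumes from wherever the mismatch happened instead of one character after the previous starting point, so some possible starting positions are skipped. The sketch below applies the proposed change, interpreting it against the article's actual line string = cp++; the name wildcmp_no_cp is made up for this illustration.

```c
#include <stddef.h>

/* The article's wildcmp with cp removed, as proposed in this post. */
int wildcmp_no_cp(const char *wild, const char *string) {
    const char *mp = NULL;

    while (*string && *wild != '*') {
        if (*wild != *string && *wild != '?') return 0;
        wild++;
        string++;
    }

    while (*string) {
        if (*wild == '*') {
            if (!*++wild) return 1;
            mp = wild;
        } else if (*wild == *string || *wild == '?') {
            wild++;
            string++;
        } else {
            wild = mp;
            ++string;  /* advances past the mismatch instead of rewinding */
        }
    }

    while (*wild == '*') wild++;
    return !*wild;
}
```

With "*ab" against "aab", this variant matches the first 'a', fails on the second character, then moves on to 'b' and never tries starting at the second 'a', so it returns 0 where the article's function returns 1. Patterns that never need a retry, such as "bl?h.*" against "blah.jpg", behave the same in both.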

General: Re: Slight efficiency improvement (Bill Buklis, 9-Jul-04 7:19)
Question: PathMatchSpec (shlwapi.h)? (peterchen, 28-Jun-04 6:56)
Just a thought:
The PathMatchSpec function from the Shell Lightweight Utility API (shlwapi) could provide similar functionality. I guess it does have some differences (e.g. allowing multiple specs to be given, separated by semicolons), but it might be a simple alternative for many similar tasks.
 

 

we are here to help each other get through this thing, whatever it is Vonnegut jr.

sighist || Agile Programming | doxygen

Last Updated 15 Feb 2005
Article Copyright 2001 by Jack Handy
Everything else Copyright © CodeProject, 1999-2013