Wildcard string compare (globbing)

15 Feb 2005
Matches a string against a wildcard pattern such as "*.*" or "bl?h.*". This is useful for file globbing or for matching hostmasks.

Usage:

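This is a fast, lightweight, and simple pattern matching function. In the pattern, * matches any run of characters (including none) and ? matches exactly one character; everything else must match literally.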

if (wildcmp("bl?h.*", "blah.jpg")) {
  //we have a match!
} else {
  //no match =(
}
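
A few more calls may help pin down what the two wildcards do. The sketch below repeats the function from this article so that it compiles on its own; check_wildcmp_examples is a hypothetical helper added only for illustration, and the hostmask string is a made-up example.

```c
#include <assert.h>
#include <stddef.h>

/* Copy of the article's wildcmp, repeated so this sketch compiles on its own. */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Hypothetical helper (not part of the article): asserts a few expected results. */
void check_wildcmp_examples(void) {
  assert(wildcmp("bl?h.*", "blah.jpg"));   /* '?' matches exactly one character */
  assert(!wildcmp("bl?h", "blh"));         /* '?' cannot match zero characters */
  assert(wildcmp("*.*", "a.b"));
  assert(!wildcmp("*.*", "noextension"));  /* "*.*" still needs a literal '.' */
  assert(wildcmp("*", ""));                /* '*' may match nothing at all */
  assert(!wildcmp("BLAH.*", "blah.jpg"));  /* the comparison is case-sensitive */
  assert(wildcmp("*!*@*.example.com", "nick!user@host.example.com"));
}
```

Note that, unlike some shells, "*.*" here does not match a name without a dot, and there is no escape character for matching a literal * or ?.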

Function:

int wildcmp(const char *wild, const char *string) {
  // Written by Jack Handy - jakkhandy@hotmail.com
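  // mp: position in wild just after the most recent '*'
  // cp: position in string where the next retry starts
  const char *cp = NULL, *mp = NULL;

  // Match everything in front of the first '*' character by character.
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      // A trailing '*' matches whatever is left of string.
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      // Mismatch: go back to just after the last '*' and let it
      // swallow one more character of string.
      wild = mp;
      string = cp++;
    }
  }

  // string is used up; only '*' may remain in the pattern.
  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}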

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


Comments and Discussions

Pseudo Code...
blahblah, 18-Apr-02 10:45

Re: Pseudo Code...
Targys, 8-Jan-03 2:43

Ok, I'll try, even though I think it would be a good idea for you to learn C :)

The first loop basically goes through both strings step by step until there is a * in the wild string (or the string ends).
Whenever the characters of the two strings don't match and the character in the wild string is not a ?, the function returns 0 (FALSE) = no match.
(I'm not a hundred percent sure, 'cause I don't have time to test it, but I guess this loop is mainly for speed. It also guarantees that the second loop always sees a * before it can reach the last else, so mp and cp are set before they are used.)

The second loop does the hard thing:
if (*wild == '*') {
    if (!*++wild) {
        return 1;
    }
    mp = wild;
    cp = string+1;

When *wild is a star, this if stores two positions: mp remembers the position in wild just after the star, and cp remembers the next position in string to retry from.
(*wild is the character that the pointer wild currently points at; a simplified explanation, not 100% precise)
If this * is the last character in the wild string, it returns 1 (TRUE) = match.

} else {
    wild = mp;
    string = cp++;

This part of the ifs basically does two things in one.
Firstly, it is responsible for increasing the position of the string pointer.
Secondly, it resets the two pointers after a failed matching attempt.

} else if ((*wild == *string) || (*wild == '?')) {
    wild++;
    string++;

This part does the same as the first loop, just for the part of the wild string after the first *.

while (*wild == '*') {
    wild++;
}

Well, this loop just ignores any remaining * at the end of the wild string.

return !*wild;

And now, that's a nice one :)
I like it.
After going through all the * in the last loop, the wild string can now contain either
- nothing anymore, which means *wild is '\0' (the string terminator), or
- anything but nothing.
If *wild is '\0', all the comparisons were successful and the function can return 1.
Or more simply: it returns !*wild = !'\0' = 1 ;-)
If it is not '\0' but some other character, !*wild will be 0.

So this
return !*wild;

basically replaces
if (*wild == '\0') {
    return 1;
} else {
    return 0;
}

or something like this.
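
The equivalence can be checked with a small sketch. The helper names at_end_short, at_end_long and check_at_end are hypothetical and exist only for this illustration; they are not part of the original function.

```c
#include <assert.h>

/* Hypothetical helper: the short form used at the end of wildcmp. */
int at_end_short(const char *p) {
  return !*p;  /* '\0' is zero, so !'\0' is 1; any other character gives 0 */
}

/* Hypothetical helper: the same test written out as an if/else. */
int at_end_long(const char *p) {
  if (*p == '\0') {
    return 1;
  } else {
    return 0;
  }
}

/* Both forms agree for an empty and a non-empty remainder of the pattern. */
void check_at_end(void) {
  assert(at_end_short("") == 1 && at_end_long("") == 1);
  assert(at_end_short("g") == 0 && at_end_long("g") == 0);
}
```

The short form works because the ! operator in C always yields an int that is exactly 0 or 1.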

An example to explain how it really works:
wild is 'bl?h.*g'
string is 'blah.jpgeg'

After the first loop where 'b' is 'b' and 'l' is 'l' and '?' is 'a' and 'h' is 'h' and '.' is '.'
the position of the two pointers *wild and *string look like this:

*wild         |
        'bl?h.*g'
        'blah.jpgeg'
*string       |


Now the second loop starts:
The * is skipped, and the pointer string is then increased until it points to a character that is the same as the character after the * in wild.
That means it looks for a 'g' in string, beginning from the current position.
This increment is done by the last else, as explained above as the first of its two jobs.
So it will look like this:

*wild          |
        'bl?h.*g'
        'blah.jpgeg'
*string         |


Now the else if part increases both pointers, because 'g' == 'g'.

*wild is now '\0' because the g was the last character in the wild string.
*string is 'e'.

Because '\0' != 'e', the last else resets the pointers; this is the second of its two jobs explained above.
wild goes back to mp, the 'g' right after the *.
string goes to cp, which has been incremented on every earlier mismatch, so string does not return to the 'j' or the 'p'; it resumes at the 'e':

*wild          |
        'bl?h.*g'
        'blah.jpgeg'
*string           |


This is done by the cp++ of
} else {
    wild = mp;
    string = cp++;

where string takes the current value of cp and cp is then incremented for the next retry.

'g' != 'e', so the same else runs once more and it looks like this:

*wild          |
        'bl?h.*g'
        'blah.jpgeg'
*string            |


Now 'g' == 'g' again and both pointers are increased.
This ends the loop, as *string is now '\0'.
*wild is '\0' as well, so the function returns 1 = match.
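
A walkthrough like this is easy to check by instrumenting the function. The sketch below assumes a hypothetical variant, wildcmp_traced, that keeps the original logic but prints the offsets of both pointers each time the last else runs and counts those backtracks; the extra parameter and the helper check_trace_example are additions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical instrumented variant of wildcmp: same matching logic, plus a
   backtrack counter and a printout of both pointer offsets on each backtrack. */
int wildcmp_traced(const char *wild, const char *string, int *backtracks) {
  const char *w0 = wild, *s0 = string;  /* start addresses, only for offsets */
  const char *cp = NULL, *mp = NULL;

  *backtracks = 0;
  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;        /* pattern position just after the '*' */
      cp = string + 1;  /* where the next retry starts in string */
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
      (*backtracks)++;
      printf("backtrack %d: wild offset %d, string offset %d\n",
             *backtracks, (int)(wild - w0), (int)(string - s0));
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Runs the example from the walkthrough: a match after four backtracks. */
void check_trace_example(void) {
  int n = 0;
  assert(wildcmp_traced("bl?h.*g", "blah.jpgeg", &n) == 1);
  assert(n == 4);
}
```

For 'bl?h.*g' against 'blah.jpgeg' this reports four backtracks, with the string offset going 6, 7, 8, 9 while the wild offset stays at 6 (the 'g' after the *).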

Well, it's not an easy explanation, as the problem of wildcard search is not really as easy as C++ Guru makes it out to be.
(Maybe for a guru, it's easy :) )

Don't hesitate to ask if the answer is not understandable.
And of course please correct me if something is wrong! :)

Targys