
Wildcard string compare (globbing)

By Jack Handy, 15 Feb 2005
 

Usage:

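This is a fast, lightweight, and simple pattern matching function. In the pattern, '*' matches any run of characters (including none) and '?' matches exactly one character. The function returns 1 on a match and 0 otherwise.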

if (wildcmp("bl?h.*", "blah.jpg")) {
  //we have a match!
} else {
  //no match =(
}

Function:

#include <stddef.h>  /* for NULL */

int wildcmp(const char *wild, const char *string) {
  // Written by Jack Handy - jakkhandy@hotmail.com
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string+1;
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
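
A quick way to check the behaviour described above is a small table-driven self-test. The sketch below repeats the function unchanged so that the block compiles on its own; the struct wild_case and the function wildcmp_selftest are hypothetical names introduced only for this illustration, and several of the cases are borrowed from the Java unit test posted in the discussion further down.

```c
#include <stddef.h>

/* The article's function, repeated so this block is self-contained. */
int wildcmp(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if ((*wild != *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;            /* trailing '*' matches the rest */
      }
      mp = wild;             /* pattern position just after the '*' */
      cp = string + 1;       /* where to resume if this attempt fails */
    } else if ((*wild == *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;             /* backtrack: let the '*' absorb one more char */
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}

/* Hypothetical test record: pattern, input, expected result (1 or 0). */
struct wild_case {
  const char *wild;
  const char *string;
  int expected;
};

/* Returns the number of failing cases (0 means everything passed). */
int wildcmp_selftest(void) {
  static const struct wild_case cases[] = {
    {"bl?h.*", "blah.jpg",   1},
    {"*",      "",           1},
    {"?",      "",           0},
    {"?",      "ab",         0},
    {"*zzy",   "absdf zzzy", 1},
    {"a*",     "bsadf",      0},
    {"*a",     "a fwer234",  0},
    {"",       "",           1},
  };
  int failures = 0;
  for (size_t k = 0; k < sizeof cases / sizeof cases[0]; k++) {
    if (wildcmp(cases[k].wild, cases[k].string) != cases[k].expected) {
      failures++;
    }
  }
  return failures;
}
```

Calling wildcmp_selftest() from a main function and checking for a zero result is enough to guard against regressions when porting the function.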

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Jack Handy
Web Developer
United States

Comments and Discussions

 
C# version (Sancy, 26 Oct '04 - 6:23)
Hi, I have a stupid question: could someone give me the C# version? :)
Thanks in advance.
Re: C# version (Psyk66, 21 Dec '04 - 3:39)
	private bool wildcmp(string wild, string str) 
	{
		int cp=0, mp=0;
	
		int i=0;
		int j=0;
		while ((i<str.Length) && (wild[j] != '*')) 
		{
			if ((wild[j] != str[i]) && (wild[j] != '?')) 
			{
				return false;
			}
			i++;
			j++;
		}
		
		while (i<str.Length) 
		{
			if (j<wild.Length && wild[j] == '*') 
			{
				if ((j++)>=wild.Length) 
				{
					return true;
				}
				mp = j;
				cp = i+1;
			} 
			else if (j<wild.Length && (wild[j] == str[i] || wild[j] == '?')) 
			{
				j++;
				i++;
			} 
			else 
			{
				j = mp;
				i = cp++;
			}
		}
		
		while (j<wild.Length && wild[j] == '*') 
		{
			j++;
		}
		return j>=wild.Length;
	}
 
This C# version works. I'm sure there are loads of improvements to be made though. Don't flame me for such bad code, I only started C# yesterday;)
Re: C# version (Ionut FIlip, 22 Feb '05 - 6:15)
A small fix:
   while ((i<str.Length) && (wild[j] != '*'))
should be
   while (i < str.Length && j < wild.Length && wild[j] != '*')
 
And a small improvement for case sensitivity:
private bool wildcmp(string wild, string str, bool case_sensitive)
{
   if (! case_sensitive)
   {
      wild = wild.ToLower();
      str = str.ToLower();
   }
 
   // rest of the code is the same
}

 
Ionut Filip
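
The same case-insensitivity idea carries over to the original C function. The following is a minimal sketch, assuming single-byte characters; instead of lower-casing copies of both strings it folds case one character at a time with tolower. The names wildcmp_nocase and same_char_nocase are hypothetical and do not appear in the article.

```c
#include <ctype.h>
#include <stddef.h>

/* Compares two characters without regard to case.
   The cast to unsigned char avoids undefined behaviour for negative chars. */
static int same_char_nocase(char a, char b) {
  return tolower((unsigned char)a) == tolower((unsigned char)b);
}

/* Same algorithm as wildcmp, with the literal comparison made case-blind. */
int wildcmp_nocase(const char *wild, const char *string) {
  const char *cp = NULL, *mp = NULL;

  while ((*string) && (*wild != '*')) {
    if (!same_char_nocase(*wild, *string) && (*wild != '?')) {
      return 0;
    }
    wild++;
    string++;
  }

  while (*string) {
    if (*wild == '*') {
      if (!*++wild) {
        return 1;
      }
      mp = wild;
      cp = string + 1;
    } else if (same_char_nocase(*wild, *string) || (*wild == '?')) {
      wild++;
      string++;
    } else {
      wild = mp;
      string = cp++;
    }
  }

  while (*wild == '*') {
    wild++;
  }
  return !*wild;
}
```

With this variant, wildcmp_nocase("BL?H.*", "blah.JPG") reports a match, while the case-sensitive wildcmp would not.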
Re: C# version (robagar, 3 Apr '06 - 16:58)
hiya
 
Just thought I'd share my version of this code
 
- put the whole shebang into a class with public static methods
- fixed a bug where the pattern '?' matches all strings
- added an early-exit test for patterns that don't actually contain wildcards so it just defaults to normal string comparison
 
cheers
Rob
 

 

     /// <summary>
     /// Class providing wildcard string matching.
     /// </summary>
     public class Wildcard
     {
          private Wildcard()
          {
          }
 
          /// <summary>
          /// Array of valid wildcards
          /// </summary>
          private static char[] Wildcards = new char[]{'*', '?'};
 
          /// <summary>
          /// Returns true if the string matches the pattern which may contain * and ? wildcards.
          /// Matching is done without regard to case.
          /// </summary>
          /// <param name="pattern"></param>
          /// <param name="s"></param>
          /// <returns></returns>
          public static bool Match(string pattern, string s)
          {
               return Match(pattern, s, false);
          }
 
          /// <summary>
          /// Returns true if the string matches the pattern which may contain * and ? wildcards.
          /// </summary>
          /// <param name="pattern"></param>
          /// <param name="s"></param>
          /// <param name="caseSensitive"></param>
          /// <returns></returns>
          public static bool Match(string pattern, string s, bool caseSensitive)
          {
               // if not concerned about case, convert both string and pattern
               // to lower case for comparison
               if (!caseSensitive)
               {
                    pattern = pattern.ToLower();
                    s = s.ToLower();
               }
 
               // if pattern doesn't actually contain any wildcards, use simple equality
               if (pattern.IndexOfAny(Wildcards) == -1)
                    return (s == pattern);
 
               // otherwise do pattern matching
               int i=0;
               int j=0;
               while (i < s.Length && j < pattern.Length && pattern[j] != '*')
               {
                    if ((pattern[j] != s[i]) && (pattern[j] != '?'))
                    {
                         return false;
                    }
                    i++;
                    j++;
               }
 
               // if we have reached the end of the pattern without finding a * wildcard,
               // the match must fail if the string is longer or shorter than the pattern
               if (j == pattern.Length)
                    return s.Length == pattern.Length;
         
               int cp=0;
               int mp=0;
               while (i < s.Length)
               {
                    if (j < pattern.Length && pattern[j] == '*')
                    {
                         if ((j++)>=pattern.Length)
                         {
                              return true;
                         }
                         mp = j;
                         cp = i+1;
                    }
                    else if (j < pattern.Length && (pattern[j] == s[i] || pattern[j] == '?'))
                    {
                         j++;
                         i++;
                    }
                    else
                    {
                         j = mp;
                         i = cp++;
                    }
               }
         
               while (j < pattern.Length && pattern[j] == '*')
               {
                    j++;
               }
 
               return j >= pattern.Length;
          }
     }
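
The early-exit test for patterns without wildcards has a direct C counterpart. A minimal sketch, assuming the standard strpbrk and strcmp functions; the helper names has_wildcards and try_literal_match are hypothetical, and the caller is expected to fall back to the full matcher when try_literal_match returns 0.

```c
#include <string.h>

/* Returns 1 if the pattern contains '*' or '?', 0 otherwise. */
int has_wildcards(const char *pattern) {
  return strpbrk(pattern, "*?") != NULL;
}

/* If the pattern is a plain literal, stores the result of an exact
   comparison in *matched and returns 1. If the pattern contains
   wildcards, leaves *matched untouched and returns 0. */
int try_literal_match(const char *pattern, const char *s, int *matched) {
  if (has_wildcards(pattern)) {
    return 0;
  }
  *matched = (strcmp(pattern, s) == 0);
  return 1;
}
```

Whether the shortcut pays off depends on the workload: it costs one extra scan of the pattern, so it mainly helps when many patterns are plain literals.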

Re: C# version (Sancy, 5 Jun '06 - 16:01)
Thanks a lot. This is just what I've been looking for. :)
 
And it fades like the shadow in the night.
 
PhoeniX
Convert to Java based on C# version [modified, better look :~ ] (quangtin3, 21 Mar '08 - 21:13)
Java version
We (Qn & Qg) just used search and replace to produce this Java version:
    public static boolean matcher(String value, String pattern) {
        if (pattern == null || value == null) {
            return false;
        }
 
        char[] Wildcards = new char[]{'*', '?'};
 
        pattern = pattern.toLowerCase();
        value = value.toLowerCase();
 
        // if pattern doesn't actually contain any wildcards, use simple equality
        if (pattern.indexOf(Wildcards[0]) == -1 && pattern.indexOf(Wildcards[1]) == -1) {
            return value.equals(pattern);
        }
 
        // otherwise do pattern matching
        int i = 0;
        int j = 0;
        while (i < value.length() && j < pattern.length() && pattern.charAt(j) != '*') {
            if (pattern.charAt(j) != value.charAt(i) && pattern.charAt(j) != '?') {
                return false;
            }
            i++;
            j++;
        }
 
        // if we have reached the end of the pattern without finding a * wildcard,
        // the match must fail if the String is longer or shorter than the pattern
        if (j == pattern.length()) {
            return value.length() == pattern.length();
        }
 
        int cp = 0;
        int mp = 0;
        while (i < value.length()) {
            if (j < pattern.length() && pattern.charAt(j) == '*') {
                if (++j >= pattern.length()) { // pre-increment, so a trailing '*' returns at once
                    return true;
                }
                mp = j;
                cp = i + 1;
            }
            else if (j < pattern.length() && (pattern.charAt(j) == value.charAt(i) || pattern.charAt(j) == '?')) {
                j++;
                i++;
            }
            else {
                j = mp;
                i = cp++;
            }
        }
 
        while (j < pattern.length() && pattern.charAt(j) == '*') {
            j++;
        }
 
        return j >= pattern.length();
    }
 
Unit test
  public void testmatcher() {
        System.out.println("testmatcher");
 
        String[][] matchPaire = {
            {"", ""},
            {"aa", "aa"},
            {"aa", "*"}, //value,pattern
            {"a", "?"},
            {"sdwerporasl;df", "*"},
            {"absdf zzzy", "*zzy"},
            {"abc", "*?"}};
 
        String[][] notMatchPaire = {
            {"", "?"},
            {"ab", "?"},
            {null, null},
            {"", "*a"},
            {"bsadfasdfwer234", "a*"},
            {"a fwer234", "*a"},
            };
 
        for (int i = 0; i < matchPaire.length; i++) {
            System.out.print("paire " + matchPaire[i][0] + " " + matchPaire[i][1]);
            assertTrue(ExchUtils.matcher(matchPaire[i][0], matchPaire[i][1]));
            System.out.println(" ok");
        }
 
        for (int i = 0; i < notMatchPaire.length; i++) {
            System.out.print("paire " + notMatchPaire[i][0] + " " + notMatchPaire[i][1]);
            assertFalse(ExchUtils.matcher(notMatchPaire[i][0], notMatchPaire[i][1]));
            System.out.println(" ok");
        }
    }
 
thank you all.
 
ktmt's member.
modified on Sunday, March 30, 2008 12:42 PM

Re: C# version - an error! (Mark T., 4 Jul '08 - 14:37)
Be aware that there is a bug in this C# version.
I am still working on figuring it out fully, but:
 
in this code segment
int cp=0;
int mp=0;
while (i < s.Length)
{
  if (j < pattern.Length && pattern[j] == '*')
  {
    if ((j++)>=pattern.Length)
      return true;
 
Going into the final "if" line shown here, the maximum value that j may have is (pattern.length-1), due to the first "if" test. Then we see (j++) compared. But, the value of (j++) is the value of "j" BEFORE being incremented and thus is a maximum of (pattern.length-1) and is therefore NEVER >= pattern.length. Only after the if test is completed is j actually incremented.
So the following return is never taken.
 
Perhaps it can be fixed by changing j++ to ++j... but I can't tell that until I complete the analysis.
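
The difference between the two forms can be seen in isolation. The sketch below is in C, where the increment operators behave the same way as in C# for this purpose; the two function names are hypothetical and exist only to mirror the expressions under discussion.

```c
#include <stddef.h>

/* Mirrors `(j++) >= len`: the comparison sees the OLD value of *j,
   and *j is incremented afterwards. */
int post_increment_hits_end(size_t *j, size_t len) {
  return ((*j)++) >= len;
}

/* Mirrors `++j >= len`: *j is incremented first,
   and the comparison sees the NEW value. */
int pre_increment_hits_end(size_t *j, size_t len) {
  return ++(*j) >= len;
}
```

Starting from j equal to len - 1, the post-increment form reports "not at end" while the pre-increment form reports "at end"; in both cases j holds len afterwards.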
 
On a slightly different topic, I will state my opinion as a professional programmer. This demonstrates the extreme importance of EXTENSIVE COMMENTS in code explaining NOT what the code does, but "what the code is supposed to do" in each section. If such comments were in place, this would be an easy maintenance fix. Without them, I have to analyze what the code DOES and, from that, try to discern what the programmer INTENDED the code to do. And I have to consider all the possible wildcard permutations just like the original programmer did. I essentially have to reinvent the wheel... because the user manual is missing.
 
Everyone, especially Gurus, should put extensive comments in their code on "what it is intended to do". The only downside is lack of job security, because now someone other than you can fix the code. If you have that low an opinion of your worth to your employer, and also lack all compassion for others, then don't comment your code.
Re: C# version (williamhix, 17 Oct '08 - 22:28)
I think this:
 
if ((j++) >= pattern.Length) 
{
  return true;
}
 
Needs to change to this:
 
if (++j >= pattern.Length)
{
 return true;
}
 
Otherwise the early break does not happen and the whole string is searched.
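
The cost of the missing early break can be made visible by counting iterations of the main loop. The following is an illustrative C port of the index-based C# loop, not code from the thread; the function match_counting, its early_exit flag, and the steps counter are all hypothetical additions for this demonstration.

```c
#include <string.h>

/* Index-based matcher that counts main-loop iterations in *steps.
   early_exit = 1 uses the `++j` test, early_exit = 0 uses `j++`. */
int match_counting(const char *pattern, const char *s,
                   int early_exit, size_t *steps) {
  size_t plen = strlen(pattern), slen = strlen(s);
  size_t i = 0, j = 0, cp = 0, mp = 0;
  *steps = 0;

  /* Match the literal prefix up to the first '*'. */
  while (i < slen && j < plen && pattern[j] != '*') {
    if (pattern[j] != s[i] && pattern[j] != '?') {
      return 0;
    }
    i++;
    j++;
  }
  /* Pattern consumed without meeting a '*': lengths must agree. */
  if (j == plen) {
    return slen == plen;
  }

  while (i < slen) {
    (*steps)++;
    if (j < plen && pattern[j] == '*') {
      /* The value compared is the new j (++j) or the old j (j++). */
      size_t seen = early_exit ? ++j : j++;
      if (seen >= plen) {
        return 1;            /* trailing '*': nothing left to check */
      }
      mp = j;
      cp = i + 1;
    } else if (j < plen && (pattern[j] == s[i] || pattern[j] == '?')) {
      j++;
      i++;
    } else {
      j = mp;                /* backtrack to just after the last '*' */
      i = cp++;
    }
  }

  while (j < plen && pattern[j] == '*') {
    j++;
  }
  return j >= plen;
}
```

For the pattern "a*" against "abcdef", both settings report a match, but the `++j` form returns on the first loop iteration while the `j++` form keeps stepping until the end of the string.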

Last Updated 15 Feb 2005
Article Copyright 2001 by Jack Handy
Everything else Copyright © CodeProject, 1999-2013