Great code, but when trying this I realized that the following pattern is a match:
Search: ???????? Text to search: ABC
The problem is that the pattern can be LONGER than the text searched, in which case it should return not found, but instead it returns found.
Also, this example succeeds:
Search: y*n Text to search: yessir
But of course it should fail, since I'm looking for text that ends with n.
So I rewrote your function as follows to handle these situations correctly.
bool StrWildCmp(char *wildstring, char *matchstring)
{
    char stopstring[1];   // character that must follow the current '*', or 0
    *stopstring = 0;

    while (*matchstring) {
        if (*wildstring == '*') {
            if (!*++wildstring) {
                return true;   // trailing '*' matches the rest of the text
            } else {
                *stopstring = *wildstring;
            }
        }

        if (*stopstring) {
            if (*stopstring == *matchstring) {
                wildstring++;
                matchstring++;
                *stopstring = 0;
            } else {
                matchstring++;
            }
        } else if ((*wildstring == *matchstring) || (*wildstring == '?')) {
            wildstring++;
            matchstring++;
        } else {
            return false;
        }

        if (!*matchstring && *wildstring && *wildstring != '*') {
            // matchstring too short
            return false;
        }
    }

    // Text exhausted: any pattern left over must consist only of '*'
    while (*wildstring == '*') {
        wildstring++;
    }
    return !*wildstring;
}
Thanks again for the inspiration. 
|
This is a version of the wildcmp function in the XBLite programming language:
FUNCTION SBYTE wildcmp( wildcard$, search$)
    ' wildcmp(const char *wild, const char *string)
    ' Written by Jack Handy - jakkhandy@hotmail.com

    ULONG cp
    ULONG mp
    STRING s_txt$
    ULONG sp
    STRING w_txt$
    ULONG wp

    IFZ search$ THEN RETURN $$FALSE
    IFZ wildcard$ THEN RETURN $$FALSE

    w_txt$ = wildcard$ + "\0\0"   ' Just to be sure
    s_txt$ = search$ + "\0\0"

    DO WHILE (s_txt${sp}) && (w_txt${wp} != '*')
        IF (w_txt${wp} != s_txt${sp} ) && (w_txt${wp} != '?') THEN RETURN $$FALSE
        INC wp
        INC sp
    LOOP

    DO WHILE (s_txt${sp})
        IF ( w_txt${wp} == '*' ) THEN
            INC wp
            IF !(w_txt${wp}) THEN RETURN $$TRUE
            mp = wp
            cp = sp + 1
        ELSE
            IF (w_txt${wp} == s_txt${sp} ) || (w_txt${wp} == '?') THEN
                INC wp
                INC sp
            ELSE
                wp = mp
                sp = cp
                IF s_txt${sp} THEN INC cp
            ENDIF
        ENDIF
    LOOP

    DO WHILE (w_txt${wp} == '*' )
        INC wp
    LOOP

    RETURN !w_txt${wp}
END FUNCTION
|
I converted wildcmp to C#; it makes wildcard string matching very easy. Thanks so much.
bool WildCompare(string strWild, string strEmail)
{
    // ValueAt and ValueIsNullOrEmpty are helper methods that are not shown in this post.
    int cp = 0;
    int mp = 0;

    int wildIndex = 0;
    int emailIndex = 0;

    while ((!ValueIsNullOrEmpty(strEmail, emailIndex)) && (ValueAt(strWild, wildIndex) != '*'))
    {
        if ((ValueAt(strWild, wildIndex) != ValueAt(strEmail, emailIndex)) && (ValueAt(strWild, wildIndex) != '?'))
        {
            return false;
        }
        wildIndex++;
        emailIndex++;
    }

    while (!ValueIsNullOrEmpty(strEmail, emailIndex))
    {
        if (ValueAt(strWild, wildIndex) == '*')
        {
            wildIndex++;
            if (ValueIsNullOrEmpty(strWild, wildIndex))
            {
                return true;
            }
            mp = wildIndex;
            cp = emailIndex + 1;
        }
        else if ((ValueAt(strWild, wildIndex).Equals(ValueAt(strEmail, emailIndex)) || (ValueAt(strWild, wildIndex) == '?')))
        {
            wildIndex++;
            emailIndex++;
        }
        else
        {
            wildIndex = mp;
            emailIndex = cp++;
        }
    }

    while (ValueAt(strWild, wildIndex) == '*')
    {
        wildIndex++;
    }
    return ValueIsNullOrEmpty(strWild, wildIndex);
}
|
Well, as direct as I could come up with anyway. Makes use of unsafe to enable pointer arithmetic. Unfortunately, because fixed is required to prevent the GC from moving the pointers, I had to change it to use increment indexers instead of directly manipulating the pointers. Alternatively, you could use stackalloc to instantiate two native char[]'s and copy the values, but that seems contrary to this function's low-memory footprint, high performance goals.
It has been tested against every test case presented in the comments section, as well as some additional cases I threw in.
public unsafe static bool GlobCompare( string glob, string path )
{
    fixed ( char* pGlob = glob, pPath = path )
    {
        int pGlobInc = 0;
        int pPathInc = 0;

        int mp = 0;
        int cp = 0;

        while ( ( *( pPath + pPathInc ) != 0 ) && ( *( pGlob + pGlobInc ) != '*' ) )
        {
            if ( ( *( pGlob + pGlobInc ) != *( pPath + pPathInc ) ) && ( *( pGlob + pGlobInc ) != '?' ) )
            {
                return false;
            }
            pGlobInc++;
            pPathInc++;
        }

        while ( *( pPath + pPathInc ) != 0 )
        {
            if ( *( pGlob + pGlobInc ) == '*' )
            {
                if ( 0 == *( pGlob + ++pGlobInc ) )
                {
                    return true;
                }
                mp = pGlobInc;
                cp = pPathInc + 1;
            }
            else if ( ( *( pGlob + pGlobInc ) == *( pPath + pPathInc ) ) || ( *( pGlob + pGlobInc ) == '?' ) )
            {
                pGlobInc++;
                pPathInc++;
            }
            else
            {
                pGlobInc = mp;
                pPathInc = cp++;
            }
        }

        while ( *( pGlob + pGlobInc ) == '*' )
        {
            pGlobInc++;
        }
        return ( 0 == *( pGlob + pGlobInc ) );
    }
}
|
I am using this in Artistic Style, a popular multi-platform code formatter available at SourceForge.
http://astyle.sourceforge.net/
Release 1.22 added directory recursion to the project. Wildcard processing was made internal to the program. Linux has a glob function but Windows doesn't. I just used this for both of them. It let me process both platforms in a similar manner.
A minor change was made for Windows to make the comparison case insensitive. Linux was left case sensitive.
Thanks for making it available. Using this was a lot easier than writing my own. I doubt that mine would have been this sophisticated.
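The Windows change itself is not shown in this post. As a rough idea of what a case-insensitive variant could look like, here is a minimal sketch in C. It assumes the three-loop structure used by the ports earlier in this thread and folds case with the standard tolower; the names wildcmp_nocase and same_char_nocase are hypothetical, and this is not taken from the Artistic Style source.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: compare two characters, ignoring case via tolower. */
static int same_char_nocase(char a, char b)
{
    return tolower((unsigned char)a) == tolower((unsigned char)b);
}

/* Sketch of a case-insensitive wildcard match.
   '*' matches any run of characters, '?' matches exactly one.
   Returns 1 on match, 0 otherwise. */
int wildcmp_nocase(const char *wild, const char *string)
{
    const char *cp = NULL;  /* where to resume in string after a failed attempt */
    const char *mp = NULL;  /* pattern position just after the latest '*' */

    /* Match character by character up to the first '*'. */
    while (*string && *wild != '*') {
        if (!same_char_nocase(*wild, *string) && *wild != '?')
            return 0;
        wild++;
        string++;
    }

    while (*string) {
        if (*wild == '*') {
            if (!*++wild)
                return 1;        /* trailing '*' matches the rest */
            mp = wild;
            cp = string + 1;
        } else if (same_char_nocase(*wild, *string) || *wild == '?') {
            wild++;
            string++;
        } else {
            wild = mp;           /* let the '*' absorb one more character */
            string = cp++;
        }
    }

    /* Skip trailing '*' so that they match the empty remainder. */
    while (*wild == '*')
        wild++;
    return !*wild;
}
```

With this sketch, wildcmp_nocase("*.CPP", "main.cpp") returns 1, while a case-sensitive build would keep comparing the characters directly.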
|
Boy do I feel stupid. I worked on an algorithm like this for days, and never got it quite right. Then, I see the wonderful, and simplistic work of someone like this, and it reminds me that sometimes we all are guilty of 'over-engineering'...
Thanks Mr. Handy!
|
How can this code be converted to do a replace? I need to provide a find/replace dialog in an application and I don't want to jump through the hoops of the Boost library. Can anyone help?
Patrick
|
Here's a RegExp version (may be easily ported to C++).
Pros: more readable; relies on proven RegExp.
Cons: maybe slower; if the pattern contains RegExp metacharacters other than '.', the result may be unexpected.
// Requires: using System.Text.RegularExpressions;
public static bool Match(string eval, string pattern, bool caseSensitive)
{
    bool match = false;

    // Make input parameters lower-case if case is not an issue
    if (!caseSensitive)
    {
        eval = eval.ToLower();
        pattern = pattern.ToLower();
    }

    // Escape regexp special character in pattern
    pattern = pattern.Replace(".", @"\.");

    // Replace valid wildcards with regexp equivalents
    pattern = pattern.Replace('?', '.').Replace("*", ".*");

    // Add boundaries to pattern
    pattern = @"\A" + pattern + @"\z";

    // Search for a match
    try
    {
        match = Regex.IsMatch(eval, pattern);
    }
    catch /* (ArgumentException ex) */
    {
        // Syntax error in the regular expression
    }

    // Return result
    return match;
}
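The caveat about RegExp metacharacters can be reduced by escaping everything that is not a wildcard before translating. Below is a sketch in C of the translation step only (no regex engine is invoked). The function name wild_to_regex and the list of metacharacters are assumptions made for illustration; the \A and \z anchors simply mirror the code above.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: translate a wildcard pattern into a regex string.
   '*' becomes ".*", '?' becomes ".", and every other character listed in
   meta is prefixed with a backslash. The result is wrapped in \A ... \z.
   Returns 0 on success, -1 if the output buffer is too small. */
int wild_to_regex(const char *wild, char *out, size_t outsz)
{
    static const char meta[] = "\\.+()[]{}^$|";  /* assumed metacharacter set */
    size_t n = 0;

    /* Worst case: every input character becomes two, plus 4 anchor
       characters and the terminating NUL. */
    if (outsz < 2 * strlen(wild) + 5)
        return -1;

    out[n++] = '\\';
    out[n++] = 'A';
    for (; *wild; wild++) {
        if (*wild == '*') {
            out[n++] = '.';
            out[n++] = '*';
        } else if (*wild == '?') {
            out[n++] = '.';
        } else {
            if (strchr(meta, *wild))
                out[n++] = '\\';   /* escape a literal metacharacter */
            out[n++] = *wild;
        }
    }
    out[n++] = '\\';
    out[n++] = 'z';
    out[n] = '\0';
    return 0;
}
```

In .NET, Regex.Escape performs the escaping, but it also escapes * and ?, so the wildcard replacement would then have to target \* and \? instead.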
|
Hi,
wildcmp("*<*>", "<field1><field2>") returns 1, while I think it should return 0 (I may be wrong, so please tell me).
If someone knows how to fix it, I would appreciate it.
Regards
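Under the usual reading of '*' as "zero or more characters", this pattern does match: the first '*' can match nothing, '<' matches the first character, the second '*' matches "field1><field2", and '>' matches the last character. One way to check that reading is a naive recursive matcher. This is a different and slower technique than wildcmp, shown only as a cross-check, and the name glob_match is hypothetical.

```c
/* Naive recursive wildcard match, intended for cross-checking only.
   '*' matches zero or more characters, '?' matches exactly one.
   Returns 1 on match, 0 otherwise. Can be exponential in the worst case. */
int glob_match(const char *wild, const char *string)
{
    if (*wild == '\0')
        return *string == '\0';

    if (*wild == '*') {
        /* Either the '*' matches nothing, or it absorbs one character. */
        return glob_match(wild + 1, string) ||
               (*string != '\0' && glob_match(wild, string + 1));
    }

    if (*string != '\0' && (*wild == '?' || *wild == *string))
        return glob_match(wild + 1, string + 1);

    return 0;
}
```

Here glob_match("*<*>", "<field1><field2>") returns 1, which agrees with wildcmp: the pattern only requires a '<' somewhere and a '>' at the very end.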
|
I think it's better to make the function return a bool value. Anyway, many string comparison functions return 0 when the strings are equal.
|
If wild = "*?.abc" and str = "abc.abc", wildcmp(wild, str) does not work,
but if wild = "?*.abc" and str = "abc.abc", wildcmp(wild, str) does work.
Does anyone have any idea about this case?
|
Having similar problems with "*Hallo 200? ueberalles*.ddd". It doesn't work. I think that once the first * is finished, it does not expect another wildcard to follow in the pattern.
|
Ignore my last message; as usual, the problem sits in front of the screen. (I mixed a project built with multibyte chars with this code, which works on plain chars only. And of course I used an umlaut instead of 'ue' in my tests, so no wonder it crashed after the '?'.) I'm very sorry!
|
I got the overall flow of the program, but I didn't completely get the logic of the second loop. I understand that the second loop checks whether there is nothing after the *, in which case it is a match; if there is something, it stores the positions in the two pointers and then goes on. Also, the final else goes like this: else { wild = mp; string = cp++; } I'm sorry, but I'm not getting the logic entirely. Can someone please explain?
|
In case you are matching something like the following:
"*.abc" to "ab.de.abc"
In the second loop it looks for the first character in the string that is the same as the character after the asterisk. At first it matches "*" against "ab", since neither 'a' nor 'b' matches the '.'; mp = ".abc" during this. Now wild = ".abc" and string = ".de.abc". On the next iteration the first characters do match (both '.'), so wild becomes "abc" and string becomes "de.abc". On the iteration after that there is no match ('a' versus 'd') and it falls to the else. Here it resets wild to the last mp (mask pattern??) and string to the last cp (character pattern), which is the position where this attempt started WITHOUT THE FIRST CHARACTER. (It also advances cp one position.)
Why does it do this? After matching the * against part of the string and encountering a possible position at which to match the remainder of the pattern, it continued comparing characters from both against each other. This failed. Since there was a * right before the position of mp, it is still allowed to add characters to the part that is matched by that *. Basically, it goes back to that position but decides that the character that occurs in both strings is not the next character in the pattern but part of the '*' wildcard.
In the end it has matched '*' with 'ab.de'.
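The walk-through above can be reproduced with a short sketch in C. It follows the three-loop structure of the ports posted in this thread, with one addition that is not part of the original function: a hypothetical backtracks counter that records how often the final else fires, so the backtracking becomes observable. The name wildcmp_counted is also made up for this example.

```c
#include <stddef.h>

/* Sketch of the three-loop matcher with a backtrack counter added for
   illustration. Pass NULL for backtracks to ignore the count.
   Returns 1 on match, 0 otherwise. */
int wildcmp_counted(const char *wild, const char *string, int *backtracks)
{
    const char *cp = NULL;  /* string position to resume from after a failure */
    const char *mp = NULL;  /* pattern position just after the latest '*' */

    if (backtracks)
        *backtracks = 0;

    /* Loop 1: plain and '?' matching up to the first '*'. */
    while (*string && *wild != '*') {
        if (*wild != *string && *wild != '?')
            return 0;
        wild++;
        string++;
    }

    /* Loop 2: match the rest, backing up to the latest '*' on a mismatch. */
    while (*string) {
        if (*wild == '*') {
            if (!*++wild)
                return 1;          /* trailing '*' matches the rest */
            mp = wild;
            cp = string + 1;
        } else if (*wild == *string || *wild == '?') {
            wild++;
            string++;
        } else {
            wild = mp;             /* retry the pattern after the '*' ...    */
            string = cp++;         /* ... one character further into string */
            if (backtracks)
                ++*backtracks;
        }
    }

    /* Loop 3: trailing '*' may match the empty remainder. */
    while (*wild == '*')
        wild++;
    return !*wild;
}
```

For "*.abc" against "ab.de.abc" this sketch reports a match after five backtracks, one for each character of "ab.de" that ends up absorbed by the '*'.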
|
Hello,
I think this post is very interesting, because the code is very simple and does very cool work!
BUT!
I don't understand why you use 3 loops to do it.
I think I'm not seeing every case, because to me it looks like loop 2 alone does all the work.
I'm trying to understand the whole process so that I can add an optional character with the ^ escape sequence, for example: ^-* matches -12 or 12
Thanks
|
DarkYoda Mickael wrote: I don't understand why you use 3 loops to do it.
I think I'm not seeing every case, because to me it looks like loop 2 alone does all the work.
The third loop:
while (*wild == '*') { wild++; }
is there to take care of trailing *'s. Since * means 0 or more chars, "test*" should match "test" just fine. That loop takes care of this case.
-Jack
There are 10 types of people in this world, those that understand binary and those who don't.
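Put differently, once the text is used up, whatever is left of the pattern still matches only if it consists entirely of '*'. A small sketch of just that final step in C, with the hypothetical name rest_matches_empty:

```c
/* Sketch of the final step: with the text exhausted, the remaining pattern
   matches the empty string only if it is made up solely of '*'.
   Returns 1 if it does, 0 otherwise. */
int rest_matches_empty(const char *wild)
{
    while (*wild == '*')
        wild++;
    return *wild == '\0';
}
```

For "test*" against "test", the first loop consumes "test" and leaves "*", for which this returns 1. For "test?" it would leave "?" and return 0, since '?' still needs one character.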