Comments and Discussions

|
From 2005 heh - oldies but goodies. Thanks for sharing this.

|
This is exactly what I needed. Doing some work with Win32 file functions, this helped with the pattern matching.

|
Excellent work... thank you very much

|
Usually, the wildcard *.* means "everything", not "everything that contains a dot", at least on Windows when wildcards are used with file systems.
So if you want a Windows file-system-style wildcard, you should add somewhere: if (wildcard == "*.*") wildcard = "*";
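A minimal sketch of that suggestion, assuming a hypothetical helper named NormalizeFilePattern that runs before the string-replacement conversion discussed in this thread (reproduced here in its anchored form so the example compiles on its own):

```csharp
using System.Text.RegularExpressions;

public static class FileWildcardSketch
{
    // Hypothetical helper: treat "*.*" as "match everything", as Windows does for file names.
    public static string NormalizeFilePattern(string wildcard)
    {
        return wildcard == "*.*" ? "*" : wildcard;
    }

    // String-replacement conversion, anchored at both ends.
    public static string WildcardToRegex(string pattern)
    {
        return "^" + Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
    }

    // Case-insensitive match, since Windows file names are usually compared that way.
    public static bool IsMatch(string fileName, string wildcard)
    {
        string regex = WildcardToRegex(NormalizeFilePattern(wildcard));
        return Regex.IsMatch(fileName, regex, RegexOptions.IgnoreCase);
    }
}
```

With this in place, FileWildcardSketch.IsMatch("README", "*.*") returns true even though the name contains no dot, while "*.txt" still requires the extension.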

|
Test string:
"she sells sea shells by the sea shore"
Wild card find pattern to type in:
"s*a"
To find in Word:
Open Word 2010 and paste the text.
On the home tab in the far right, select the find drop down and select advanced find. In the resulting dialog type "s*a". Click the more button and check use wildcards. Click find next. It will find "she sells sea" as the first match.
The regex pattern generated for s*a is "^s.*a$"
If you test that regex pattern, it comes back with 0 matches.
The current regex pattern looks like it will only get a match when 's' is at the beginning of the string or line and 'a' is at the end of the string or line.
I'm not too good with regex and could use a solution that would find the pattern anywhere in the string. I've tried a few modifications to the existing regex pattern, but desired result not reached yet.
****Update****
Found what I was looking for.
Changing the code to the following did the trick:
public static string WildcardToRegex(string pattern)
{
    return Regex.Escape(pattern).
        Replace("\\*", ".*?").
        Replace("\\?", ".");
}
Removed '^' and '$', which require the match to start at the beginning and end at the end of the string or line. Changed ".*" to ".*?", turning the greedy quantifier into a lazy one. After that, it will still come back with only 2 matches. To compensate, you could search the string multiple times, bumping the start point of the search each time, like below:
Wildcard w = new Wildcard("s*a", RegexOptions.CultureInvariant);
string s = "she sells sea shells by the sea shore";
int length = s.Length;
int index = 0;
Match m = w.Match(s);
StringBuilder sb = new StringBuilder();
bool bsuccess = m.Success;
while (bsuccess)
{
    sb.AppendLine(m.Value);
    index = s.IndexOf(m.Value, index) + 1;
    if (index < length)
    {
        m = w.Match(s, index);
        bsuccess = m.Success;
    }
    else
        bsuccess = false;
}
which gives 7 results like Word does:
she sells sea
sells sea
s sea
sea
shells by the sea
s by the sea
sea
modified 5-Mar-12 9:30am.

|
Great article!
Would be nice to see some tricky unit-tests for this as well.

|
Hi,
Thanks for the code. I liked it, especially the fact that it was derived from Regex. I use the static Regex methods a lot, so I added these methods to your Wildcard class so that its interface more closely matches .NET's Regex class. Here they are:
public static Match Match(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.Match(input, wildcardPattern, options);
}
public static Match Match(string input, string pattern)
{
    return Match(input, pattern, RegexOptions.None);
}
public static bool IsMatch(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.IsMatch(input, wildcardPattern, options);
}
public static bool IsMatch(string input, string pattern)
{
    return IsMatch(input, pattern, RegexOptions.None);
}
public static MatchCollection Matches(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.Matches(input, wildcardPattern, options);
}
public static MatchCollection Matches(string input, string pattern)
{
    return Matches(input, pattern, RegexOptions.None);
}

|
Are the two-parameter methods for matching wildcards missing the wildcardPattern? Wouldn't they just do a RegEx match?
string wildcardPattern = WildcardToRegex(pattern);
return Regex.BlahBlah(input, wildcardPattern, options);
www.CADbloke.com
The Broadcast Systems Documentation SYSTEM
"The mass of men lead lives of quiet desperation"
-Zen & the Art of Motorcycle Maintenance

|
The method WildcardToRegex has just saved me from having to write a very messy search class. TVM.

|
'.' matches any character in a regular expression, but in a wildcard search it should still be a literal '.'.
I would suggest replacing . with \x2E in the regular expression.
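For reference, here is a small check of how the dot is handled when the pattern goes through Regex.Escape first, as the string-replacement conversion in this thread does. Regex.Escape turns "." into "\.", which matches the same literal character as \x2E. The class and method names are illustrative only.

```csharp
using System.Text.RegularExpressions;

public static class DotEscapeSketch
{
    // Anchored string-replacement conversion; Regex.Escape handles the dot.
    public static string Convert(string wildcard)
    {
        return "^" + Regex.Escape(wildcard)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
    }

    public static bool Matches(string input, string wildcard)
    {
        return Regex.IsMatch(input, Convert(wildcard));
    }

    // The \x2E form suggested above, written out by hand for the pattern "a.b".
    public static bool MatchesHexDot(string input)
    {
        return Regex.IsMatch(input, "^a\\x2Eb$");
    }
}
```

Both forms accept "a.b" and reject "axb", so the two spellings of the literal dot behave the same in this case.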
Ian

|
It is a nice article. I need to add a character specification or limitation, such as an [a-z] or [1-9] pattern. Is that possible?
Thanks in advance
Boopathi.S
modified on Wednesday, October 21, 2009 10:09 AM

|
Hi Developers,
Could you tell me the exact difference between HTTP handlers and HTTP modules, apart from their position in the pipeline processing?
Thanks
GK

|
I'm not too sure.
You'd probably get a better answer on the ASP.NET forums[^].

|
I have released a new version of the RegEx Tester tool. You can download it free from http://www.codeproject.com/KB/string/regextester.aspx and http://sourceforge.net/projects/regextester
With RegEx Tester you can fully develop and test your regular expression against a target text. Its UI is designed to aid you in developing regular expressions. It uses and supports ALL of the features available in the .NET Regex class.

|
Thanks for the useful article!
Below is a little patch in which I have added the matchType parameter to allow more options when adding the ^ or $ characters (see the WildcardMatch enum):
public class Wildcard : Regex
{
    public Wildcard(string pattern, WildcardMatch matchType)
        : base(WildcardToRegex(pattern, matchType))
    {
    }
    public Wildcard(string pattern, RegexOptions options, WildcardMatch matchType)
        : base(WildcardToRegex(pattern, matchType), options)
    {
    }
    public static string WildcardToRegex(string pattern, WildcardMatch matchType)
    {
        string escapedPattern = Regex.Escape(pattern);
        escapedPattern = escapedPattern.Replace("\\*", ".*");
        escapedPattern = escapedPattern.Replace("\\?", ".");
        if (matchType == WildcardMatch.Anywhere)
            return escapedPattern;
        else if (matchType == WildcardMatch.EndsWith)
            return escapedPattern + "$";
        else if (matchType == WildcardMatch.StartsWith)
            return "^" + escapedPattern;
        else if (matchType == WildcardMatch.Exact)
            return "^" + escapedPattern + "$";
        else
            throw new ArgumentOutOfRangeException("matchType");
    }
}
public enum WildcardMatch
{
    Exact = 0,
    Anywhere = 1,
    StartsWith = 2,
    EndsWith = 3
}
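To illustrate what each matchType value does in practice, here is a small self-contained sketch. The names MatchMode and MatchModeSketch are hypothetical stand-ins for the WildcardMatch enum and Wildcard class above, used only so the example compiles on its own; the anchoring logic is the same.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the WildcardMatch enum.
public enum MatchMode { Exact, Anywhere, StartsWith, EndsWith }

public static class MatchModeSketch
{
    public static string ToRegex(string pattern, MatchMode mode)
    {
        string body = Regex.Escape(pattern)
            .Replace("\\*", ".*")
            .Replace("\\?", ".");
        switch (mode)
        {
            case MatchMode.Anywhere: return body;              // no anchors
            case MatchMode.StartsWith: return "^" + body;      // anchored at the start only
            case MatchMode.EndsWith: return body + "$";        // anchored at the end only
            case MatchMode.Exact: return "^" + body + "$";     // must span the whole input
            default: throw new ArgumentOutOfRangeException(nameof(mode));
        }
    }

    public static bool IsMatch(string input, string pattern, MatchMode mode)
    {
        return Regex.IsMatch(input, ToRegex(pattern, mode));
    }
}
```

For the input "abc.dll.tmp" and the pattern "*.dll", Anywhere and StartsWith report a match, while EndsWith and Exact do not, because the input continues past ".dll".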
Davide

|
Correct me if I'm wrong, but according to the syntax of wildcards as I know them:
= Exact
* = StartsWith
* = EndsWith
** = Anywhere
So, what's the use of adding the "WildcardMatch matchType" ?
Philippe Dykmans
Software developpement
Advanced Bionics Corp.

|
Sorry. The HTML parser screwed up my first message because I used the wrong brackets. I meant to say that according to wildcard syntax:
pattern = Exact
*pattern = EndsWith
pattern* = StartsWith
*pattern* = Exact
And therefore I don't see the need for adding the WildcardMatch enum.
Philippe Dykmans
Software developpement
Advanced Bionics Corp.

|
I think it is better, as with regular expressions, because you can declare exactly what you intend.
For example, sometimes I want to check whether the pattern matches the whole text; in that case you can use Exact. (*pattern* is not the same as Exact; with Exact I force the pattern to match from start to end by adding ^ and $.)
But you are probably right about StartsWith and EndsWith; they are not very useful.
Davide

|
Davide,
Thanks for posting your enhancement. It's exactly what I needed for my application!
Tim

|
Good solution.
.NET is a box of never-ending treasures; every day I find another gem.

|
I hope you saw this[^] code right?
I haven't gotten around to fixing the article... hahaha.

|
Let's say the input string is a*bcdef. The wildcard pattern I want to use is a\**d. I want to match a*bcd. The current code gives incorrect results. Can you suggest a way to make it work if the input string contains metacharacters?
Thanks
SK

|
Hmm... this is getting more complicated than I thought
I'll have to look at it a bit more closely over the weekend.

|
Well, I finally took a look at it. The whole idea of just replacing strings doesn't work; the string needs to be parsed properly.
I don't have time to update the article at the moment. In the meantime, here's the new code. Maybe it's still useful.
private static string WildcardToRegex(string wildcard)
{
    StringBuilder sb = new StringBuilder(wildcard.Length + 8);
    sb.Append("^");
    for (int i = 0; i < wildcard.Length; i++)
    {
        char c = wildcard[i];
        switch(c)
        {
            case '*':
                sb.Append(".*");
                break;
            case '?':
                sb.Append(".");
                break;
            case '\\':
                if (i < wildcard.Length - 1)
                    sb.Append(Regex.Escape(wildcard[++i].ToString()));
                break;
            default:
                sb.Append(Regex.Escape(wildcard[i].ToString()));
                break;
        }
    }
    sb.Append("$");
    return sb.ToString();
}
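As a quick check against the a\**d example from the question, here is a condensed, self-contained variant of the parser above. The class name and the anchored parameter are hypothetical additions for illustration: the anchored form tests a whole string, while the unanchored form can pull a*bcd out of a longer input such as a*bcdef.

```csharp
using System.Text;
using System.Text.RegularExpressions;

public static class EscapedWildcardSketch
{
    // 'anchored' is a hypothetical extra parameter, not part of the posted code.
    public static string ToRegex(string wildcard, bool anchored)
    {
        var sb = new StringBuilder(wildcard.Length + 8);
        if (anchored) sb.Append('^');
        for (int i = 0; i < wildcard.Length; i++)
        {
            char c = wildcard[i];
            if (c == '*')
                sb.Append(".*");                 // any run of characters
            else if (c == '?')
                sb.Append('.');                  // exactly one character
            else if (c == '\\' && i < wildcard.Length - 1)
                sb.Append(Regex.Escape(wildcard[++i].ToString())); // escaped literal
            else if (c != '\\')                  // a trailing lone backslash is dropped
                sb.Append(Regex.Escape(c.ToString()));
        }
        if (anchored) sb.Append('$');
        return sb.ToString();
    }

    // Returns the first unanchored match in the input, or null if there is none.
    public static string FindFirst(string input, string wildcard)
    {
        Match m = Regex.Match(input, ToRegex(wildcard, false));
        return m.Success ? m.Value : null;
    }
}
```

Here ToRegex(@"a\**d", true) produces ^a\*.*d$, which accepts a*bcd but rejects abcd, and FindFirst("a*bcdef", @"a\**d") returns a*bcd.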

|
Fixed using more precise replacing:
string s = "^" + Regex.Escape(pattern) + "$";
s = Regex.Replace(s, @"(?<!\\)\\\*", @".*");
s = Regex.Replace(s, @"\\\\\\\*", @"\*");
s = Regex.Replace(s, @"(?<!\\)\\\?", @".");
s = Regex.Replace(s, @"\\\\\\\?", @"\?");
return Regex.Replace(s, @"\\\\\\\\", @"\\");
This preserves all \*, \? and \\ occurrences

|
You may wish to consider adding "$" to the end of the Regex to get behavior that fully matches normal wildcard searching (maybe ^ at the beginning as well).
As presented, a pattern of "*.dll" will find files named "abc.dll.tmp", for example. Hmmmm ... I wonder why I have files of that form in my WINNT\System32 directory...TBD.
Otherwise, a neat trick. I'd like to see more short, useful, items on CodeProject.

|
Inquiring minds might like to know that performing the equivalent file name pattern matching using the VB.Net Like operator does the same match in 60 per cent of the time. Regexs are very powerful, but not without a cost.
I assumed that there was a cost to using Regex for simple matches, so I got inspired to try it out. Run each over 2700 file names (about 1300 matches) 100 times, throw out the top and bottom 10 scores: Like wins by 40 per cent.
Average times (ignoring case in both, compiling the Regex on initiation of the Wildcard class):
Like = 0.0083 seconds
Regex = 0.0132 seconds
Either one gets over 2700 comparisons very quickly.
Jim Parsells

|
No doubt, regexes are not the most efficient thing in the world. Have you tried creating it with RegexOptions.Compiled? That should be a tad better.
There's a C++ version and C# port of the Match function here[^], which you'd probably want to use if performance is crucial. I should probably have mentioned this in the article.
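A rough sketch of what that looks like, assuming the anchored string-replacement conversion discussed in this thread; the class and method names are made up for the example. The idea is to build the Regex once with RegexOptions.Compiled and reuse it across all the file names rather than re-creating it per comparison.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CompiledWildcardSketch
{
    // Build the regex once; RegexOptions.Compiled costs more up front
    // but is intended to make repeated matching faster.
    public static Regex Build(string wildcard)
    {
        string pattern = "^" + Regex.Escape(wildcard)
            .Replace("\\*", ".*")
            .Replace("\\?", ".") + "$";
        return new Regex(pattern, RegexOptions.Compiled | RegexOptions.IgnoreCase);
    }

    // Count how many names match, reusing the same Regex instance.
    public static int CountMatches(IEnumerable<string> fileNames, Regex regex)
    {
        int count = 0;
        foreach (string name in fileNames)
        {
            if (regex.IsMatch(name)) count++;
        }
        return count;
    }
}
```

Whether this closes the gap with other approaches depends on the workload, so it is worth timing with your own file list.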
Thanks!

|
Actually, I did use RegexOptions.Compiled. I was just idly curious, and had been for a while, about quantifying the difference. Seeing your article just prompted me to do that.
Bottom line: either approach is plenty fast enough for limited use. However, if you are not using Regex otherwise, why pay for the additional DLL loads? Each method has its place, but Regex will do so much more, if you need that functionality.
Jim Parsells

|
Hmm, it'd be interesting to produce IL based on regexes the way RegexOptions.Compiled does... that'd beat everything, though probably not for much gain.

|
Ah, yeah, good point. That's definitely crucial. I'll change that.
Thanks!

|
Pretty cool trick you pulled with this one. Nice and easy. I like it.
|
Ever wondered how to do wildcards in C#?
| Type | Article |
| Licence | |
| First Posted | 6 Sep 2005 |
| Views | 114,656 |
| Bookmarked | 53 times |