Test string:
"she sells sea shells by the sea shore"
Wildcard find pattern to type in:
"s*a"
To find in Word:
Open Word 2010 and paste the text.
On the Home tab, at the far right, open the Find drop-down and select Advanced Find. In the resulting dialog, type "s*a". Click the More button and check "Use wildcards". Click Find Next. It will find "she sells sea" as the first match.
The regex pattern generated for s*a is "^s.*a$".
If you test that regex pattern, it comes back with 0 matches.
The current regex pattern looks like it will only get a match when 's' is at the beginning of the string or line and 'a' is at the end of the string or line.
I'm not too good with regex and could use a solution that would find the pattern anywhere in the string. I've tried a few modifications to the existing regex pattern, but haven't reached the desired result yet.
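A small sketch of the behaviour described above, counting matches for the anchored pattern and for the same pattern without anchors. The class and method names (AnchorDemo, CountMatches) are made up for illustration and are not part of the article's code.

```csharp
using System;
using System.Text.RegularExpressions;

static class AnchorDemo
{
    // Hypothetical helper: number of non-overlapping matches of a regex in the input.
    public static int CountMatches(string input, string regexPattern)
    {
        return Regex.Matches(input, regexPattern).Count;
    }

    static void Main()
    {
        string s = "she sells sea shells by the sea shore";

        // Anchored: the whole string must start with 's' and end with 'a'.
        Console.WriteLine(CountMatches(s, "^s.*a$")); // 0

        // Unanchored but greedy: one long match from the first 's' to the last 'a'.
        Console.WriteLine(CountMatches(s, "s.*a"));   // 1
    }
}
```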
****Update****
Found what I was looking for.
Changing the code to the following did the trick:
public static string WildcardToRegex(string pattern)
{
    return Regex.Escape(pattern).
        Replace("\\*", ".*?").
        Replace("\\?", ".");
}
I removed '^' and '$', which require the match to start at the beginning and finish at the end of the string or line. I also changed ".*" to ".*?", turning the greedy quantifier into a lazy one.
After that, it will still come back with only 2 matches, because each search resumes after the end of the previous match and so overlapping matches are skipped. To compensate for that, you can search the string multiple times, bumping the start point of the search each time, like below:
Wildcard w = new Wildcard("s*a", RegexOptions.CultureInvariant);
string s = "she sells sea shells by the sea shore";
StringBuilder sb = new StringBuilder();
Match m = w.Match(s);
while (m.Success)
{
    sb.AppendLine(m.Value);
    // Restart one character past the start of this match
    // so that overlapping matches are found as well.
    int next = m.Index + 1;
    if (next >= s.Length)
        break;
    m = w.Match(s, next);
}
which gives 7 results like Word does:
she sells sea
sells sea
s sea
sea
shells by the sea
s by the sea
sea
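As an alternative sketch (not what the post above does), the same overlapping results can usually be collected in a single pass by wrapping the pattern in a lookahead with a capture group. The lookahead consumes no characters, so the engine retries at every position. OverlapDemo and OverlappingMatches are hypothetical names, and this assumes the lazy pattern "s.*?a" produced by the modified WildcardToRegex.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class OverlapDemo
{
    // Hypothetical helper: returns every match of regexPattern, including overlapping ones.
    public static List<string> OverlappingMatches(string input, string regexPattern)
    {
        var results = new List<string>();
        // The lookahead itself is zero-width; group 1 holds the text it saw.
        var lookahead = new Regex("(?=(" + regexPattern + "))", RegexOptions.CultureInvariant);
        foreach (Match m in lookahead.Matches(input))
            results.Add(m.Groups[1].Value);
        return results;
    }

    static void Main()
    {
        string s = "she sells sea shells by the sea shore";
        foreach (string hit in OverlappingMatches(s, "s.*?a"))
            Console.WriteLine(hit);
    }
}
```

For the test string this should list the same 7 results as the loop.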
modified 5 Mar '12.
Great article!
Would be nice to see some tricky unit-tests for this as well.
Hi,
Thanks for the code. I liked it, especially the fact that it is derived from Regex. I use the static Regex methods a lot, so I added the following methods to your Wildcard class so that its interface more closely matches .NET's Regex class:
public static Match Match(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.Match(input, wildcardPattern, options);
}

public static Match Match(string input, string pattern)
{
    return Match(input, pattern, RegexOptions.None);
}

public static bool IsMatch(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.IsMatch(input, wildcardPattern, options);
}

public static bool IsMatch(string input, string pattern)
{
    return IsMatch(input, pattern, RegexOptions.None);
}

public static MatchCollection Matches(string input, string pattern, RegexOptions options)
{
    string wildcardPattern = WildcardToRegex(pattern);
    return Regex.Matches(input, wildcardPattern, options);
}

public static MatchCollection Matches(string input, string pattern)
{
    // Call the wildcard overload so the pattern is converted before matching.
    return Matches(input, pattern, RegexOptions.None);
}
The method WildcardToRegex has just saved me from having to write a very messy search class. TVM.
'.' matches any character in a regular expression, but in a wildcard search it should still be a literal '.'.
I would suggest replacing . with \x2E in the regular expression.
Ian
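A quick way to check this concern: as far as I know, Regex.Escape already escapes '.', so a conversion built on Escape followed by Replace (as quoted earlier in this thread) should treat the dot literally. The sketch below assumes that shape of conversion; EscapeDemo and ToRegex are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

static class EscapeDemo
{
    // Assumed shape of the conversion: escape everything, then re-enable * and ?.
    public static string ToRegex(string wildcard)
    {
        return "^" + Regex.Escape(wildcard).Replace(@"\*", ".*").Replace(@"\?", ".") + "$";
    }

    static void Main()
    {
        string regex = ToRegex("*.txt");
        Console.WriteLine(regex);                              // ^.*\.txt$
        Console.WriteLine(Regex.IsMatch("notes.txt", regex));  // True
        Console.WriteLine(Regex.IsMatch("notestxt", regex));   // False: the dot is literal
    }
}
```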
It is a nice article. I need to add a character specification or limitation such as [a-z] or [1-9] to the pattern. Is that possible?
Thanks in advance,
Boopathi.S
modified on Wednesday, October 21, 2009 10:09 AM
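One possible way to support this, sketched under assumptions rather than taken from the article: walk the wildcard character by character and copy any bracketed set such as [a-z] through to the regex unchanged. ClassWildcard and ToRegex are hypothetical names, and the set's contents are passed through verbatim, so anything regex would accept inside brackets is accepted here too.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class ClassWildcard
{
    // Hypothetical converter: *, ? and [set] are wildcards, everything else is literal.
    public static string ToRegex(string wildcard)
    {
        var sb = new StringBuilder("^");
        for (int i = 0; i < wildcard.Length; i++)
        {
            char c = wildcard[i];
            if (c == '*') sb.Append(".*");
            else if (c == '?') sb.Append('.');
            else if (c == '[')
            {
                int close = wildcard.IndexOf(']', i + 1);
                if (close > i + 1)
                {
                    // Copy the non-empty set, brackets included, as-is.
                    sb.Append(wildcard, i, close - i + 1);
                    i = close;
                }
                else sb.Append(@"\["); // no usable closing bracket: treat '[' literally
            }
            else sb.Append(Regex.Escape(c.ToString()));
        }
        return sb.Append('$').ToString();
    }

    static void Main()
    {
        string regex = ToRegex("file[0-9].txt");
        Console.WriteLine(regex);                              // ^file[0-9]\.txt$
        Console.WriteLine(Regex.IsMatch("file7.txt", regex));  // True
        Console.WriteLine(Regex.IsMatch("fileA.txt", regex));  // False
    }
}
```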
Hi Developers,
Could you tell me the exact difference between HTTP handlers and HTTP modules, apart from their position in the processing pipeline?
Thanks
GK
I'm not too sure.
You'd probably get a better answer on the ASP.NET forums.
I have released a new version of the RegEx Tester tool. You can download it free from http://www.codeproject.com/KB/string/regextester.aspx and http://sourceforge.net/projects/regextester
With RegEx Tester you can fully develop and test your regular expression against a target text. Its UI is designed to aid you in developing regular expressions. It uses and supports ALL of the features available in the .NET Regex class.
Thanks for the useful article!
Below is a little patch in which I have added a matchType parameter to allow more options when adding the ^ or $ characters (see the WildcardMatch enum):
Davide
Correct me if I'm wrong, but according to the syntax of wildcards as I know them:
= Exact
* = StartsWith
* = EndsWith
** = Anywhere
So, what's the use of adding the "WildcardMatch matchType" ?
Philippe Dykmans
Software developpement
Advanced Bionics Corp.
Sorry. The HTML parser screwed up my first message because I used the wrong brackets. I meant to say that according to wildcard syntax:
pattern = Exact
*pattern = EndsWith
pattern* = StartsWith
*pattern* = Anywhere
And therefore I don't see the need for adding the WildcardMatch enum.
Philippe Dykmans
Software developpement
Advanced Bionics Corp.
I think that with some regular expressions it is better, because you can declare your intentions exactly.
For example, sometimes I want to check whether the pattern matches the whole text, and in that case you can use Exact. (*pattern* is not the same as Exact; with Exact I force the pattern to match from start to end by adding ^ and $.)
But you are probably right about StartsWith and EndsWith; they are not very useful.
Davide
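The patch itself is not shown in this thread, so the following is only a guess at what such a matchType parameter might look like, not Davide's actual code. The enum member names are taken from the discussion above; MatchTypeDemo and ToRegex are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

// Member names follow the thread; the real enum may differ.
public enum WildcardMatch { Exact, StartsWith, EndsWith, Anywhere }

static class MatchTypeDemo
{
    // Hypothetical converter: the match type decides which anchors are added.
    public static string ToRegex(string pattern, WildcardMatch matchType)
    {
        string body = Regex.Escape(pattern).Replace(@"\*", ".*").Replace(@"\?", ".");
        switch (matchType)
        {
            case WildcardMatch.Exact:      return "^" + body + "$";
            case WildcardMatch.StartsWith: return "^" + body;
            case WildcardMatch.EndsWith:   return body + "$";
            default:                       return body; // Anywhere: no anchors
        }
    }

    static void Main()
    {
        Console.WriteLine(ToRegex("s*a", WildcardMatch.Exact));    // ^s.*a$
        Console.WriteLine(ToRegex("s*a", WildcardMatch.Anywhere)); // s.*a
        Console.WriteLine(Regex.IsMatch("a sofa b", ToRegex("s*a", WildcardMatch.Exact)));    // False
        Console.WriteLine(Regex.IsMatch("a sofa b", ToRegex("s*a", WildcardMatch.Anywhere))); // True
    }
}
```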
Davide,
Thanks for posting your enhancement. It's exactly what I needed for my application!
Tim
Good solution.
.NET is a box of never-ending treasures; every day I find another gem.
I hope you saw this code, right?
I haven't gotten around to fixing the article... hahaha.
Let's say the input string is a*bcdef. The wildcard pattern I want to use is a\**d. I want to match a*bcd. The current code gives incorrect results. Can you suggest a way to make it work if the input string contains metacharacters?
Thanks
SK
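To make the problem concrete, here is a sketch of what an Escape-then-Replace conversion (the approach quoted earlier in this thread, assumed here to add ^ and $) produces for this pattern. NaiveDemo and ToRegex are illustrative names only.

```csharp
using System;
using System.Text.RegularExpressions;

static class NaiveDemo
{
    // Assumed shape of the current conversion: escape everything, then re-enable * and ?.
    public static string ToRegex(string wildcard)
    {
        return "^" + Regex.Escape(wildcard).Replace(@"\*", ".*").Replace(@"\?", ".") + "$";
    }

    static void Main()
    {
        string regex = ToRegex(@"a\**d");
        // The user's backslash becomes a literal backslash and both stars become wildcards.
        Console.WriteLine(regex);                          // ^a\\.*.*d$
        Console.WriteLine(Regex.IsMatch("a*bcd", regex));  // False, although a match was intended
    }
}
```

The escaped star is lost because the string replacement cannot tell a `\*` that came from the user's backslash apart from one introduced by Regex.Escape.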
Hmm... this is getting more complicated than I thought.
I'll have to look at it a bit more closely over the weekend.
Well, I finally took a look at it. The whole idea of just replacing strings doesn't work; the string needs to be parsed properly.
I don't have time to update the article at the moment. In the meantime, here's the new code. Maybe it's still useful.
private static string WildcardToRegex(string wildcard)
{
    StringBuilder sb = new StringBuilder(wildcard.Length + 8);
    sb.Append("^");
    for (int i = 0; i < wildcard.Length; i++)
    {
        char c = wildcard[i];
        switch (c)
        {
            case '*':
                sb.Append(".*");
                break;
            case '?':
                sb.Append(".");
                break;
            case '\\':
                // A backslash makes the next character literal.
                if (i < wildcard.Length - 1)
                    sb.Append(Regex.Escape(wildcard[++i].ToString()));
                break;
            default:
                sb.Append(Regex.Escape(wildcard[i].ToString()));
                break;
        }
    }
    sb.Append("$");
    return sb.ToString();
}
Fixed using more precise replacing:
string s = "^" + Regex.Escape(pattern) + "$";
s = Regex.Replace(s, @"(?<!\\)\\\*", @".*");
s = Regex.Replace(s, @"\\\\\\\*", @"\*");
s = Regex.Replace(s, @"(?<!\\)\\\?", @".");
s = Regex.Replace(s, @"\\\\\\\?", @"\?");
return Regex.Replace(s, @"\\\\\\\\", @"\\");
This preserves all \*, \? and \\ occurrences.