Regular Expression: (?<=\D|^)(1[358]\d{9})(?=\D|$)

When I use Boost.Regex:

C++
char pat[] = {"(?<=\\D|^)(1[358]\\d{9})(?=\\D|$)"};
boost::regex reg(pat);


Constructing the regex here throws an exception!

If I use POSIX syntax, there is no exception, but the pattern doesn't match. For example:
boost::regex reg(pat, regbase::basic_syntax_group);
std::string str("+86 18601234567 tel.");

bool matched = regex_match(str, reg); // false!!!

Can anyone tell me what's wrong? Please send email to jianggx@hotmail.com, thanks!

What I have tried:

When I use C#, it works.
For example:

C#
string pat = @"(?<=\D|^)(1[358]\d{9})(?=\D|$)";
string myinfo= "custom name: name1;tel:+86 18601234567;addr:beijing.";
Regex rg = new Regex(pat);
MatchCollection mc = rg.Matches(myinfo);
Updated 30-Apr-16 4:25am
Comments
PIEBALDconsult 17-Apr-16 13:04pm
   
What is the Exception? Please use "Improve question" to add context and detail.
And don't put your email address in a public forum unless you enjoy spam.
Sergey Alexandrovich Kryukov 17-Apr-16 13:51pm
   
The expression is shown; and the bug is obvious: '?'. I answered the question.
—SA
Sergey Alexandrovich Kryukov 17-Apr-16 13:57pm
   
Thank you for your note; I removed my solution. The engine I tried did not understand '?'. I'll try another one, a very comprehensive ECMAScript engine...

Tried... It also doesn't work; it reports "invalid group". Removing or escaping '?' makes it valid.

My conclusion would be: this feature is poorly supported by regex engines and is better avoided. What do you think?

—SA
PIEBALDconsult 17-Apr-16 14:09pm
   
My understanding is that .NET has a lot of "extra" features that maybe should be avoided if one wants to support multiple engines, yes.
I don't know of an actual standard for RegEx.
Sergey Alexandrovich Kryukov 17-Apr-16 14:40pm
   
That's right. Anyway, I tried Mozilla, Opera and Chrome. All show the same error related to '?'.
Note that these engines are very advanced; they support ECMAScript regular expressions, with Unicode and other advanced features.
—SA

1 solution

Sergey was close: (?<=\D|^) is called a positive lookbehind assertion. It isn't supported in POSIX syntax, nor in some other engines, such as the ECMAScript ones tried in the comments. C# supports both positive and negative lookbehind.
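Since the comments ask what the exception actually is, here is a minimal sketch of how one might surface the message. It uses std::regex as a stand-in, on the assumption that Boost is not at hand; the default ECMAScript grammar of std::regex has no lookbehind, so the original pattern is rejected at construction. The helper name compiles is hypothetical. With Boost, the same try/catch shape should apply, since boost::regex_error derives from std::runtime_error, but the message text will differ.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: try to compile a pattern with std::regex.
// Returns true on success; on failure, stores the exception text in `error`.
bool compiles(const std::string& pattern, std::string& error)
{
    try {
        std::regex re(pattern);   // default grammar: ECMAScript, no lookbehind
        error.clear();
        return true;
    } catch (const std::regex_error& e) {
        error = e.what();         // implementation-defined message
        return false;
    }
}
```

Calling compiles with "(?<=\\D|^)(1[358]\\d{9})(?=\\D|$)" should return false and fill in the error text, while the same pattern without the leading lookbehind group compiles.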

If you have to use POSIX syntax, you will need to rework the expression to exclude the positive lookbehind. Nothing is broken; it's a feature support issue.
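One possible rework, as a sketch rather than the only way to do it: replace the lookbehind with a non-capturing group that consumes the leading boundary, and read the phone number from capture group 1 instead of from the whole match. The example below uses std::regex (an assumption, chosen so it runs without Boost), and find_numbers is a hypothetical helper name. The pattern uses only a non-capturing group and a lookahead, so Boost's default Perl syntax should accept it as well; plain POSIX syntax has no \d, \D or lookahead, so it would need a further rewrite.

```cpp
#include <regex>
#include <string>
#include <vector>

// Lookbehind-free variant of the original pattern: the leading boundary
// (start of input or a non-digit) is consumed by a non-capturing group,
// and the number itself is capture group 1.
std::vector<std::string> find_numbers(const std::string& text)
{
    static const std::regex re("(?:^|\\D)(1[358]\\d{9})(?=\\D|$)");
    std::vector<std::string> found;
    // Iterate over every match in the text (search, not whole-string match).
    for (std::sregex_iterator it(text.begin(), text.end(), re), end; it != end; ++it)
        found.push_back((*it)[1].str());
    return found;
}
```

Note that regex_match in the question returns false for a separate reason: it only succeeds when the pattern matches the entire string. To find a number inside a longer string such as "+86 18601234567 tel.", regex_search or a regex iterator is needed, as above.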
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



