See more: RegEx
Hello, I'm hoping somebody can kick-start my brain on regular expressions. I have read The 30 Minute Regex Tutorial, but I'm still stuck.

My requirement is to take a string like this:

public abc.def.ghi.MYDATA1 MYDATA2;

and recognise it by:

1. it starts with public
2. it contains .def.ghi, but abc may differ.

I need to extract:

1. MYDATA1
2. MYDATA2

So:

\bpublic\b finds the word public: \bpublic\b
.* covers the next unknown characters: \bpublic\b.*
def.ghi. is the next known part (this could be wrong from here): \bpublic\b.*(def\.ghi\.)

And now I have hit a block: the tester at RegexPlanet tells me I have already gone wrong.

Help!
Posted 20-Feb-13 12:55pm

Solution 1

To parse source code, you should:
  • treat the whole file as "singleline", i.e. the Regex should not treat new lines as special characters
  • allow whitespace between language tokens (all these \s* look a bit ugly, but they are necessary to catch all entries)
  • tokenize carefully, e.g. if you only expect unescaped identifiers, use \w+; otherwise you need to be more creative ;-)

using System;
using System.Text.RegularExpressions;

string text = "..."; // file content
                  // public   Word   .   def    .   ghi    .   Type    Name    ;
string pattern = @"\bpublic\s+\w+\s*\.\s*def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";
foreach(Match m in Regex.Matches(text, pattern, RegexOptions.Singleline))
{
    Console.WriteLine("1. {0}", m.Groups[1].Value);
    Console.WriteLine("2. {0}", m.Groups[2].Value);
}

If you have multiple "Word." layers (e.g. A.B.C.def.ghi...), you may extend the pattern as follows:
string pattern = @"\bpublic\s+(?:\w+\s*\.\s*)+def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";
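
As a quick check of the extended pattern, here is a minimal sketch that runs it against a made-up declaration with a three-level prefix and line breaks between the tokens. The sample text and the helper name ExtractFields are assumptions for illustration only, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical sample input: extra "Word." layers, plus spaces and
// newlines between tokens. The private line should not be reported.
string text = "public A.B.C . def\n    .ghi.MYDATA1\n    MYDATA2 ;\nprivate x.def.ghi.Skip Me;";

foreach (var (type, name) in ExtractFields(text))
{
    Console.WriteLine("1. {0}", type);
    Console.WriteLine("2. {0}", name);
}

// Hypothetical helper: returns (Type, Name) for every matching public declaration.
static List<(string Type, string Name)> ExtractFields(string source)
{
    const string pattern =
        @"\bpublic\s+(?:\w+\s*\.\s*)+def\s*\.\s*ghi\s*\.\s*(\w+)\s+(\w+)\s*;";
    var results = new List<(string Type, string Name)>();
    foreach (Match m in Regex.Matches(source, pattern))
    {
        results.Add((m.Groups[1].Value, m.Groups[2].Value));
    }
    return results;
}
```

RegexOptions.Singleline is left out of this sketch because it only changes how an unescaped . matches, and the pattern contains none; \s already matches line breaks either way, so passing the option is harmless but not required.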

Cheers
Andi

Solution 2

How about something like this?

public\s+(?'space'\w+\.\w+\.\w+)\.(?'type'\w+)\s+(?'name'\w+);

I prefer to use named capture groups to extract pieces.
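
For readers who have not used named groups in .NET, here is a minimal sketch of how the values might be read back by name. It applies the pattern above to the sample line from the question; the variable names are illustrative, and the snippet assumes C# 9 or later (top-level statements).

```csharp
using System;
using System.Text.RegularExpressions;

// Pattern from this solution. .NET accepts both (?'name'...) and
// (?<name>...) as named-group syntax.
string pattern = @"public\s+(?'space'\w+\.\w+\.\w+)\.(?'type'\w+)\s+(?'name'\w+);";
string line = "public abc.def.ghi.MYDATA1 MYDATA2;";

Match m = Regex.Match(line, pattern);
if (m.Success)
{
    // Groups are looked up by name instead of by position.
    Console.WriteLine("space: {0}", m.Groups["space"].Value);
    Console.WriteLine("type:  {0}", m.Groups["type"].Value);
    Console.WriteLine("name:  {0}", m.Groups["name"].Value);
}
```

Note that this pattern accepts any three-part prefix. If the middle parts must be def.ghi as in the question, one option would be to write the first group as (?'space'\w+\.def\.ghi) instead.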

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



Last Updated 21 Feb 2013