Dear All,
I have a string like the one below:

C#
String x= "SLA - 030', '<description>' (IfcSlab)";


I need to tokenize that string into 3 parts:
1st part - SLA - 030
2nd part - <description>
3rd part - IfcSlab
and store those values in 3 variables. Please help me do this.

Thanks
Ruwan Atapattu
Updated 28-Nov-13 8:18am
Comments
What have you tried?
Mehdi Gholam 28-Nov-13 14:20pm    
You will have to define the rules for the string breakup.
ridoy 28-Nov-13 15:17pm    
Regex is certainly a better option to do that.

Assuming the following grammar:
line   = record "," tag arg
record = any-char-except-tick+ "'"
tag    = "'" any-char-except-tick+ "'"
arg    = "(" any-char-except-lparen* ")"


Grammar Part              Regex        Comment
full line                 ^...$        full line from beginning (^) to end ($)
separating white spaces   \s*          zero or more spaces, tabs, etc.
comma                     ,            one comma
record                    (.+?)'       at least one char (.+), as few as possible (?), in a capturing group ((...)), directly followed by a tick (')
tag                       '(.+?)'      some text enclosed in ticks ('...'); the enclosed text is captured in a group ((...)) and must consist of at least one character (.+), matching as few as possible (?)
arg                       \((.*?)\)    some text enclosed in parentheses (\(...\)); the enclosed text is captured in a group ((...)) and may consist of zero or more characters (.*), matching as few as possible (?)


It's easier to write the regex pattern as a verbatim string literal (@"..."), since the backslash has no special meaning in verbatim string literals and does not have to be doubled.
C#
var match = Regex.Match(x, @"^\s*(.+?)'\s*,\s*'(.+?)'\s*\((.*?)\)\s*$");


match.Groups[1] contains the record
match.Groups[2] contains the tag
match.Groups[3] contains the arg

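(Hint: a group's index is the position of its opening parenthesis among the unescaped, capturing left parentheses in the pattern, counted from 1. An escaped \( does not count, and Groups[0] is the whole match.)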
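
To show how the three groups could end up in three variables, here is a minimal sketch. The class and method names (`SlabLineParser`, `TryParse`) and the variable names are made up for illustration and are not from the question; the pattern is the one above, and it assumes the input always follows the grammar given earlier.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper; the name is an assumption, not part of the original post.
public static class SlabLineParser
{
    // Same pattern as above, written as a verbatim string literal.
    private static readonly Regex Pattern =
        new Regex(@"^\s*(.+?)'\s*,\s*'(.+?)'\s*\((.*?)\)\s*$");

    // Returns false (and empty strings) when the input does not follow the grammar.
    public static bool TryParse(string line, out string recordText, out string tagText, out string argText)
    {
        recordText = tagText = argText = string.Empty;

        Match match = Pattern.Match(line);
        if (!match.Success)
            return false;

        recordText = match.Groups[1].Value; // record
        tagText = match.Groups[2].Value;    // tag
        argText = match.Groups[3].Value;    // arg
        return true;
    }

    public static void Main()
    {
        string x = "SLA - 030', '<description>' (IfcSlab)";
        if (TryParse(x, out string recordText, out string tagText, out string argText))
        {
            Console.WriteLine(recordText); // SLA - 030
            Console.WriteLine(tagText);    // <description>
            Console.WriteLine(argText);    // IfcSlab
        }
        else
        {
            Console.WriteLine("Input does not match the expected format.");
        }
    }
}
```

Checking `match.Success` before reading the groups avoids silently working with empty strings when a line has a different shape.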

Cheers
Andi
 
 
Try a Regular Expression and capture the parts in groups.

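Here's one that seems to work with the provided input and output:
.*"(?'Part1'[^',]*).*'(?'Part2'[^']*)'.*\((?'Part3'[^\)]*)\)

Note that the leading .*" expects the opening double quote of the C# literal to be part of the input; drop it when matching the string value itself.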

To learn more about Regular Expressions, look into Expresso, and/or have a look at my RegexTester.
 
 
Try a Regex:
(?<first>.*)'[^']+'(?<second>.*)'[^\(]*\((?<third>.*)\)

It generates three groups:
First:  SLA - 030
Second: <description>
Third:  IfcSlab

You should replace the group names with more descriptive ones though.
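
As a rough sketch of what more descriptive group names could look like, the example below uses `record`, `tag` and `arg` (borrowed from the grammar in the first answer) and reads the groups by name. The class and method names (`NamedGroupDemo`, `SplitParts`) are hypothetical, and the pattern otherwise keeps the same structure as the one above.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; the names are assumptions for illustration only.
public static class NamedGroupDemo
{
    // Returns the three parts, or null when the input does not match.
    public static string[] SplitParts(string input)
    {
        // Same structure as the pattern above; only the group names differ.
        Match m = Regex.Match(input, @"(?<record>.*)'[^']+'(?<tag>.*)'[^\(]*\((?<arg>.*)\)");
        if (!m.Success)
            return null;

        return new[]
        {
            m.Groups["record"].Value,
            m.Groups["tag"].Value,
            m.Groups["arg"].Value
        };
    }

    public static void Main()
    {
        string[] parts = SplitParts("SLA - 030', '<description>' (IfcSlab)");
        if (parts != null)
        {
            Console.WriteLine(parts[0]); // SLA - 030
            Console.WriteLine(parts[1]); // <description>
            Console.WriteLine(parts[2]); // IfcSlab
        }
    }
}
```

Reading groups by name keeps the calling code stable if the pattern later gains or loses a pair of parentheses.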
 
Hi Ruwan

Try this code..



C#
string x = "SLA - 030', '<description>' (IfcSlab)";

// Everything before the first tick.
string part1 = x.Substring(0, x.IndexOf('\''));
// From '<' through '>' inclusive, without the surrounding ticks.
string part2 = x.Substring(x.IndexOf('<'), x.IndexOf('>') - x.IndexOf('<') + 1);
// The text between '(' and ')'.
string part3 = x.Substring(x.IndexOf('(') + 1, x.IndexOf(')') - (x.IndexOf('(') + 1));
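
IndexOf returns -1 when a character is missing, and Substring then throws, so a guarded variant may be worth considering. The sketch below is one possible way to do that; the `IndexOfDemo` class and its `Between` helper are hypothetical names, and it assumes the first '<', '>' and '(' , ')' in the string are the ones delimiting the parts.

```csharp
using System;

// Hypothetical demo class; not part of the original answer.
public static class IndexOfDemo
{
    // Returns the text between the first `open` and the next `close` after it,
    // or null when either character is missing.
    public static string Between(string s, char open, char close)
    {
        int start = s.IndexOf(open);
        if (start < 0) return null;

        int end = s.IndexOf(close, start + 1);
        if (end < 0) return null;

        return s.Substring(start + 1, end - start - 1);
    }

    public static void Main()
    {
        string x = "SLA - 030', '<description>' (IfcSlab)";

        int tick = x.IndexOf('\'');
        string part1 = tick < 0 ? null : x.Substring(0, tick);

        // Between strips the delimiters, so the angle brackets are added back.
        string inner = Between(x, '<', '>');
        string part2 = inner == null ? null : "<" + inner + ">";

        string part3 = Between(x, '(', ')');

        Console.WriteLine(part1); // SLA - 030
        Console.WriteLine(part2); // <description>
        Console.WriteLine(part3); // IfcSlab
    }
}
```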
 
 
Try this:

C#
string[] delimiters = { "'", ", ", " (", ")" };
string text = "SLA - 030', '<description>' (IfcSlab)";
// words: "SLA - 030", "<description>", "IfcSlab"
string[] words = text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
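
Since the question asks for three variables, here is a small sketch of how the array from Split could be unpacked. The names `SplitDemo` and `Tokenize` are made up for illustration, and the length check is an added assumption about how unexpected input should be handled.

```csharp
using System;

// Hypothetical demo class; the names are assumptions, not from the original answer.
public static class SplitDemo
{
    public static string[] Tokenize(string text)
    {
        // Same delimiters as above: tick, comma plus space, space plus '(', and ')'.
        string[] delimiters = { "'", ", ", " (", ")" };
        return text.Split(delimiters, StringSplitOptions.RemoveEmptyEntries);
    }

    public static void Main()
    {
        string[] words = Tokenize("SLA - 030', '<description>' (IfcSlab)");

        // Only unpack when exactly three parts came back.
        if (words.Length == 3)
        {
            string part1 = words[0];
            string part2 = words[1];
            string part3 = words[2];
            Console.WriteLine(part1); // SLA - 030
            Console.WriteLine(part2); // <description>
            Console.WriteLine(part3); // IfcSlab
        }
        else
        {
            Console.WriteLine("Unexpected number of parts: " + words.Length);
        }
    }
}
```

One caveat with this approach: if a part itself contains one of the delimiters (for example a record with ", " in it), Split produces more than three entries, which is what the length check is meant to catch.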
 

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)