I have some strings like this:

Mr. [[husband name]] is married to Mrs. [[wife name]] and they have [[num kids]] kids

I need a pattern that will pull out the text that's both in the brackets and the text that's not in the brackets so my end result is like this:

"Mr. "
"[[husband name]]"
" is married to Mrs. "
"[[wife name]]"
" and they have "
"[[num kids]]"
" kids"

The pattern below grabs the stuff in the brackets just fine (bracketed text will always only be letters, spaces and numbers) but I haven't been able to come up with the OR part that gets the text that's not in the brackets (this text can be made up of any characters).

\[\[[a-zA-Z0-9\s]*\]\]

I think after I get the pattern right I should be able to use that pattern to read the individual broken down strings into an array that I can loop through so I can build the finished string by replacing the bracketed text with some field values. I'm not entirely sure how to get the pattern matches into the array. I tried a few things that didn't seem to work but I think it was mostly because my pattern was wrong. Any help on that would be appreciated as well.

Thanks,

Avian.


Additionally the string being tested may or may not have unbracketed text at the beginning or the end of the string. It could look like this:

[[husband name]] is married to Mrs. [[wife name]] and the number of kids they have is [[num kids]]


Seems like I should just be able to do this:

\[\[[a-zA-Z0-9\s]*\]\] OR NOT \[\[[a-zA-Z0-9\s]*\]\]

but I don't know how to make that a correct regex pattern

This kind of works:

(\[\[[a-zA-Z0-9\s]*\]\])|([^(\[\[)])*

but if there are any [ in the spaces that are not between [[ and ]] then they are excluded.
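A possible way to write the "OR NOT" part, sketched in C# and not tested against the real data: inside a character class such as [^(\[\[)] the parentheses and brackets are just single characters, so the class excludes every "(", "[" and ")" and cannot mean "not the sequence [[". A negative lookahead can express "any character that does not start a complete field code". The names FieldCodeTokenizer and Tokenize are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class FieldCodeTokenizer
{
    // Group 1: a complete field code such as [[num kids]].
    // Group 2: one or more characters, none of which begins a complete field code.
    private static readonly Regex Pattern = new Regex(
        @"(\[\[[a-zA-Z0-9\s]*\]\])|((?:(?!\[\[[a-zA-Z0-9\s]*\]\]).)+)",
        RegexOptions.Singleline); // Singleline lets '.' match newlines as well

    // Returns the field codes and the free text between them, in order.
    public static List<string> Tokenize(string input)
    {
        var pieces = new List<string>();
        foreach (Match m in Pattern.Matches(input))
        {
            pieces.Add(m.Value);
        }
        return pieces;
    }

    public static void Main()
    {
        // Hypothetical input with stray brackets in the free text.
        foreach (string piece in Tokenize("Mr. [[husband name]] has [1] or [[ kids: [[num kids]]"))
        {
            Console.WriteLine("\"" + piece + "\"");
        }
    }
}
```

Because the second alternative requires at least one character, this pattern should not produce empty matches, and single "[" or "]" characters stay inside the free-text pieces.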

Just some background: The end user can select from a list of field codes that look like [[husband name]] or [[num kids]] and so on. These are fixed items that can be chosen and dropped into grid cells. The interstitial cells (so to speak) can be any text the user chooses to enter. I of course prevent an end user from entering [[ AND ]] into a user defined cell but I don't want to prevent single "[" or "]" or even double "[[" OR double "]]". Since the field codes are absolutely limited to [a-zA-Z0-9\s] by me, and I can prevent [[ ]] from being entered by the end user, there will always be a distinction between the field codes and the end user entered data. If I have to prevent either [[ or ]] from being entered, I'll do that but I'd rather not limit it.
Updated 1-Apr-11 3:10am
Comments
Wayne Gaylard 1-Apr-11 8:16am    
Is it necessary to do it with regular expressions? Seems to me it would be easier to do by splitting the string. Just a thought.
avianrand 1-Apr-11 8:24am    
Could you explain with an example of what you would do?

Allow for optional text that does not have brackets in it

String yourExampleString = "Place the string you want to parse here";
String regexString = @"([^\[])*(\[\[[a-zA-Z0-9\s]*\]\])?";
Regex regex = new Regex(regexString); // requires: using System.Text.RegularExpressions;
MatchCollection matches = regex.Matches(yourExampleString);
foreach (Match match in matches)
{
    //Process the separate matches here.
    //By accessing the Groups collection of the current match
    //you can get at the capture groups that were made by the parentheses
    //in the regular expression
}


Now you have two capture groups: optional leading text without brackets and optional text in brackets. When your match succeeds all you have to do is inspect the content of the first group for leading text and the second group for the bracketed content. You do this for all matches and you're done.
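Reading the two groups could look roughly like the sketch below. It assumes the first group is written as ([^\[]*) so that it captures the whole run of leading text, and it skips empty values because Regex.Matches can return zero-length matches (for example at the end of the input). The names GroupReader and ReadPieces are invented for the example. One caveat, as far as I can tell: a single "[" in the free text ends the first group and is not captured by either group, so this variant only suits free text without brackets.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class GroupReader
{
    // Collects leading text (group 1) and field codes (group 2) in order.
    public static List<string> ReadPieces(string input)
    {
        var regex = new Regex(@"([^\[]*)(\[\[[a-zA-Z0-9\s]*\]\])?");
        var pieces = new List<string>();
        foreach (Match match in regex.Matches(input))
        {
            string leadingText = match.Groups[1].Value; // "" when nothing precedes the field code
            string fieldCode = match.Groups[2].Value;   // "" when the optional group did not match
            if (leadingText.Length > 0) pieces.Add(leadingText);
            if (fieldCode.Length > 0) pieces.Add(fieldCode);
        }
        return pieces;
    }

    public static void Main()
    {
        string sample = "Mr. [[husband name]] is married to Mrs. [[wife name]] and they have [[num kids]] kids";
        foreach (string piece in ReadPieces(sample))
        {
            Console.WriteLine("\"" + piece + "\"");
        }
    }
}
```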

That should do it!

Cheers,
 
Comments
avianrand 1-Apr-11 8:25am    
Thanks but that didn't work. Here's what I did:

Dim sTestString As String = "Mr. [[husband name]] is married to Mrs. [[wife name]] and they have [[num kids]] kids"
Dim sTestPattern As String = "([^\[]*(\[\[[a-zA-Z0-9\s]*\]\])?[^\[]*)*"
Dim sTestResult() As String = Regex.Split(sTestString, sTestPattern, RegexOptions.IgnoreCase)

Which returned this:

(0) = ""
(1) = ""
(2) = "[[num kids]]"
(3) = ""
(4) = ""
(5) = ""

Additionally the string being tested may or may not have unbracketed text at the beginning or the end of the string. It could look like this:

[[husband name]] is married to Mrs. [[wife name]] and the number of kids they have is [[num kids]]
Manfred Rudolf Bihy 1-Apr-11 9:53am    
Use the pattern I told you to use, and don't use Regex.Split; use Regex.Matches instead. I'll expand on my example.
Sandeep Mewara 1-Apr-11 10:05am    
My 5!
lorenkins 1-Apr-11 11:27am    
Looks like a misplaced paren in your example? I think that pattern should begin with "([^\[]*)" instead of "([^\[])*"
Well I think I figured it out. I made my own little vb.net regex tester and this string:

"[[husband name]] is married to{}*[]]]]^234/*-*-+ Mrs. [[wife name]] and they have [[num kids]] kidsx "

with this pattern:

(\[\[[a-zA-Z0-9\s]*\]\])

renders these results:

""
"[[husband name]]"
" is married to{}*[]]]]^234/*-*-+ Mrs. "
"[[wife name]]"
" and they have "
"[[num kids]]"
" kidsx "

It works no matter what I enter for the string so I think I'm set. All I needed to add to my original pattern was the open and close parens. If anyone can tell me why this might not be the solution, let me know.
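For anyone wondering why the parentheses make the difference: assuming the pattern is passed to Regex.Split, .NET documents that text matched by capturing groups is included in the returned array, so the field codes come back alongside the text between them. A rough C# sketch of the final step (building the finished string from field values) might look like this; FieldCodeFiller, Fill and the sample values are hypothetical, and field codes without a known value are left unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Text.RegularExpressions;

public static class FieldCodeFiller
{
    // The capturing parentheses make Regex.Split keep the field codes in its result.
    private const string FieldCodePattern = @"(\[\[[a-zA-Z0-9\s]*\]\])";

    // Replaces each known field code with its value; everything else is copied as-is.
    public static string Fill(string template, IDictionary<string, string> values)
    {
        var result = new StringBuilder();
        foreach (string piece in Regex.Split(template, FieldCodePattern))
        {
            string value;
            if (values.TryGetValue(piece, out value))
            {
                result.Append(value);
            }
            else
            {
                result.Append(piece);
            }
        }
        return result.ToString();
    }

    public static void Main()
    {
        // Made-up field values, for illustration only.
        var values = new Dictionary<string, string>
        {
            { "[[husband name]]", "Smith" },
            { "[[wife name]]", "Jones" },
            { "[[num kids]]", "3" }
        };
        string template = "Mr. [[husband name]] is married to Mrs. [[wife name]] and they have [[num kids]] kids";
        // prints: Mr. Smith is married to Mrs. Jones and they have 3 kids
        Console.WriteLine(Fill(template, values));
    }
}
```

Empty strings in the split result (for example when the input starts with a field code) simply append nothing, so they need no special handling here.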
 
 
Comments
Manfred Rudolf Bihy 1-Apr-11 9:52am    
The pattern you are showing will never match anything that does not start with [[, so either you have not replicated the exact pattern you were using or you are not showing the correct results. Take your pick.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


