Regular expressions cannot handle arbitrarily nested structures (by definition).
You need to tokenize the stream and handle the nesting by parsing the tokens. The tokenizing can be done with a regex, though.
In the simplest situation, you have a plain stream of characters. Use the approach suggested in solution 1.
If the file is source code in a programming language, you must tokenize according to that language in order to get reasonable results; otherwise braces inside comments or literals get counted too.
For your case of paired braces, you could tokenize as follows:
1. comments
2. string literals
3. character literals
4. { and }
5. rest (individual irrelevant characters for the given problem)
Then you write a simple parser that consumes the tokens one after the other, incrementing a counter on each opening { and decrementing it on each closing }.
Postcondition: counter == 0, otherwise it is an error.
And here comes the code for C#:
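using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

string file = @"your-full-path-to-thissource-file.cs";
string data = File.ReadAllText(file);
// Token patterns: comments, string literals (verbatim and regular), character literals.
string cmt = @"//.*?$|/\*[\s\S]*?\*/";
string str = @"@""(?:""""|[^""])*""|""(?:\\.|[^""\\])*""";
string chr = @"'(?:\\.|[^'\\])*'";
// Group 1 captures the braces; the final [\s\S] swallows every other character.
string rex = "(?:" + cmt + "|" + str + "|" + chr + @")|([{}])|[\s\S]";
Regex tokens = new Regex(rex, RegexOptions.Compiled | RegexOptions.Multiline);
// Keep only the brace tokens: true for "{", false for "}".
var q = from m in tokens.Matches(data).Cast<Match>()
        where m.Groups[1].Success
        select m.Groups[1].Value == "{";
int level = 0;
foreach (bool plus in q)
{
    Console.WriteLine("{0}", plus);
    level += plus ? 1 : -1;
}
Console.WriteLine("level = {0}", level);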
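One thing the loop above does not catch: a final level of 0 only says that the counts match, so input like "} {" would pass. Here is a rough sketch of a stricter check that also fails as soon as the counter goes negative. The names BraceChecker and IsBalanced are made up for illustration, and the token patterns are my assumption of what is good enough for ordinary C# source (they do not cover newer raw string literals, for example).

```csharp
using System;
using System.Text.RegularExpressions;

static class BraceChecker
{
    // Token classes: comments, string literals, character literals,
    // braces (captured in group 1), and any other single character.
    private static readonly Regex Tokens = new Regex(
        @"//.*?$|/\*[\s\S]*?\*/"       // line and block comments
        + @"|@""(?:""""|[^""])*"""     // verbatim strings, "" is an escaped quote
        + @"|""(?:\\.|[^""\\])*"""     // regular strings with backslash escapes
        + @"|'(?:\\.|[^'\\])*'"        // character literals
        + @"|([{}])"                   // the braces we want to count
        + @"|[\s\S]",                  // everything else, one character at a time
        RegexOptions.Compiled | RegexOptions.Multiline);

    // True if every '}' closes an earlier '{' and no '{' is left open.
    public static bool IsBalanced(string text)
    {
        int level = 0;
        foreach (Match m in Tokens.Matches(text))
        {
            if (!m.Groups[1].Success) continue;   // not a brace token
            level += m.Groups[1].Value == "{" ? 1 : -1;
            if (level < 0) return false;          // a '}' with no matching '{'
        }
        return level == 0;
    }

    static void Main()
    {
        Console.WriteLine(IsBalanced("class A { void F() { } }"));        // True
        Console.WriteLine(IsBalanced("} {"));                             // False
        Console.WriteLine(IsBalanced("{ string s = \"}\"; /* { */ }"));   // True
    }
}
```

The last call shows the point of tokenizing first: the } inside the string and the { inside the comment are consumed as part of their tokens and never reach the counter.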
Have fun!
BTW: If it's a homework assignment, try to explain it to your teacher ;-)
Cheers
Andi