Posted 10 Jan 2007
Licenced CPOL

Create Regex Objects using a Kind of "meta-variables" - Quicker and Easier

Eugene Mirotin (Guard), 10 Jan 2007
This article describes a class VarRegex allowing you to reuse parts of regular expressions


Regexps (Perl-compatible regular expressions) are great, no doubt (refer to this wonderful article for a tutorial). But there is one small problem: every regular expression pattern has to be written as a single string.

For example, suppose we want to specify a pattern for a phone number with the following rules:

  • Digits are in groups of 1 or more
  • Spaces and minus sign are used as separators
  • At least one group of digits should be present

How would the appropriate pattern look? Something like this:

// the @ sign is used in C# to prevent parsing \ as escape sequence
@"(\d+[\s\-])*\d+"

This means that we have groups of digits (\d+) followed by a separator, either minus or space ([\s\-]). Such groups can occur any number of times (maybe 0), but at least one group of digits must be present (the final \d+). Well, not very difficult, but not very nice either.
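To see how such a pattern behaves, here is a minimal sketch. It assumes the pattern implied by the description is (\d+[\s\-])*\d+ and adds ^ and $ anchors, which are not part of the description, so that the whole string has to match. The names PhonePatternDemo and IsPhoneNumber are made up for this illustration.

```csharp
using System;
using System.Text.RegularExpressions;

public static class PhonePatternDemo
{
    // Assumed pattern: zero or more "digits + separator" groups,
    // followed by one final group of digits.
    // The ^ and $ anchors are an addition for this demo.
    private static readonly Regex Phone = new Regex(@"^(\d+[\s\-])*\d+$");

    public static bool IsPhoneNumber(string s)
    {
        return Phone.IsMatch(s);
    }

    public static void Main()
    {
        Console.WriteLine(IsPhoneNumber("123 568-99")); // True
        Console.WriteLine(IsPhoneNumber("123-"));       // False: no final digit group
        Console.WriteLine(IsPhoneNumber("12a45"));      // False: letters are not allowed
    }
}
```

Note that \d+ appears twice in the pattern, which is exactly the duplication the rest of the article tries to get rid of.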

Suppose that at some point the customer says the number may include capital letters (like 1-800-GO-TO-THE-HELL-NOW). We have to change our digit group specification in two places.
And what if we have a regex for, say, a real number in exponential format? Something like this...


... and that covers only the full form of the number, like 123.456E+120. But the integer or the fractional part can be omitted, and then our regex becomes really complex:


Brrr, really?

A Dream

For a long time, I had a dream (:-). A dream to write something like this:

DIGIT = @"[0-9]";
// ` quote is the rare special character
// not having its own meaning in regex syntax
EXP_PART = @"[Ee](0|[+\-]?`INT_PART`)";

// and finally

Well, many more lines of code, but:

  1. Each group of symbols is defined once and then reused, so no group is duplicated in different parts of the pattern.
  2. Each line is much shorter and contains named literals, which makes the expression easier to understand.

This article describes a class that brings a similar syntax to C# programs. It handles such expressions and returns a Regex object built from the expanded pattern.
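As a rough illustration of the dream syntax, the sketch below builds a pattern for real numbers in exponential format from named fragments. Only DIGIT and EXP_PART come from the lines above. INT_PART, FRAC_PART and NUMBER are assumed definitions, and the tiny Expand helper here uses Regex.Replace with a callback rather than the VarRegex class itself. Escaped quotes are not handled in this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class NumberPatternSketch
{
    // Named fragments; a reference to a fragment is written as `NAME`.
    private static readonly Dictionary<string, string> Vars = new Dictionary<string, string>
    {
        { "DIGIT", @"[0-9]" },
        { "INT_PART", @"`DIGIT`+" },                 // assumed definition
        { "FRAC_PART", @"\.`DIGIT`+" },              // assumed definition
        { "EXP_PART", @"[Ee](0|[+\-]?`INT_PART`)" }, // as written above
        // Assumed top-level pattern: the integer or the fractional part may be
        // omitted. References are wrapped in parentheses so that ? applies to
        // the whole fragment.
        { "NUMBER", @"^[+\-]?(`INT_PART`(`FRAC_PART`)?|`FRAC_PART`)(`EXP_PART`)?$" }
    };

    // Recursively replaces every `NAME` reference with its expanded value.
    public static string Expand(string pattern)
    {
        return Regex.Replace(pattern, "`([^`]+)`", m => Expand(Vars[m.Groups[1].Value]));
    }

    public static bool IsNumber(string s)
    {
        return Regex.IsMatch(s, Expand(Vars["NUMBER"]));
    }

    public static void Main()
    {
        Console.WriteLine(Expand(Vars["NUMBER"]));    // the single-string pattern
        Console.WriteLine(IsNumber("123.456E+120")); // True
        Console.WriteLine(IsNumber(".5e0"));         // True
        Console.WriteLine(IsNumber("1.5e"));         // False: exponent has no digits
    }
}
```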


OK, the idea is as simple as possible. We create a class that allows adding "variables". Each variable can be a plain regular expression or a regex-like expression with references to previously added variables. The pattern is then set in the same form. After that, we get a ready-to-use Regex object and use it as we like.

We use ` quote to mark variables. If we want to use the quote itself (maybe someone still needs it :), we can write "\`".

Implementation Details

The class is called VarRegex. It has a nested enumerable class VariablesCollection built around a Dictionary<String, String>. This class allows adding and modifying variables through an indexer property, retrieving their Count, clearing the variable list with Clear, and enumerating the values. The main method of VariablesCollection is called Expand. It receives a string to be "expanded", looks for occurrences of variable names, and replaces each variable reference with its expanded value.

The method is implemented in the following way:

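public String Expand(String pattern)
{
    if (pattern == "")
        return "";

    string p = pattern;
    // hide escaped quotes (\`) behind a placeholder character
    p = p.Replace("\\`", "" + (char)1);

    Regex r = new Regex("`([^`]+)`");
    MatchCollection ms = r.Matches(p);

    foreach (Match match in ms)
    {
        string t = match.Groups[1].Value;
        // replace every reference to t with its recursively expanded value
        p = p.Replace("`" + t + "`", Expand(variables[t]));
    }

    // restore the escaped quotes as plain ` characters
    p = p.Replace("" + (char)1, "`");

    return p;
}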

First, we replace the escaped quotes (\`) with a placeholder character. Then we look for all quoted variable names and replace each name with the expanded value of that variable. Finally, we restore the escaped quotes as plain quotes (without the backslashes). Well, rather easy. Each time we change the variables or the pattern, the Regex object is recreated inside our VarRegex object. The class VariablesCollection also uses a nested enumerable class ExpandedVariablesCollection, which allows enumerating the expanded variable values or retrieving them by name.
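To try the expansion logic outside of VarRegex, here is a self-contained sketch that follows the same three steps. The class name VariableExpander and its Set method are hypothetical stand-ins for VariablesCollection and its indexer. Only the Expand logic mirrors the method described above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class VariableExpander
{
    // Character 1 temporarily stands in for an escaped quote (\`).
    private const string Placeholder = "\u0001";
    private static readonly Regex Reference = new Regex("`([^`]+)`");
    private readonly Dictionary<string, string> variables = new Dictionary<string, string>();

    public void Set(string name, string value)
    {
        variables[name] = value;
    }

    public string Expand(string pattern)
    {
        if (string.IsNullOrEmpty(pattern))
            return "";

        // Step 1: hide escaped quotes so they are not taken for references.
        string p = pattern.Replace("\\`", Placeholder);

        // Step 2: replace each `name` reference with its expanded value.
        foreach (Match match in Reference.Matches(p))
        {
            string name = match.Groups[1].Value;
            p = p.Replace("`" + name + "`", Expand(variables[name]));
        }

        // Step 3: put the escaped quotes back as plain ` characters.
        return p.Replace(Placeholder, "`");
    }

    public static void Main()
    {
        var v = new VariableExpander();
        v.Set("int", @"\d+");
        v.Set("sep", @"[\s\-]");
        v.Set("gr", @"(?:`int``sep`)");
        Console.WriteLine(v.Expand(@"`gr`*`int`")); // (?:\d+[\s\-])*\d+
        Console.WriteLine(v.Expand(@"a\`b"));       // a`b
    }
}
```

The expansion is purely textual, so a variable that will be followed by a quantifier such as * should carry its own grouping parentheses, as gr does here.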


Now the code that generates the regex for the phone number from the introduction looks like this:

VarRegex vr = new VarRegex();
vr.Variables["int"] = @"\d+";
vr.Variables["sep"] = @"[\s\-]";
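// group explicitly so that the * in the pattern applies to the whole `gr`,
// not only to its last element (Expand substitutes the text as-is)
vr.Variables["gr"] = @"(?:`int``sep`)";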
vr.Pattern = @"`gr`*`int`";
vr.Options = RegexOptions.IgnoreCase;

string str = @"123 568-99";
Match m = vr.Regex.Match(str);
Console.WriteLine("Result for string {1}: {0}\n", m.Success, str);


The main limitation is that variables must be added in the order in which they are referenced. That is, a variable should be added to the VarRegex only after all the variables it references have been added.
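The article does not show how VarRegex enforces this ordering. One way the limitation can arise is sketched below, under the assumption that references are resolved at the moment a variable is assigned. The class EagerVariables and its Set and Get methods are hypothetical and are not part of VarRegex.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public class EagerVariables
{
    // Stores values that are already fully expanded.
    private readonly Dictionary<string, string> expanded = new Dictionary<string, string>();

    // Assumption: references are resolved right away, so every referenced
    // variable must already exist. A missing one makes the dictionary
    // lookup throw KeyNotFoundException.
    public void Set(string name, string value)
    {
        expanded[name] = Regex.Replace(value, "`([^`]+)`", m => expanded[m.Groups[1].Value]);
    }

    public string Get(string name)
    {
        return expanded[name];
    }

    public static void Main()
    {
        var ok = new EagerVariables();
        ok.Set("int", @"\d+");
        ok.Set("gr", @"(?:`int`[\s\-])");
        Console.WriteLine(ok.Get("gr")); // (?:\d+[\s\-])

        var bad = new EagerVariables();
        try
        {
            bad.Set("gr", @"(?:`int`[\s\-])"); // `int` has not been added yet
        }
        catch (KeyNotFoundException)
        {
            Console.WriteLine("`int` must be added before `gr`");
        }
    }
}
```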


  • 10th January, 2007: Initial post


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Author

Eugene Mirotin (Guard)
Software Developer
Belarus
Article Copyright 2007 by Eugene Mirotin (Guard)