Understanding Regular Expressions in .NET

Nov 5, 2002

I've created a Regex evaluator that has proven to be extremely helpful. Please feel free to use it, and e-mail me if you want the source.

The evaluator can be found here: RegexEvaluate.aspx

Building the evaluator was a learning process in itself, but well worth it. You'll want to do a little research in the SDK on .NET regular expression syntax. I needed the evaluator to help create parsing expressions for SQL and HTML; getting those expressions right would have been almost impossible without a tool to test them with. I don't know how much people get into using regular expressions, but they are incredibly useful in a myriad of situations.

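Pay attention to the grouping syntax, meaning the constructs of the form (?...) such as the non-capturing (?:...) and the named (?<name>...) groups used in the code below. It makes a big difference.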
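To make the point about grouping concrete, here is a minimal sketch. It is written in Python rather than JScript.NET, purely so it is easy to run; that choice is an assumption of this example, not something from the evaluator. Python's `re` module spells a named group `(?P<name>...)` where .NET uses `(?<name>...)`, while `(?:...)` means the same in both. The sample string and the group names are hypothetical.

```python
import re

text = 'width = "100"'

# Plain parentheses capture by position, so every extra pair shifts the numbering.
plain = re.search(r'(\w+)(\s*=\s*)"([^"]*)"', text)

# (?:...) groups structure the pattern without producing a capture, and
# named groups give the parts you care about a stable label.
named = re.search(r'(?P<name>\w+)(?:\s*=\s*)"(?P<value>[^"]*)"', text)

print(plain.groups())     # three positional captures, including the "=" filler
print(named.groupdict())  # only the two named captures
```

With the second pattern, adding or removing helper groups later does not change how the name and value are retrieved.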

The following is code I use directly to parse HTML. I'm a JScript.NET fiend, so you'll have to bear with me. I wish I knew a built-in .NET way to do this, but none has made itself known to me.

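	import System.Text.RegularExpressions;

	class RegularExpressions {

		// Opening tag: captures the tag name, then the whole attribute list.
		// The attribute group's name did not survive in the published listing; "attributes" is a placeholder.
		static function TagOpen(tagname:String) :String
			{ return '<\\s*(?<tagname>'+tagname+')\\s*(?<attributes>(?:\\s*\\b\\w+\\b\\s*(?:=\\s*(?:"[^"]*"|\'[^\']*\'|[^"\'<> ]+)\\s*)?)*)/?\\s*>' }
		// Closing tag, e.g. </td>
		static function TagClose(tagname:String) :String
			{ return '<\\s*/\\s*(?<tagname>'+tagname+')\\s*>' }
		// One attribute: a name, optionally followed by = and a double-quoted, single-quoted, or bare value.
		static function NameValue(name:String) :String
			{ return '(?<name>'+name+')(\\s*=\\s*("(?<value>[^"]*)"|\'(?<value>[^\']*)\'|(?<value>[^"\'<> ]+)))?' }
		static function MLtags(tagname:String) :Regex
			{ return new Regex( TagOpen(tagname)+"|"+TagClose(tagname), RegexOptions.IgnoreCase ) }
		// IgnoreCase added here so that IMGtags also matches <img ...>; HTML tag names are case-insensitive.
		static function MLopentags(tagname:String) :Regex
			{ return new Regex( TagOpen(tagname), RegexOptions.IgnoreCase ) }
		static function NVpair(name:String) :Regex
			{ return new Regex( NameValue(name), RegexOptions.IgnoreCase )  }
		static const HTMLtags:Regex = MLtags('\\w+')
		static const IMGtags:Regex = MLopentags('IMG')
		static const NameValuePairs:Regex = NVpair('\\w+')
		// Backslashes must be doubled inside a JScript string literal, or \w reaches Regex as a plain w.
		static const Email:Regex = new Regex( '(?:\\w+[.]?)+@\\w+(?:[.]\\w+)+', RegexOptions.IgnoreCase)

	}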
Sorry to leave this one as a puzzle, but you should be able to figure it out if you need it.
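For anyone who would like a hint with the puzzle, here is a rough walk-through of the opening-tag and name/value expressions. It is a sketch ported to Python so it can be run anywhere, not the JScript.NET code itself. Several things in it are assumptions: the name `attributes` for the group that wraps the attribute list is a placeholder, not the author's original name; the helper names `tag_open` and `NAME_VALUE` and the sample HTML are made up; and because Python's `re` rejects duplicate group names, the three `value` alternatives are renamed `dq`, `sq`, and `bare`.

```python
import re

# An attribute value is "double-quoted", 'single-quoted', or a bare token.
VALUE = r'''(?:"[^"]*"|'[^']*'|[^"'<> ]+)'''

def tag_open(tagname):
    # Port of TagOpen: the tag name, then zero or more name[=value] attributes.
    return (r'<\s*(?P<tagname>' + tagname + r')\s*'
            r'(?P<attributes>(?:\s*\b\w+\b\s*(?:=\s*' + VALUE + r'\s*)?)*)'
            r'/?\s*>')

# Port of NameValue with \w+ as the attribute name.
NAME_VALUE = re.compile(
    r'''(?P<name>\w+)(?:\s*=\s*(?:"(?P<dq>[^"]*)"|'(?P<sq>[^']*)'|(?P<bare>[^"'<> ]+)))?''')

html = '''<p>Logo: <IMG src="logo.gif" alt='Our logo' width=40 ismap></p>'''

# Step 1: find the tag and pull out its raw attribute list.
img = re.search(tag_open('IMG'), html, re.IGNORECASE)

# Step 2: split the attribute list into name/value pairs.
attrs = {}
for m in NAME_VALUE.finditer(img.group('attributes')):
    # At most one of the three value groups matches; a valueless attribute yields None.
    values = (m.group('dq'), m.group('sq'), m.group('bare'))
    attrs[m.group('name')] = next((v for v in values if v is not None), None)

print(img.group('tagname'))
print(attrs)
```

The two-step shape is the useful part: one expression locates a tag and hands back its attribute text, and a second, simpler expression is applied only to that text.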

The SQL expressions and methods I created are much more complex, and I would have a difficult time explaining them even to myself now. Still, I would love for someone to call me an idiot for making these and show me a better way. The HTML parsing was necessary to break HTML into controls so that certain controls (an <img> tag, for example) could be replaced with their programmatic counterparts. The SQL expressions were created to help eliminate small differences between SQL statements, such as capitalization and spacing, and to break a statement down accurately enough to help with caching data and determining what is already cached.
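As an illustration of the SQL idea only, the sketch below shows one simple way to make two statements that differ in capitalization and spacing produce the same cache key. It is not the expressions described above, which are not shown in this article; it is written in Python for easy running, and `normalize_sql` and the sample statements are hypothetical. It assumes single-quoted string literals with `''` as the escaped quote, and it does nothing about comments, quoted identifiers, or spacing around operators.

```python
import re

# A token is a single-quoted literal ('' escapes a quote), a run of other text,
# or a stray unmatched quote (kept as-is so nothing is silently dropped).
_TOKEN = re.compile(r"'(?:[^']|'')*'|[^']+|'")

def normalize_sql(sql):
    """Hypothetical helper: ignore case and whitespace runs outside string literals."""
    parts = []
    for m in _TOKEN.finditer(sql.strip()):
        chunk = m.group(0)
        if chunk.startswith("'"):
            parts.append(chunk)  # literals are data, so keep them byte-for-byte
        else:
            parts.append(re.sub(r'\s+', ' ', chunk).upper())
    return ''.join(parts)

cache = {}
first = normalize_sql("select  id,name\n  from Users where name = 'Oren'")
cache[first] = 'rows fetched for the first statement'

# Same statement, different case and spacing: it maps to the same key.
second = normalize_sql("SELECT id,name FROM users WHERE name = 'Oren'")
print(second in cache)
```

Leaving the literals untouched matters: `'Oren'` and `'oren'` are different data, so they should not share a cache entry.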

I hope this helps...
--Oren