Regular Expressions Syntax Highlighting






Easy to use syntax highlighting class

Introduction
The classes described in this article provide a basic mechanism for colorizing source code. Highlighting rules are defined by regular expressions, and the colorizing functionality is easy to add to your application. There is no multilanguage support (for example, JavaScript and HTML in the same block of text), but simple languages such as SQL can be added easily.
Background
I've been writing an application for database stress testing and felt that it would be natural to have some syntax highlighting for the queries written by a user. I was looking for something similar to SynEdit (an advanced edit control for Borland Delphi and Kylix), but could not find a suitable solution in a reasonable time, so I decided to write one myself. Maybe I did not look hard enough, but it's always more fun to create something than to use something.
Using the Code
There are four classes in this project:
- SyntaxHighlightSegment - A simple class similar to System.Text.RegularExpressions.Capture. It represents a substring in a block of text, specified by position.
- SyntaxHighlightSegmentList - A list of SyntaxHighlightSegment objects that adds methods to remove overlapping segments from itself or from another list.
- SyntaxHighlightItem - Serves as the definition of a language entity (literal string, keyword, comment, operator, numerical value, etc.) you wish to highlight.
- SyntaxHighlightDictionary - A list of SyntaxHighlightItem objects that provides functionality for handling several items together.
The main class for defining your highlighting rules is SyntaxHighlightItem. You'll need to understand regular expressions to do this, or you can just look one up. For example, if you wish to highlight comments that start with a double dash and run to the end of the line, using a gray italic font on a transparent background, you would do the following:
SyntaxHighlightItem commentItem =
    new SyntaxHighlightItem("comments", new string[] { "--.*" },
        FontStyle.Italic, Color.Gray, Color.Transparent);
Once you've created all your definitions, just add them to a SyntaxHighlightDictionary instance. It handles several definitions together; for example, it will ignore a "literal string" item that falls within a "comments" item:
SyntaxHighlightDictionary dic = new SyntaxHighlightDictionary();

// Comments: a double dash up to the end of the line.
SyntaxHighlightItem commentItem =
    new SyntaxHighlightItem("comments", new string[] { "--.*" },
        FontStyle.Italic, Color.Gray, Color.Transparent);

// Literal strings: single-quoted text that does not span lines.
SyntaxHighlightItem literalStringItem =
    new SyntaxHighlightItem("strings", new string[] { "'[^'\r\n]*'" },
        FontStyle.Regular, Color.Blue, Color.Transparent);

dic.Add(commentItem);
dic.Add(literalStringItem);

// 'string' lies inside the comment here, so it is not reported as a literal string.
dic.CreateSnapshot("unmarked text--'string' inside comments");
The same functionality could probably be achieved by writing the regular expressions more carefully, but for those who (like myself) have never quite got the hang of regular expressions, it's much easier this way.
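The source of SyntaxHighlightSegmentList is not reproduced in this article, so the following is only a minimal sketch of one way overlapping matches could be resolved, not the project's actual implementation. It assumes a left-to-right scan in which the rule whose next match starts earliest wins, with ties going to the rule listed first. The names Segment, OverlapSketch, and FindNonOverlapping are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical stand-in for the article's segment class.
public sealed class Segment
{
    public string RuleName;
    public int Start;
    public int Length;
}

public static class OverlapSketch
{
    // Scans the text from left to right. At each position the rule whose next
    // match starts earliest wins, and scanning resumes after that match ends,
    // so any match that begins inside an accepted segment is never reported.
    public static List<Segment> FindNonOverlapping(
        string text, IList<KeyValuePair<string, Regex>> rules)
    {
        var result = new List<Segment>();
        int pos = 0;
        while (pos <= text.Length)
        {
            string bestName = null;
            Match best = null;
            foreach (KeyValuePair<string, Regex> rule in rules)
            {
                Match m = rule.Value.Match(text, pos);
                // Empty matches are skipped so the scan always advances.
                // The strict '<' keeps the earlier-listed rule on ties.
                if (m.Success && m.Length > 0 &&
                    (best == null || m.Index < best.Index))
                {
                    best = m;
                    bestName = rule.Key;
                }
            }
            if (best == null)
            {
                break; // no rule matches in the rest of the text
            }
            result.Add(new Segment
            {
                RuleName = bestName,
                Start = best.Index,
                Length = best.Length
            });
            pos = best.Index + best.Length;
        }
        return result;
    }
}
```

Called with the "comments" and "strings" patterns from the example above and the text "unmarked text--'string' inside comments", this sketch reports a single comment segment, because the quoted string begins after the comment has already started. A double dash inside a quoted string is likewise left alone, since the string match begins first.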
The CreateSnapshot method in the example above analyzes the text and fills each item's segment structures with information about the matches it finds. So, to highlight the contents of a RichTextBox named textBox, you would do something like this:
// Dictionary here is a SyntaxHighlightDictionary instance holding your rules.
Dictionary.CreateSnapshot(textBox.Text);
foreach (SyntaxHighlightItem synItem in Dictionary) {
    foreach (SyntaxHighlightSegment segment in synItem.AllSegments) {
        // Select the matched range, then apply the item's font and colors to it.
        textBox.Select(segment.OrderedStart, segment.Length);
        textBox.SelectionFont = new Font(Dictionary.Font, synItem.FontStyle);
        textBox.SelectionColor = synItem.ForegroundColor;
        textBox.SelectionBackColor = synItem.BackgroundColor;
    }
}
You will find the full code for a class that colorizes a RichTextBox in the demo project.
Known Issues
The regular expressions in the demo project are far from perfect and, if used as is, could produce unexpected results. Performance is fine for my needs (up to 500 lines of code), but it can certainly be improved.
Points of Interest
As you will see from the comments, I tried to work around flickering when highlighting a RichTextBox. There are no BeginUpdate()/EndUpdate() methods, so they are emulated with the WM_SETREDRAW message. The difficulty is determining the original redraw state of the RichTextBox, so that it is not allowed to redraw if redrawing was disabled by another call external to your code. You might want to search for AdvRichTextBox, which extends RichTextBox to address this problem.
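For readers unfamiliar with the WM_SETREDRAW technique, here is a minimal sketch of how BeginUpdate()/EndUpdate() can be emulated on Windows. It is not the code from the demo project: RedrawSuspender and its method names are hypothetical, and the sketch deliberately does not try to detect whether redrawing was already disabled by someone else, which is the hard part described above.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Windows.Forms;

// Hypothetical helper; Windows only, requires a Windows Forms project.
public static class RedrawSuspender
{
    // Win32 message that turns painting of a window on (wParam = 1) or off (wParam = 0).
    private const int WM_SETREDRAW = 0x000B;

    [DllImport("user32.dll")]
    private static extern IntPtr SendMessage(
        IntPtr hWnd, int msg, IntPtr wParam, IntPtr lParam);

    // Stops the control from repainting while its contents are being changed.
    public static void BeginUpdate(Control control)
    {
        SendMessage(control.Handle, WM_SETREDRAW, IntPtr.Zero, IntPtr.Zero);
    }

    // Re-enables painting and requests a single repaint of the control.
    // Note: this re-enables redraw unconditionally, even if other code
    // had disabled it before BeginUpdate was called.
    public static void EndUpdate(Control control)
    {
        SendMessage(control.Handle, WM_SETREDRAW, new IntPtr(1), IntPtr.Zero);
        control.Invalidate();
    }
}
```

In use, the highlighting loop would sit between RedrawSuspender.BeginUpdate(textBox) and RedrawSuspender.EndUpdate(textBox), ideally inside try/finally so that painting is restored even if highlighting throws.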
History
- 7th September, 2008 - Initial release