First Posted 10 Mar 2003

Multiple Language Syntax Highlighting, Part 2: C# Control

By Jonathan de Halleux | 12 Mar 2003 | Article
Fast and furious colorizing library for source code (C, C++, VBScript, JScript, XML, etc.)

Introduction

This article is an upgrade of the code submitted in Multiple Language Syntax Highlighting, Part 1: JScript, where a syntax highlighting scheme was proposed.

The technique and ideas for parsing have not changed and, therefore, I will not explain the parsing/rendering process in this article. Readers who need more details can refer to the article cited above. I must also point out that this article is intended to entirely replace the JavaScript code in the (near?) future.

As the previous article was an exercise to learn JScript, XSL and regular expressions, I used this one to get a first contact with C#.

In the rest of the article, I will refer to the JavaScript version as v1.0 and the C# version as v2.0.

Moving to C#

As a C++ developer, I can tell you I was glad to quit JavaScript and get started with C#, which has a much better (C++) flavour.

Wrapping the JScript methods in a single C# class was quite straightforward and does not deserve much comment.

CodeColorizer Class

This class is the kernel of the parser. You can colorize code using CodeColorizer.ProcessAndHighlightCode( string ).

With that job done and the ported code running after a fairly short time, it was time to use the power of C# to make things better.

New features

Avoiding Regular Expression Object Construction

In v1.0, regular expression objects were created each time the parser changed context, although the regular expression string remained the same. This led to a great number of allocations and compilations of Regex objects (although I have a question about object pooling, see Open Question below).

A first improvement of the library was to store the Regex objects in a Hashtable when parsing the syntax. The class implementing this dictionary is Collections.RegexDictionary.

Hence, when parsing, regular expression objects do not need to be rebuilt and can be retrieved in constant time from the table.
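
The caching idea can be sketched as follows. This is a minimal illustration, not the code of Collections.RegexDictionary: the class name RegexCache, its Add and Get methods, and the key "cpp.keyword" are assumptions made for the example.

```csharp
using System;
using System.Collections;
using System.Text.RegularExpressions;

// Hypothetical stand-in for Collections.RegexDictionary. The real class's
// members are not shown in the article, so Add and Get are assumed names.
public class RegexCache
{
    private readonly Hashtable table = new Hashtable();

    // Build the Regex once, while the syntax definition is being read.
    public void Add(string key, string pattern, RegexOptions options)
    {
        table[key] = new Regex(pattern, options);
    }

    // Hash lookup while parsing; returns null if the key is unknown.
    public Regex Get(string key)
    {
        return (Regex)table[key];
    }
}

public static class Program
{
    public static void Main()
    {
        RegexCache cache = new RegexCache();
        cache.Add("cpp.keyword", @"\b(?:int|return|while)\b", RegexOptions.None);

        // Every context switch reuses the same object instead of rebuilding it.
        Regex first = cache.Get("cpp.keyword");
        Regex second = cache.Get("cpp.keyword");
        Console.WriteLine(object.ReferenceEquals(first, second)); // True
        Console.WriteLine(first.Matches("int main() { return 0; }").Count); // 2
    }
}
```

The cost of constructing each Regex is paid once per pattern, at load time, rather than once per context change.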

Open Question: does .NET cache regular expression strings in a pool?

Handling the Case

The case sensitivity of a language can be specified using the attribute not-case-sensitive={"yes" or "no" (default)} on the language node.
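
One plausible way to honour this setting is to translate it into RegexOptions when the regular expressions are built. The sketch below assumes exactly that; the helper CaseOption.FromLanguageNode, the wrapper element languages, and the name attribute are invented for the example and are not taken from the library.

```csharp
using System;
using System.Text.RegularExpressions;
using System.Xml;

public static class CaseOption
{
    // Hypothetical helper: reads not-case-sensitive from a language node.
    // GetAttribute returns "" when the attribute is absent, which falls
    // through to the documented default ("no").
    public static RegexOptions FromLanguageNode(XmlElement language)
    {
        string value = language.GetAttribute("not-case-sensitive");
        return value == "yes" ? RegexOptions.IgnoreCase : RegexOptions.None;
    }
}

public static class Program
{
    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        // The wrapper element and the name attribute are made up for this sketch.
        doc.LoadXml("<languages>" +
                    "<language name='vbscript' not-case-sensitive='yes'/>" +
                    "<language name='cpp'/>" +
                    "</languages>");

        foreach (XmlElement language in doc.DocumentElement.ChildNodes)
        {
            RegexOptions options = CaseOption.FromLanguageNode(language);
            Regex keyword = new Regex(@"\bdim\b", options);
            // Prints "vbscript: True" then "cpp: False".
            Console.WriteLine("{0}: {1}", language.GetAttribute("name"), keyword.IsMatch("Dim x"));
        }
    }
}
```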

Benchmarking

The parser contains a timer/counter (see [1] for details) to benchmark the transformation. At the end of the article, some benchmarking results are presented.

Benchmarking quantities are:

  • CodeColorizer.BenchmarkPerChar, the number of seconds taken to parse one character,
  • CodeColorizer.BenchmarkAvgSec, the average parsing time,
  • CodeColorizer.BenchmarkSec, the parsing time of the last job.
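
To make the relationship between these three quantities concrete, here is a small sketch of how they could be derived from per-job timings. Only the three property names come from the article. The ParseBenchmark class, its Record and Measure methods, the use of System.Diagnostics.Stopwatch in place of the timer from [1], and the choice to compute the per-character figure from the last job are all assumptions.

```csharp
using System;
using System.Diagnostics;

// Hypothetical bookkeeping class; not the library's implementation.
public class ParseBenchmark
{
    private double totalSeconds;
    private int jobCount;

    // Duration of the last job, in seconds.
    public double BenchmarkSec { get; private set; }

    // Seconds per character for the last job (an assumption of this sketch).
    public double BenchmarkPerChar { get; private set; }

    // Mean duration over all recorded jobs.
    public double BenchmarkAvgSec
    {
        get { return jobCount == 0 ? 0.0 : totalSeconds / jobCount; }
    }

    public void Record(double seconds, int charCount)
    {
        BenchmarkSec = seconds;
        BenchmarkPerChar = charCount == 0 ? 0.0 : seconds / charCount;
        totalSeconds += seconds;
        jobCount++;
    }

    // Times one call of the supplied parse function and records the result.
    public string Measure(string code, Func<string, string> parse)
    {
        Stopwatch watch = Stopwatch.StartNew();
        string result = parse(code);
        watch.Stop();
        Record(watch.Elapsed.TotalSeconds, code.Length);
        return result;
    }
}

public static class Program
{
    public static void Main()
    {
        ParseBenchmark bench = new ParseBenchmark();
        // ToUpperInvariant stands in for the real colorizing call.
        bench.Measure("int main() { return 0; }", s => s.ToUpperInvariant());
        Console.WriteLine("last job (s):  " + bench.BenchmarkSec);
        Console.WriteLine("average (s):   " + bench.BenchmarkAvgSec);
        Console.WriteLine("per char (s):  " + bench.BenchmarkPerChar);
    }
}
```

With two recorded jobs of 2 and 4 seconds, the average is 3 seconds while BenchmarkSec reports only the latest, 4 seconds.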

Easier Integration

The library comes with a custom web control that colorizes text.

The Project

The project shows the usage of the custom colorizer control. For further details, NDoc documentation has been generated.

You must modify web.config to specify where the XML and XSL files are. See the ColorizerLibrary section.

TODO List

Reference

[1] High Performance timer in C#
[2] Multiple Language Syntax Highlighting, Part 1: JScript

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

About the Author

Jonathan de Halleux

Engineer

United States

Jonathan de Halleux is a Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .NET), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).

Comments and Discussions

General: Distributing modified code (Eric Woodruff, 20:12 15 Nov '06)
General: running the demo (Ori-, 3:38 2 Sep '06)
General: source code (zikha, 0:33 14 Aug '06)
hi. is the source code for this one available? only demo can be downloaded. Or do we have to make modifications to the java code? thanks

Question: How to integrate into the IDE (Anonymous, 8:05 17 Dec '04)
General: C# Syntax (Bassam Abdul-Baki, 3:24 27 Apr '04)
General: Re: C# Syntax (Jonathan de Halleux, 3:40 27 Apr '04)
General: Re: C# Syntax (Bassam Abdul-Baki, 7:20 27 Apr '04)
General: Re: C# Syntax (Jonathan de Halleux, 22:19 28 Apr '04)
General: Re: C# Syntax (Bassam Abdul-Baki, 2:09 29 Apr '04)
General: Re: C# Syntax (Jonathan de Halleux, 2:14 29 Apr '04)
General: Is there any API to control phone calls with net2phone (MarcelCH, 6:10 1 Apr '04)
General: Re: Calling C# functions from MFC/C++ (Alex Evans, 9:27 16 Feb '04)
General: Re: Calling C# functions from MFC/C++ (Dave Bacher, 9:01 3 Mar '06)
General: Calling C# functions from MFC/C++ (Alex Evans, 19:10 15 Feb '04)
General: Re: Calling C# functions from MFC/C++ (Jonathan de Halleux, 21:54 15 Feb '04)
Question: RTF? (Beater, 18:58 23 Jun '03)
Answer: Re: RTF? (Jonathan de Halleux, 21:07 23 Jun '03)
General: Re: RTF? (Ricardo Mendes, 8:01 6 Oct '03)
General: Re: RTF? (Jonathan de Halleux, 8:32 6 Oct '03)
General: Other langages supported (DD le postier, 4:04 31 Mar '03)
General: Re: Other langages supported (Jonathan de Halleux, 13:30 28 Jul '03)
General: Good stuff (Rudi Larno, 21:07 12 Mar '03)
General: When??? (Jonathan de Halleux, 0:37 13 Mar '03)
General: In answer to your open question... (Ceiled, 5:56 12 Mar '03)
General: Though! (Jonathan de Halleux, 7:24 12 Mar '03)

Last Updated 13 Mar 2003
Article Copyright 2003 by Jonathan de Halleux