
Coco Custom Tool for Visual Studio.NET

29 Oct 2005 · CPOL · 4 min read
Use the award-winning Coco/R compiler compiler directly from within Visual Studio
Sample Image - vsCoco.png

Introduction

I have published one or two articles about formula evaluation, and so far all of those programs were written by hand. For a recent project, I needed to parse far more complex grammars and I really needed some help.

I had a look on the internet and found a project called Coco/R from the Johannes Kepler Universität Linz. This is how they describe their product: "Coco/R takes a compiler description in the form of an attributed grammar (EBNF syntax with attributes and semantic actions) and translates it into a scanner and a recursive descent parser. ... Coco/R has been used successfully in academia and industry. It combines the functionality of the well-known Unix tools Lex and Yacc".

I used Coco for a while and, although it is extremely good, I quickly found working with it frustrating: I had to run it manually and it was not well integrated with Visual Studio.

Background

Anyone wanting to use this tool should be familiar with EBNF grammars. There are several good introductions available on the internet.

I also particularly recommend reading the Compiler Generator Coco/R User Manual.

Installing vsCoco

You need to download and run the file vsCocoRegistration.exe. The registration should work fine with Visual Studio 2003.

If you want to use Coco with the given sample, there is nothing else to do.

If you want to use it with your own file, you must add your grammar to your project and set its properties to:

  • Build Action: Content
  • Custom Tool: Coco
  • Namespace: whatever you want, or leave it blank
  • Optionally, you can provide the Parser.frame and/or Scanner.frame files within your project if you want to customize them.

If it doesn't work with your version, please post in the forum below.

If you are able to fix vsCocoRegistration, please send me the code at pascal_cp@ga$naye.com (remove the $).

Using vsCoco

To start with, you can play with the Calculator sample provided in the download. The calculator evaluates formulas such as 12+34*55/2.

The sample contains only 5 lines of C# code.

C#
private void button1_Click(object sender, System.EventArgs e)
{
        Parser p = new Parser(comboBox1.Text);
        p.Parse();
        textBox1.AppendText(">" + comboBox1.Text + "\r\n" 
                + p.result.ToString() + "\r\n");
}

As you can see, most of the logic has to be in the Parser object.
The parser is generated automatically from this grammar:

C#
COMPILER calc

    public double result = 0;
 
IGNORECASE 
// The $L option lets you compile directly within your grammar
// You can comment and uncomment the line to fit your development requirements.
$L

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
CHARACTERS
    digit = "0123456789".
    cr  = '\r'.
    lf  = '\n'.  
    tab = '\t'.

TOKENS
    number = digit {digit} ['.' {digit}].
    
// We don't use comments here but this is only a sample
COMMENTS FROM "//" TO cr lf 

IGNORE cr + lf + tab 

PRODUCTIONS

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
OPERAND<OUT val double>        
=            (.  val = 0; .)
  (
  number         (.    val = Double.Parse(t.val,
                NumberStyles.Float, 
                CultureInfo.InvariantCulture); 
            .)
  | "(" EXPR<OUT val> ")"    
  ).
 
// Priorities in FGL 
//
//       ()        (Parenthesis)
// 10    -        (Unary neg)
// 09    * /        (Multiply and Divide)
// 07    + -        (Add and Subtract)

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
EXPR10<OUT val double>
=                       (.    bool neg=false; .) 
    {                    
        ( '-'    (.    neg=!neg; .)
        | '+'    (.    /*nothing to do*/ .)
        )
    }
    OPERAND<OUT val>        (.    if (neg) val*=-1; .)
    .

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
EXPR09<OUT val double>    
= 
    EXPR10<OUT val>        
    {        (.    double val2; .)
        ( '*' 
        EXPR10<OUT val2>    (.    val*=val2; .)
        | '/' 
        EXPR10<OUT val2>    (.    val/=val2; .)
        )
    }                     
    .

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
EXPR<OUT val double>    
= 
    EXPR09<OUT val>
    {        (.    double val2; .)            
        ( '+'                
        EXPR09<OUT val2>    (.    val+=val2; .)                
        | '-'                
        EXPR09<OUT val2>    (.    val-=val2; .)                
        )
    }                         
    .
  
/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
calc
=                    
    EXPR<OUT result>.

END calc.

This is a fairly standard grammar. Read aloud in English, it says:

  • This grammar produces a parser called calc.
  • The calc parser evaluates an expression and stores its value in result.
  • An expression is a sum of products, and each product is made of signed operands (a number or a parenthesized expression).
  • Multiplication and division are done before addition and subtraction; however, a minus or plus used as a sign binds more tightly than either.
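To make that reading concrete, here is a hand-written sketch of the recursive descent shape that the grammar describes, with one method per production. It is not the code Coco generates: the class name MiniCalc and its helper methods are my own illustrative choices, and it works on characters directly instead of using a separate scanner.

```csharp
using System;
using System.Globalization;

// Hand-written sketch of the parser shape described by Calc.atg.
// Names are illustrative only; this is NOT Coco's generated output.
public sealed class MiniCalc
{
    private readonly string text;
    private int pos;

    private MiniCalc(string text) { this.text = text; }

    public static double Evaluate(string formula)
    {
        MiniCalc p = new MiniCalc(formula);
        double val = p.Expr();
        p.SkipBlanks();
        if (p.pos != p.text.Length)
            throw new FormatException("Unexpected input at " + p.pos);
        return val;
    }

    private static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    private void SkipBlanks()
    {
        while (pos < text.Length && char.IsWhiteSpace(text[pos])) pos++;
    }

    // Consumes c if it is the next non-blank character.
    private bool Accept(char c)
    {
        SkipBlanks();
        if (pos < text.Length && text[pos] == c) { pos++; return true; }
        return false;
    }

    // EXPR = EXPR09 { ('+' | '-') EXPR09 }.
    private double Expr()
    {
        double val = Expr09();
        while (true)
        {
            if (Accept('+')) val += Expr09();
            else if (Accept('-')) val -= Expr09();
            else return val;
        }
    }

    // EXPR09 = EXPR10 { ('*' | '/') EXPR10 }.
    private double Expr09()
    {
        double val = Expr10();
        while (true)
        {
            if (Accept('*')) val *= Expr10();
            else if (Accept('/')) val /= Expr10();
            else return val;
        }
    }

    // EXPR10 = { '-' | '+' } OPERAND.
    private double Expr10()
    {
        bool neg = false;
        while (true)
        {
            if (Accept('-')) neg = !neg;
            else if (!Accept('+')) break;
        }
        double val = Operand();
        return neg ? -val : val;
    }

    // OPERAND = number | "(" EXPR ")".
    private double Operand()
    {
        if (Accept('('))
        {
            double inner = Expr();
            if (!Accept(')')) throw new FormatException("')' expected at " + pos);
            return inner;
        }
        SkipBlanks();
        int start = pos;
        while (pos < text.Length && IsDigit(text[pos])) pos++;
        if (pos == start) throw new FormatException("Number expected at " + pos);
        if (pos < text.Length && text[pos] == '.')
        {
            pos++;
            while (pos < text.Length && IsDigit(text[pos])) pos++;
        }
        return double.Parse(text.Substring(start, pos - start),
                            NumberStyles.Float, CultureInfo.InvariantCulture);
    }
}
```

Calling MiniCalc.Evaluate("12+34*55/2") walks Expr, then Expr09, then Expr10, then Operand, which is why the product 34*55/2 is computed before the addition. The loops in each method correspond to the { ... } repetitions in the grammar.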

As you can see, there is more complexity here than first appears. It is a bit hard for me to describe exactly what this grammar does and how it works, and that is not my goal.

What I would like to share with you is this tool, and hopefully to raise some interest in compiler compilers if you are new to the subject.

How Does It Work?

Coco Modifications - 1: The #line

I made several major modifications to Coco/R.

First, I wanted to be able to trace within the grammar. This was the easy part: I modified the Coco source and added the $L option. If you insert $L at the beginning of your grammar, the Coco compiler will emit #line directives throughout the generated code.

For example:

C#
COMPILER calc

    public double result = 0;
 
IGNORECASE 
// The $L option lets you compile directly within your grammar
// You can comment and uncomment the line to fit your development requirements.
$L

...

C#
OPERAND<OUT val double>        
=                (.  val = 0; .)
  (
  number             (.    val = Double.Parse(t.val,
                    NumberStyles.Float, 
                    CultureInfo.InvariantCulture); 
                .)
  | "(" EXPR<OUT val> ")"    
  ).

will generate:

C#
    void OPERAND(
#line 31 "C:\dotnet\vsCoco\Calculator\Calc.atg"
        out double val
#line hidden
) {

#line 32 "C:\dotnet\vsCoco\Calculator\Calc.atg"
            val = 0; 
#line hidden

        if (la.kind == 1) {
            Get();

#line 34 "C:\dotnet\vsCoco\Calculator\Calc.atg"
                 val = Double.Parse(t.val,
            System.Globalization.NumberStyles.Float, 
            System.Globalization.CultureInfo.InvariantCulture); 
                            
#line hidden

        } else if (la.kind == 2) {
            Get();
            EXPR(
#line 38 "C:\dotnet\vsCoco\Calculator\Calc.atg"
             out val
#line hidden
);
            Expect(3);
        } else SynErr(9);
    }

These #line directives are very helpful: the Visual Studio IDE understands them well and lets you debug your generated program using the original source grammar.

I find it very useful; you can comment and uncomment the $L line to fit your development requirements.
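For readers who have not met #line before, the following small sketch shows the remapping in isolation. It assumes a C# 5 or later compiler, because it uses the [CallerLineNumber] attribute to observe the line number the compiler reports; that attribute did not exist in Visual Studio 2003, and the class name LineDemo is made up for this example.

```csharp
using System;
using System.Runtime.CompilerServices;

// Illustrative only: shows how #line changes the line the compiler reports.
public static class LineDemo
{
    // The compiler fills in the caller's (possibly remapped) line number.
    private static int Here([CallerLineNumber] int line = 0)
    {
        return line;
    }

    public static int MappedLine()
    {
        // From here on, the compiler pretends the code comes from Calc.atg, line 32.
#line 32 "Calc.atg"
        int line = Here();
#line default
        // "#line default" restores the real file name and numbering.
        return line;
    }
}
```

The debugger and the compiler's error messages use the same mapping, which is what makes stepping through the .atg file possible. The "#line hidden" form seen in the generated code tells the debugger to step over the following lines instead of mapping them to any source location.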

Coco Modifications - 2: A Real Visual Studio Custom Tool

My second goal was to run Coco directly from Visual Studio as a custom tool, rather than having to use batch files.

The main advantage of a custom tool is that it is called automatically whenever the source grammar changes, rather than at every compile.

Visual Studio publishes an interface called IVsSingleFileGenerator.

This interface defines two methods:

  • int DefaultExtension(out string)
  • int Generate(string, string, string, System.IntPtr[], out uint, Microsoft.VisualStudio.Shell.Interop.IVsGeneratorProgress)

Implementing these two methods is the core of the work needed to make a Visual Studio custom tool.
With the right information, this is fairly straightforward after all. I used and modified the GotDotNet User Sample: BaseCodeGeneratorWithSite.
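The sketch below illustrates the general shape of such a generator. To stay self-contained it does not reference the Visual Studio assemblies: IGeneratorSketch is a stand-in I declare here that mirrors the two methods listed above but leaves out the IVsGeneratorProgress argument, and EchoGenerator is a hypothetical class, not vsCoco's actual code. What it does show with real calls is the hand-off of the generated text as a block of COM task memory, which the caller is expected to free.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Stand-in with the same two-method shape; NOT the real Visual Studio interface.
public interface IGeneratorSketch
{
    int DefaultExtension(out string extension);
    int Generate(string inputPath, string inputContents, string defaultNamespace,
                 IntPtr[] outputContents, out uint outputLength);
}

// Hypothetical generator: a real tool would run Coco where the comment says so.
public sealed class EchoGenerator : IGeneratorSketch
{
    private const int S_OK = 0;

    public int DefaultExtension(out string extension)
    {
        extension = ".cs"; // extension of the file generated next to the grammar
        return S_OK;
    }

    public int Generate(string inputPath, string inputContents, string defaultNamespace,
                        IntPtr[] outputContents, out uint outputLength)
    {
        // A real tool would parse inputContents and produce a scanner and parser.
        string code = "// generated from " + inputPath + "\r\n"
                    + "namespace " + defaultNamespace + " { }\r\n";
        byte[] bytes = Encoding.UTF8.GetBytes(code);

        // Hand the bytes back in COM task memory; the caller frees it.
        outputContents[0] = Marshal.AllocCoTaskMem(bytes.Length);
        Marshal.Copy(bytes, 0, outputContents[0], bytes.Length);
        outputLength = (uint)bytes.Length;
        return S_OK;
    }
}
```

In the real interface the last parameter is an IVsGeneratorProgress object, which the tool can use to report errors and warnings back to the IDE.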

Coco Modifications - 3: An Installer for the Visual Studio Custom Tool

Now that we have a DLL which can act as a Visual Studio plugin, it needs to be registered. This can prove a lot harder than expected. Fortunately, I read an excellent article called Automated Registration of Visual Studio Custom Tools by Michael McKechney.

I butchered his sample program and produced vsCocoRegistration.exe.

Known Bugs

vsCocoRegistration does not yet work with all versions of Visual Studio .NET. This is just a question of changing the Registry GUIDs, but I don't have that many versions to test it with. So feel free to ask in the forum below.
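As a rough illustration of what varies between versions, the sketch below builds the registry key under which a C# custom tool is usually listed. It rests on assumptions that I have not checked against the vsCocoRegistration source: that the tool is registered under the usual Generators key below HKEY_LOCAL_MACHINE, that the version part is 7.0 for Visual Studio .NET 2002 and 7.1 for 2003, and that the GUID shown is the C# project category. The class and method names are made up, and the code only builds a string; it does not touch the registry.

```csharp
using System;

// Illustrative sketch; key layout and GUID are assumptions, see the text above.
public static class CustomToolRegistration
{
    // GUID commonly used by Visual Studio for the C# project category.
    public const string CSharpCategory = "{FAE04EC1-301F-11D3-BF4B-00C04F79EFBC}";

    // Builds the per-version key path (relative to HKEY_LOCAL_MACHINE).
    public static string GeneratorKey(string vsVersion, string toolName)
    {
        return @"SOFTWARE\Microsoft\VisualStudio\" + vsVersion
             + @"\Generators\" + CSharpCategory + @"\" + toolName;
    }
}
```

For example, CustomToolRegistration.GeneratorKey("7.1", "Coco") gives the key that would apply to Visual Studio 2003 under these assumptions. The key normally also carries a CLSID value pointing at the registered COM class of the tool.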

Links

History

  • October 29th 2005 - First release
  • November 1st 2005 - Corrected a couple of mistakes in the article

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
France
I am a French programmer.
These days I spend most of my time with the .NET framework, JavaScript and html.

Comments and Discussions

 
Re: Anyone tried ? (AndyHo, 2-Nov-05 1:17)
Pascal

Hi, you have done a wonderful job porting Coco/R to C#!

This was a real blessing for me, as I had been poking around with all these compiler-compilers and getting no good results.

ANSWER: YES! Installed OK into VS2003, no problem, almost instantly usable! The sample provided works OK, but:

I have some questions on this:

1) I cannot debug/trace into the “dynamically” generated code (the scanner does not work as I expected, at least with my specs).
2) When there is a Coco compiler error in the ATG, the error (syntax or whatever) is not displayed and the generated class is empty, which yields "syntax errors" because the Scanner/Parser classes are missing. That is not useful for debugging, only confusing. The real compiler output and warnings are hidden (can they be retrieved and displayed in the output (result) section?). I worked around it by running coco.exe against the same ATG in another (XP cmd) window, and there I saw the errors I needed to correct my grammar.

SUGGESTION
Is there another way to introduce a line into the “results” of Visual Studio, and possibly make the cursor jump to where the errors are (line/column)?

A deeper question:

I want (need) to have an external hook into the Parser, and especially the Scanner, because I want to parse natural language (Spanish). This language has a lot of “ambiguities”, is not easily convertible into LL(1), and its words are of more than one kind at all times (“EL” can be a pronoun or an article); many adverbs and adjectives can also be used as nouns in a sentence. This may need a hook into the scanner that runs executable code as a token is being recognized (much like a PRAGMA but for TOKENS; this would be a useful and nice enhancement).

So my next strategy is to use a highly customizable Scanner/Parser, integrating code to disambiguate based on semantic and morphological relations and accidents such as number, gender, person and tense.

To accomplish this I first need to tag each individual word with all of its possible syntactic roles, then run the parser and check whether they fit the grammar, testing every possible and plausible combination of the function of every word in the sentence. If the sentence has unknown words, I then assign the most probable type of function+gender+number to form a reasonable “guessed” phrase/sentence.

This may be accomplished by modifying the internal scanner/parser structure/interface, I guess.

Can you help me with this?

UPGRADING
Looking deeper into your vsCoco code, I found something interesting that could be done:

If you use an enumeration for the tokens with the same integer values, then the generated syntax code can use the token names without having to add the “_”, and this will definitely enhance the readability of the written C# code.

FORMER REFERENCES
I have used several compiler-compiler tools. The best was CsCUP + CsLEX (C# version). They work very well, but they generate a separate scanner and parser and the internals are very odd (they were ported from C/C++ to Java and then to C#, a very common and ugly route).

I really wanted to get a GLR parser, or an Earley or CYK one, but I have found none on the internet for C# or Java, only lots of them implemented for “strange+odd+conceptual?” languages like python, happy, earley and perl.

If you like, I can send you some of my former work in C# with “grammatica” (Per Cederberg), where I wrote an interactive grammar editor (with a parser inside to assist in writing the grammar).

I declined to use grammatica because it is an LL(k) parser and its productions’ lookahead is too slow, so it is impractical to try anything else there (more than 10 seconds to parse a 10-word phrase with about 500 lookaheads). It also does not let me disambiguate on the fly like Coco’s WEAK tags.

I was using CUP when this (your work) broke in … and I am testing it; the integration with VS is great!!!

Any help appreciated.


Andrés Hohendahl
(Argentina)
