Coco Custom Tool for Visual Studio.NET

29 Oct 2005 | CPOL | 4 min read
Use the award-winning Coco/R compiler compiler directly from within Visual Studio
Sample Image - vsCoco.png

Introduction

I have published one or two articles about formula evaluation, and so far all the parsers in them were written by hand. For a recent project, I needed to parse far more complex grammars, and I really needed some help.

I had a look on the internet and found a project called Coco/R from the Johannes Kepler Universität Linz. This is how they describe their product: "Coco/R takes a compiler description in the form of an attributed grammar (EBNF syntax with attributes and semantic actions) and translates it into a scanner and a recursive descent parser... Coco/R has been used successfully in academia and industry. It combines the functionality of the well-known Unix tools Lex and Yacc".

I used Coco for a while and, although it is extremely good, I quickly found working with it frustrating because I had to run it manually and it was not well integrated with Visual Studio.

Background

Anyone wanting to use this tool should be familiar with EBNF grammars. There are several good introductions available on the internet.

I also particularly recommend reading the Compiler Generator Coco/R User Manual.

Installing vsCoco

You need to download and run the file vsCocoRegistration.exe. The registration should work fine with Visual Studio 2003.

If you want to use Coco with the given sample, there is nothing else to do.

If you want to use it with your own file, you must add your grammar to your project and set its properties to:

  • Build Action: Content
  • Custom Tool: Coco
  • Namespace: whatever you want, or leave it blank
  • Optionally you can provide the Parser.frame and/or Scanner.frame files within your project if you want to customize them.

If it doesn't work with your version, please post in the forum below.

If you are able to fix vsCocoRegistration, please send me the code at pascal_cp@ga$naye.com (remove the $).

Using vsCoco

To start with, you can play with the Calculator sample I provided in the download. The calculator evaluates formulas such as 12+34*55/2.

The sample contains only 5 lines of C# code.

C#
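private void button1_Click(object sender, System.EventArgs e)
{
    // Parse and evaluate the formula typed into the combo box.
    Parser p = new Parser(comboBox1.Text);
    p.Parse();
    // Echo the formula, then the value the grammar stored in "result".
    textBox1.AppendText(">" + comboBox1.Text + "\r\n"
        + p.result.ToString() + "\r\n");
}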

As you can see, most of the logic must be in the Parser object.
The parser is created automatically from this grammar:

C#
COMPILER calc

    public double result = 0;
 
IGNORECASE 
// The $L option lets you debug directly within your grammar
// You can comment and uncomment the line to fit your development requirements.
$L

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
CHARACTERS
    digit = "0123456789".
    cr  = '\r'.
    lf  = '\n'.  
    tab = '\t'.

TOKENS
    number = digit {digit} ['.' {digit}].
    
// We don't use comments here but this is only a sample
COMMENTS FROM "//" TO cr lf 

IGNORE cr + lf + tab 

PRODUCTIONS

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
OPERAND<OUT val double>        
=            (.  val = 0; .)
  (
  number         (.    val = Double.Parse(t.val,
                NumberStyles.Float, 
                CultureInfo.InvariantCulture); 
            .)
  | "(" EXPR<OUT val> ")"    
  ).
 
// Priorities in FGL 
//
//       ()        (Parenthesis)
// 10    -        (Unary neg)
// 09    * /        (Multiply and Divide)
// 07    + -        (Add and Subtract)

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
EXPR10<OUT val double>
=                       (.    bool neg=false; .) 
    {                    
        ( '-'    (.    neg=!neg; .)
        | '+'    (.    /*nothing to do*/ .)
        )
    }
    OPERAND<OUT val>        (.    if (neg) val*=-1; .)
    .

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
EXPR09<OUT val double>    
= 
    EXPR10<OUT val>        
    {        (.    double val2; .)
        ( '*' 
        EXPR10<OUT val2>    (.    val*=val2; .)
        | '/' 
        EXPR10<OUT val2>    (.    val/=val2; .)
        )
    }                     
    .

/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
EXPR<OUT val double>    
= 
    EXPR09<OUT val>
    {        (.    double val2; .)            
        ( '+'                
        EXPR09<OUT val2>    (.    val+=val2; .)                
        | '-'                
        EXPR09<OUT val2>    (.    val-=val2; .)                
        )
    }                         
    .
  
/*--------------------------------------------------------------------------*/
/*--------------------------------------------------------------------------*/
calc
=                    
    EXPR<OUT result>.

END calc.

This grammar is a fairly standard one. If I tried to read it in English, it would say:

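  • This grammar will produce a parser called calc.
  • The calc parser evaluates an expression and stores its value in result.
  • An expression is a sum (or difference) of terms, and each term is a product (or quotient) of signed operands. An operand is either a number or an expression in parentheses.
  • Multiplication and division are done before addition and subtraction; however, minus and plus bind more tightly still when they are used as unary signs.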
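To make these precedence levels concrete, here is a minimal hand-written sketch of a recursive descent evaluator that mirrors the four productions of the grammar. It is not the code Coco generates: the class name CalcSketch and all of its members are my own, it works on characters rather than on scanner tokens, and it simply strips spaces instead of applying the IGNORE rules.

```csharp
using System;
using System.Globalization;

// Hypothetical hand-written counterpart of the calc grammar (NOT Coco output).
public sealed class CalcSketch
{
    private readonly string text;
    private int pos;

    private CalcSketch(string text) { this.text = text; }

    public static double Evaluate(string formula)
    {
        var p = new CalcSketch(formula.Replace(" ", ""));
        double value = p.Expr();
        if (p.pos != p.text.Length)
            throw new FormatException("Unexpected input at position " + p.pos);
        return value;
    }

    private char Peek() { return pos < text.Length ? text[pos] : '\0'; }
    private static bool IsDigit(char c) { return c >= '0' && c <= '9'; }

    // EXPR = EXPR09 { ('+' | '-') EXPR09 }   (lowest priority)
    private double Expr()
    {
        double val = Term();
        while (Peek() == '+' || Peek() == '-')
        {
            char op = text[pos++];
            double val2 = Term();
            val = op == '+' ? val + val2 : val - val2;
        }
        return val;
    }

    // EXPR09 = EXPR10 { ('*' | '/') EXPR10 }
    private double Term()
    {
        double val = Signed();
        while (Peek() == '*' || Peek() == '/')
        {
            char op = text[pos++];
            double val2 = Signed();
            val = op == '*' ? val * val2 : val / val2;
        }
        return val;
    }

    // EXPR10 = { '-' | '+' } OPERAND   (unary signs bind tightest)
    private double Signed()
    {
        bool neg = false;
        while (Peek() == '-' || Peek() == '+')
        {
            if (text[pos++] == '-') neg = !neg;
        }
        double val = Operand();
        return neg ? -val : val;
    }

    // OPERAND = number | '(' EXPR ')'
    private double Operand()
    {
        if (Peek() == '(')
        {
            pos++;
            double val = Expr();
            if (Peek() != ')')
                throw new FormatException("Expected ')' at position " + pos);
            pos++;
            return val;
        }
        int start = pos;
        while (IsDigit(Peek())) pos++;
        if (pos == start)
            throw new FormatException("Expected a number at position " + pos);
        if (Peek() == '.')
        {
            pos++;
            while (IsDigit(Peek())) pos++;
        }
        return double.Parse(text.Substring(start, pos - start),
            NumberStyles.Float, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // 34*55/2 is evaluated first (935), then added to 12.
        Console.WriteLine(Evaluate("12+34*55/2").ToString(CultureInfo.InvariantCulture)); // 947
        // The unary minus applies to the parenthesised operand before the multiply.
        Console.WriteLine(Evaluate("-(1+2)*3").ToString(CultureInfo.InvariantCulture));   // -9
    }
}
```

Each method only ever calls the method for the next-higher priority level, which is how a recursive descent parser encodes operator precedence without any explicit priority table.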

As you can see, there is more complexity here than it first appears. It is a bit hard for me to describe exactly what this grammar does and how it works, and that is not my goal.

What I would like to do is share this tool with you and, hopefully, raise your interest in compiler compilers if you are new to the subject.

How Does It Work?

Coco Modifications - 1: The #line

I made several major modifications to Coco/R.

First, I wanted to be able to trace through the grammar while debugging. This was the easy part: I modified the Coco source files and added the $L option. If you insert $L at the beginning of your grammar, the Coco compiler will add many #line directives to the generated code.

For example:

C#
COMPILER calc

    public double result = 0;
 
IGNORECASE 
// The $L option lets you debug directly within your grammar
// You can comment and uncomment the line to fit your development requirements.
$L

...

C#
OPERAND<OUT val double>        
=                (.  val = 0; .)
  (
  number             (.    val = Double.Parse(t.val,
                    NumberStyles.Float, 
                    CultureInfo.InvariantCulture); 
                .)
  | "(" EXPR<OUT val> ")"    
  ).

will generate:

C#
    void OPERAND(
#line 31 "C:\dotnet\vsCoco\Calculator\Calc.atg"
        out double val
#line hidden
) {

#line 32 "C:\dotnet\vsCoco\Calculator\Calc.atg"
            val = 0; 
#line hidden

        if (la.kind == 1) {
            Get();

#line 34 "C:\dotnet\vsCoco\Calculator\Calc.atg"
                 val = Double.Parse(t.val,
            System.Globalization.NumberStyles.Float, 
            System.Globalization.CultureInfo.InvariantCulture); 
                            
#line hidden

        } else if (la.kind == 2) {
            Get();
            EXPR(
#line 38 "C:\dotnet\vsCoco\Calculator\Calc.atg"
             out val
#line hidden
);
            Expect(3);
        } else SynErr(9);
    }

These #line directives are very helpful: the Visual Studio IDE understands them well and lets you debug your generated program by stepping through the original source grammar.

I find it very useful; you can comment and uncomment the $L line to fit your development requirements.
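If you have never used #line yourself, the following small sketch shows the effect in isolation. It is independent of Coco: the names LineDirectiveDemo, WhereAmI and MappedCall are hypothetical, and I use the [CallerLineNumber] attribute (which did not exist in the Visual Studio 2003 era) only as a convenient way to observe which line number the compiler attributes to a statement.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class LineDirectiveDemo
{
    // Returns the line number the compiler reports for the call site.
    public static int WhereAmI([CallerLineNumber] int line = 0)
    {
        return line;
    }

    public static int MappedCall()
    {
        // From here on, the compiler pretends we are at line 31 of Calc.atg,
        // just as Coco does for semantic actions copied from the grammar.
#line 31 "Calc.atg"
        return WhereAmI();
#line default
    }

    public static void Main()
    {
        Console.WriteLine(MappedCall()); // 31
    }
}
```

Compiler errors, debugger breakpoints and stack traces follow the same mapping, which is why stepping through the generated parser lands you in the .atg file. The #line hidden directive seen in the generated code tells the debugger to step over the lines that follow it until the next #line directive.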

Coco Modifications - 2: A Real Visual Studio Custom Tool

My second goal was to run Coco directly from Visual Studio as a custom tool, rather than having to use batch files.

The main advantage of a custom tool is that it is called automatically when the source grammar changes, rather than on every build.

Visual Studio publishes an interface called IVsSingleFileGenerator.

This interface defines two methods:

  • int DefaultExtension(out string)
  • int Generate(string, string, string, System.IntPtr[], out uint, Microsoft.VisualStudio.Shell.Interop.IVsGeneratorProgress)

Implementing these two methods is the basis of the work needed to make a Visual Studio custom tool.
With the right information, this turns out to be fairly straightforward. I used and modified the GotDotNet User Sample: BaseCodeGeneratorWithSite.
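The shape of such a generator can be sketched as follows. This is not the vsCoco source. So that the example compiles without the Visual Studio SDK, I declare two stand-in interfaces (the names ending in StandIn are hypothetical) with the same method signatures as the real ones in Microsoft.VisualStudio.Shell.Interop; the class EchoGeneratorSketch just emits a comment where a real tool would run Coco. The one non-obvious detail, as far as I understand the interface contract, is that the generated bytes are handed back in memory allocated with the COM task allocator.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Text;

// Stand-ins for the real Visual Studio interop interfaces (assumption:
// same method shapes, but without the COM attributes of the originals).
public interface IVsGeneratorProgressStandIn { }

public interface IVsSingleFileGeneratorStandIn
{
    int DefaultExtension(out string pbstrDefaultExtension);
    int Generate(string wszInputFilePath, string bstrInputFileContents,
                 string wszDefaultNamespace, IntPtr[] rgbOutputFileContents,
                 out uint pcbOutput, IVsGeneratorProgressStandIn pGenerateProgress);
}

public sealed class EchoGeneratorSketch : IVsSingleFileGeneratorStandIn
{
    private const int S_OK = 0;

    // Tells the IDE which extension the generated file should get.
    public int DefaultExtension(out string pbstrDefaultExtension)
    {
        pbstrDefaultExtension = ".cs";
        return S_OK;
    }

    // Called by the IDE whenever the input file is saved.
    public int Generate(string wszInputFilePath, string bstrInputFileContents,
                        string wszDefaultNamespace, IntPtr[] rgbOutputFileContents,
                        out uint pcbOutput, IVsGeneratorProgressStandIn pGenerateProgress)
    {
        // A real tool would run the compiler generator here.
        string code = "// Generated from " + wszInputFilePath
                    + " (" + bstrInputFileContents.Length + " characters)\n";
        byte[] bytes = Encoding.UTF8.GetBytes(code);

        // Hand the bytes back in COM task memory; the caller frees it.
        rgbOutputFileContents[0] = Marshal.AllocCoTaskMem(bytes.Length);
        Marshal.Copy(bytes, 0, rgbOutputFileContents[0], bytes.Length);
        pcbOutput = (uint)bytes.Length;
        return S_OK;
    }

    public static void Main()
    {
        // Play the role of the IDE: call the generator and read the result.
        var generator = new EchoGeneratorSketch();
        string extension;
        generator.DefaultExtension(out extension);

        var output = new IntPtr[1];
        uint size;
        generator.Generate("Calc.atg", "COMPILER calc END calc.", "Calculator",
                           output, out size, null);

        byte[] copy = new byte[size];
        Marshal.Copy(output[0], copy, 0, (int)size);
        Marshal.FreeCoTaskMem(output[0]);

        Console.WriteLine(extension);
        Console.Write(Encoding.UTF8.GetString(copy));
    }
}
```

In the real tool, the class also needs a GUID and COM registration so that Visual Studio can create it, which is what the next section is about.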

Coco Modifications - 3: An Installer for the Visual Studio Custom Tool

Now that we have a DLL which can act as a Visual Studio plugin, it needs to be registered. This can prove a lot harder than expected. Fortunately, I read an excellent article called Automated Registration of Visual Studio Custom Tools by Michael McKechney.

I butchered his sample program and produced vsCocoRegistration.exe.

Known Bugs

vsCocoRegistration does not yet work with all versions of Visual Studio .NET. This is just a question of changing the Registry GUIDs, but I don't have that many versions to test it with. So feel free to ask in the forum below.
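For anyone who wants to investigate on their own machine, here is a small sketch that only prints the registry locations where, to my understanding, Visual Studio versions of that era look up custom tools for C# projects. It does not touch the registry. The class GeneratorRegistrySketch is hypothetical, the key layout is my assumption rather than something taken from the vsCocoRegistration source, and the version list (7.0 for Visual Studio .NET 2002, 7.1 for 2003, 8.0 for 2005) is only there for illustration.

```csharp
using System;

public static class GeneratorRegistrySketch
{
    // Project-type GUID that Visual Studio uses for C# projects.
    public const string CSharpProjectGuid = "{FAE04EC1-301F-11D3-BF4B-00C04F79EFBC}";

    // Builds the per-version key under HKEY_LOCAL_MACHINE for a custom tool name.
    // Assumption: the key holds a CLSID value naming the registered COM class.
    public static string GeneratorKey(string vsVersion, string toolName)
    {
        return @"SOFTWARE\Microsoft\VisualStudio\" + vsVersion
             + @"\Generators\" + CSharpProjectGuid + @"\" + toolName;
    }

    public static void Main()
    {
        foreach (string version in new[] { "7.0", "7.1", "8.0" })
        {
            Console.WriteLine(@"HKEY_LOCAL_MACHINE\" + GeneratorKey(version, "Coco"));
        }
    }
}
```

Comparing these locations in regedit with the entries of a custom tool that already works on your version is probably the quickest way to see what the registration program needs to write.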

History

  • October 29th 2005 - First release
  • November 1st 2005 - Corrected a couple of mistakes in the article

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Software Developer (Senior)
France
I am a French programmer.
These days I spend most of my time with the .NET framework, JavaScript and HTML.

Comments and Discussions

OpenNLP performance (was Re: Anyone tried ?)
Richard Northedge, 17-Nov-06 9:37
My .NET port of OpenNLP (SharpNLP) is driven from maximum entropy models read by SharpEntropy, a port of the Java MaxEnt library. The default configuration is for the model files to be read entirely into memory, then accessed from memory as needed. So for instance, loading all the models necessary for the OpenNLP Parser tool into memory takes about 12 seconds on my machine, but once the models are in memory, parsing a sentence is pretty much instantaneous. The problem with this approach is if your machine doesn't have enough RAM to hold the model files in memory. Then that 12 seconds turns into several minutes or more as physical memory is full and the machine starts using virtual memory on your hard disk instead.

I changed SharpEntropy's implementation of the model reader / writer interfaces so they are no longer identical to the Java MaxEnt interfaces. These changes make it possible to create reader / writer pairs that do not load the entire model into memory up front, but keep it on disk and access the parts of the model as needed. This means that there is no initial load time hit, but each individual access of the model takes longer. When your machine doesn't have enough RAM to hold the model files in memory, this is probably a better bet. You'll need to use the overloads in OpenNLP that take in objects implementing SharpEntropy.IMaximumEntropyModel, rather than the methods that just take a path to a model file in the .NET "in memory" binary format.

Now, I have implemented model reader / writer classes that use SQL Server and SQLite as the format for holding model files. But a relational database is probably not the most efficient way to go. I envisage a binary format optimised for fast lookups of blocks of data based on a string key.

As far as the Spanish goes: I am slowly catching up with the changes made in the Java OpenNLP library version 1.3. Watch the SharpNLP space on Codeplex for updates.

Richard