Introduction
CS CODEDOM Parser is a utility that parses C# source code and creates a CODEDOM tree of the code (CODEDOM is a set of general classes that represent code; it is part of the .NET Framework, in the namespace System.CodeDom).
The current version (0.1) is limited: it parses code down to type members and their parameters, has very limited support for expressions, and does not parse the statements inside members. There are two main reasons why I have stopped at this level for now:
- First, it was enough for my needs (I wanted to do some code analysis to enforce coding standards).
- Second, CODEDOM is limited and cannot fully express C# code; for more details see the section CODEDOM Limitations below.
On the other hand, it also parses source code comments, so it can be used to analyze the interdependencies of code and comments.
The stability of this version is also low; it is a kind of alpha version. If anybody wants to help take this further, they are welcome.
The parser is based on the Mono C# compiler code. I looked around a little for available C# parsers and C# parser building tools (I wanted a C# parser written in C#) and finally decided on Mono. For more details about how I used the Mono parser and the other possibilities I explored, see the section C# parser Tools.
CODEDOM Limitations
At first I thought it was a great idea to use a language-independent syntax tree, and CodeDom looked nice. If a code analysis tool is built on it, it can work for any .NET language: just change the parser and the rest stays the same. Sounds cool. But after I got into CodeDom, I found that a lot of language features (not just of C#, but basically of any language) are missing and that it is not possible to parse source code fully. The main problem is in expressions and statements, where CodeDom has a very limited set of classes; for instance, there is no support for unary operations, and there are many more issues.
I decided to continue with CodeDom, even with its limitations, because it was enough for the purpose of analyzing code for coding standards (at least for what I need now; it also makes it possible to keep comments and code in one tree, which is something I liked), but this remains an open issue for future development.
Here is a list of the issues I've found (and there are more):
- CodeCompileUnit has no place for using directives or namespace members, so they are currently placed into the first, default namespace
- using_alias_directive - no support found
- nested namespaces - no support found (so the parser flattens the namespace hierarchy)
- variable declaration list (int i,j,k;) - no support - transformed into individual variable declarations
- pointer_type - no support found
- "jagged" array type (array of arrays) - MS CSharpCodeProvider reverses order of ranks
- params keyword - not supported - the keyword is dropped during parsing and the parameter then becomes an ordinary array-type parameter
- private modifier on a nested delegate is not shown by CSharpCodeProvider (all other nested types work fine)
- unsafe modifier - no support found
- readonly modifier - no support found
- volatile modifier - no support found
- explicit interface implementation - not implemented yet (I think this can be done)
- add and remove accessors for Event - no support found
- virtual and override modifiers do not work in MS CSharpCodeProvider for events
- Operator members and Destructors - no support found
- Expressions - no unary expressions (operations) at all!, only one-dimensional arrays, some operators not supported, and more
- Attribute targets : no support found
- Attributes on accessor : no support found
- If CompileUnit contains custom attributes in global scope, CSharpCodeProvider prints them before the global using directives (because the using directives have to be in the first namespace)
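To make the missing unary expressions concrete, here is a minimal sketch (not code from the parser itself) of the two workarounds available when a tree has to represent something like -x. It assumes .NET Framework 2.0 or later, or the System.CodeDom package, where the provider exposes GenerateCodeFromExpression directly; the class name UnaryWorkaround and the variable name x are made up for illustration.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

// Hypothetical demo class; not part of the CS CODEDOM Parser package.
public static class UnaryWorkaround
{
    // Renders a single CodeDom expression as C# source text.
    public static string Render(CodeExpression expression)
    {
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromExpression(expression, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    public static void Main()
    {
        // Workaround 1: rewrite the unary minus as the binary expression 0 - x.
        CodeExpression asBinary = new CodeBinaryOperatorExpression(
            new CodePrimitiveExpression(0),
            CodeBinaryOperatorType.Subtract,
            new CodeVariableReferenceExpression("x"));

        // Workaround 2: embed literal C# text. The tree then carries an
        // opaque, language-specific string instead of a structured node.
        CodeExpression asSnippet = new CodeSnippetExpression("-x");

        Console.WriteLine(Render(asBinary));
        Console.WriteLine(Render(asSnippet));
    }
}
```

Neither form is a real unary node: the first changes the shape of the expression that was in the source, and the second cannot be analyzed or regenerated for another language.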
C# parser Tools
I wanted to use an existing tool, so I looked around and found these interesting projects:
- Mono project
They are implementing a complete open source .NET platform (they modified the jay parser generator and used it to generate the parser).
- Compiler Writing Tools using C#, from Malcolm Crowe of the University of Paisley
Mr. Crowe created a parser and lexer generator in C#. I played with these tools quite a bit, but when I wanted to do something bigger, I got stuck.
- C# grammar for flex/bison, written by James Power of the National University of Ireland
Contains scripts for the well-known tools bison and flex, which can generate a C parser. I thought I could use them with some C# port of those tools, but I was not able to, so I finally used the grammar from Mono.
- jb2csharp
This is a port of JB Parser and Lexer Generation for Java (which itself is a port of bison and flex). But the current version is alpha and I was not able to make even their calculator example work (which the authors claim was working).
- CsLex from Brad Merrill
It is a lexer generator.
- I've also looked at the MS Rotor project; the C# parser there is in C++ (and it is not under an Open Source license).
So finally I decided to use the Mono source: I used their lexer, jay and their jay grammar to generate my parser. It is in the jay grammar that I placed my code that creates the CodeDom objects.
Description of package
The CS CODEDOM Parser package consists of:
- CodeDom parser itself (/ directory)
- NUnit tests for the parser (/NUnitTests directory)
Contains a bunch of tests I've used to check the functionality of the parser; if you want to run them, you need NUnit.
- testParser (/testParser directory)
A simple command-line utility that tests the parser: it parses a file (name supplied as a command-line parameter) and writes to stdout the code generated by CSharpCodeProvider (the C# CodeDom provider, in the Microsoft.CSharp namespace).
- CodeTreeView (/CodeTreeView directory)
A simple Windows application that opens a file and displays the CODEDOM tree in the left pane (a TreeView control) and the original source in the right pane (a TextBox control). When you click a tree node, the text box scrolls to show the corresponding code. It is something like a very, very simple source code viewer.
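The output half of what testParser does can be sketched as follows. The parser's own entry point is not shown in this article, so the tree here is built by hand instead of being parsed; the class name PrintCompileUnit and the namespace and type names are assumptions made for illustration. The sketch assumes .NET Framework 2.0 or later, or the System.CodeDom package.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

// Hypothetical stand-in for the printing step of testParser.
public static class PrintCompileUnit
{
    public static void Main()
    {
        CodeCompileUnit unit = new CodeCompileUnit();

        // Using directives live on a namespace, not on the compile unit,
        // so they go into a first, unnamed namespace.
        CodeNamespace importsNamespace = new CodeNamespace();
        importsNamespace.Imports.Add(new CodeNamespaceImport("System"));
        unit.Namespaces.Add(importsNamespace);

        // A nested namespace has to be flattened into one dotted name.
        CodeNamespace flattened = new CodeNamespace("Outer.Inner");
        CodeTypeDeclaration type = new CodeTypeDeclaration("Sample");
        type.IsClass = true;
        type.Comments.Add(new CodeCommentStatement("Comments are kept in the same tree."));
        flattened.Types.Add(type);
        unit.Namespaces.Add(flattened);

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            CodeGeneratorOptions options = new CodeGeneratorOptions();
            options.BracingStyle = "C";

            // Write the regenerated C# source to stdout.
            provider.GenerateCodeFromCompileUnit(unit, Console.Out, options);
        }
    }
}
```

Comparing this kind of regenerated output with the original file is a quick way to see which constructs survived the trip through CodeDom and which were flattened or dropped.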
Licence
CS CODEDOM Parser and the tools included in this package are distributed under the GPL licence.
Latest Version
You can check for the latest version at http://ivanz.webpark.cz/csparser.html.
The Future
The basic idea for future development is to extend CodeDom to support all language features, so that sources can be completely parsed. (The alternative is to leave CodeDom and use a custom syntax tree, but I still like the idea of a language-independent tree structure that can be used for different tasks.)
Reporting of errors and warnings should be improved (unify codes and messages, unify error reporting; the Report class should store reported errors).
The parser should also be improved to indicate the location of syntax elements in the source file more exactly.
Better separation between the parser and the CODEDOM builder is also needed.
If somebody likes the tool and wants to help improve it, they are welcome.
Great application, however, I was wondering if you may write a little bit in the article about how the whole thing works together. I understand you have full source, but I am having trouble just seeing how all the pieces (especially jay and mono specific components) work together.
Also, could you please update your link that is supposed to point to the latest code, it is a dead link for me every time I click on it.
Thanks.
R.Bischoff
Tengas un buen dia
Look at this parser with testing of grammars: www.intralogic.eu
Testing your code against performance will keep you running with good scalability and maintenance for years. NTime.exe - the free tool for real developers of high scalability applications!
Is there an update for this code that will work for C# 2.0 syntax e.g. that would handle generics?
Microsoft really needs to improve CodeDOM and provide proper support. I am doing a project for Refractor and it sucks like hell; however, I will try to use your code, you have really done a good job. Microsoft purposely kept building CodeDom from source code out of our reach so that no one can build an IDE as smart as VS.
In my whole life, to parse 100 lines, I have always had to write 10000000 lines of code... parsing sucks, I wish there were an easy way of parsing. Tools like YACC, LEX etc. are so far beyond normal application programming.
I tried to use ISharpDev but it failed because of switch case, and switch is important; we cannot expect users to write complicated if statements for us instead of switch.
What license is your code under? We might use it in our LGPL-licensed project if it works; however, I am trying to convert limited CS code to JavaScript or some such small script code, which fits in your limited list anyway. Does your code support the switch statement?
Programming is fun. -Akash Kava
A CodeDOM question: do you know how to generate "break", "continue", etc. statements with CodeDOM? Should "goto" be used to replace them?
It gives me error 1517 "Invalid Pre-processor directive" on the following lines in my code:
#if (VAR1 && VAR2)
// do something
#endif
Now, if I put it in the following form, it works fine:
#if VAR1
#if VAR2
//do something
#endif
#endif
Also, if I use something like:
#if (VAR)
//do something
#endif
it gives me error 1040 or 1024, but if I remove the brackets it works fine, i.e.
#if VAR
//do something
#endif
I wanted to know if this is a bug in the CODEDOM parser. Did anyone come across this earlier?
Hello everybody,
I have a graduation project on refactoring and I need a parser, or anything that generates a parse tree, and something else to convert the parse tree back to code.
I tried to use the Mono parser, but I need help with Mono for Windows: I downloaded the package but I couldn't run it, there are many errors because some files are missing. Does anybody have any advice that may help me? Or
maybe you can help me by suggesting something other than Mono. Please, if you can help me or suggest anything, reply to me. Thanks, Ghada
I'd like to compile RPL language that targets a non Windows platform. Can I use CodeDOM to make the lexical analysis of the source file?
thanks.
Just wanted to say you have done an excellent job at producing something here. None of the obstacles you had to overcome were easy, and the final article / code is very valuable. It's pretty funny how you already have 2 people mouthing off, shooting from the hip, telling you in a harsh manner that you screwed up, when clearly they did not think at all before posting. Thought you might like to hear that some of us can recognize good work when we see it.
Microsoft.CSharp.CSharpCodeProvider provides ICodeParser, ICodeCompiler and ICodeGenerator.
Similarly for VBCodeProvider and JScriptCodeProvider.
What is your article trying to accomplish that isn't handled by the framework already?
Thanks, Wes
Wesner Moise wrote: Re: The framework already has a CodeDOM parser
Lame Moise. Sooooo lame. Big time! You should have tried it, before you flamed the guy!
But I guess you were right. The interfaces are indeed provided, the implementation, however, is missing.
Wesner Moise wrote: What is your article trying to accomplish that isn't handle by the frameworks already?
I guess most of the implementation, ain't I right?
Whatever, nice job Ivan!
Cheers,
Stoyan
Science may never come up with a better office-communication system than the coffee-break (Earl Wilson)
Did anyone actually read the MSDN Documentation for the CSharpCodeProvider ?
The CreateParser() method explicitly says "When implemented in a derived class, creates a new code parser."
What's to argue about?
If you use the .NET Reflector you will find that CSharpCodeProvider.CreateParser() returns null.
CodeDom does not support a lot of C# syntax (and general language syntax). Let me reply to Wesner's comments.
Built-in C# parser? Which one? The .NET distribution (1.0) does not include any parser that creates a CodeDom model. The parser used for C# compilation does not use CodeDom, of course (look at sscli/rotor).
CodeDom does indeed support properties, events and constructors. Using a low-level representation of these constructs (like add_xxx, remove_xxx) is of course impossible in CodeDom, because the generated source code would then not contain properties and events but just strangely named functions; I doubt that it would be compilable.
Nested namespaces can of course be split (that's also what I've done), but the source code will then look different from what you intended.
Jagged arrays are possible; I've just found that when you generate code with the code generator, the order of the array indexers is the opposite of what it should be.
Explicit interface implementation: again, I want to have it in the source code; I do not care how it is implemented.
Unary expressions: they are missing completely, and writing them as binary expressions is no solution, because the generated code then looks like crap. What about unary negation? Concerning modifiers: I've checked the parameters in detail and did not find some of them (like readonly); if you know how to implement them in CodeDom, let me know.
There are also other gaps, which include no support for while and foreach statements.
In my opinion, CodeDom serves as a general model for source code and is used in different tools that generate (or parse) source code, so I'm interested in how C# source can be represented in CodeDom, and I found the problems mentioned above. The fact that the CLR implements a feature in some way is really not relevant here; it is the source code that is important for CodeDom, and if I'm not able to generate or parse any C# code (maybe with the exception of unsafe code and operator overloading, which are more "unusual" features), I have problems with CodeDom.
Ivan
Wesner Moise wrote: Check out CSharpProvider.CreateParser in the Microsoft.CSharp namespace. IParser will parse any file and generate a CodeDom graph.
If you had actually tried it, you would have noticed that the function returns null.
Wesner Moise wrote: You might need to read up on it a little more and possibly examine the create the the built-in parser produces to discover what's missing.
...
public class CSharpCodeProvider : CodeDomProvider
{
    public CSharpCodeProvider();
    public override ICodeCompiler CreateCompiler();
    public override ICodeGenerator CreateGenerator();
    public override TypeConverter GetConverter(Type type);
    public override string FileExtension { get; }
}
public abstract class CodeDomProvider : Component
{
    protected CodeDomProvider();
    public abstract ICodeCompiler CreateCompiler();
    public abstract ICodeGenerator CreateGenerator();
    public virtual ICodeGenerator CreateGenerator(string fileName);
    public virtual ICodeGenerator CreateGenerator(TextWriter output);
    public virtual ICodeParser CreateParser();
    public virtual TypeConverter GetConverter(Type type);
    public virtual string FileExtension { get; }
    public virtual LanguageOptions LanguageOptions { get; }
}
//From the SS CLI
public virtual ICodeParser CreateParser()
{
    return null;
}
//From Anakrino
public virtual ICodeParser CreateParser()
{
    return null;
}
//and finally the IL CodeDomProvider.CreateParser
.maxstack 8
L_0000: ldnull
L_0001: ret
Now you're telling me (and the author of this article) that somehow you have magically gotten an ICodeParser from somewhere? A bit fishy for someone who has 2 very in-depth articles...
MyDUMeter: a .NET DUMeter clone
I agree that CodeDom supports the most common syntactical constructs. My point is definitely not that CodeDom is bad. It's a good way to have a general code model for .NET, but I was missing some features: some may be specific to C#, others are rather general (I still think that a unary expression is a general enough concept that it should be included).
I think that Wesner's arguments come from a different view of CodeDom's purpose. I also think that Wesner mixes two things together. One thing is CodeDom and generating SOURCE CODE from it (that is, I think, the primary purpose for which CodeDom was created); another thing is generating IL that is functionally equivalent to the given source. CodeDom is about SOURCE CODE!!! If we need to generate IL code from a program, there is a more effective way to do it, using Reflection.Emit. (When you generate IL from CodeDom, source code is first generated into a file and then it is compiled.)
My interest was more in using CodeDom in "wizard-like" tools, where the generated source code is presented to programmers, so it should look tidy.
“CodeDomIterationStatement is used for both for and while”
First, the class is CodeIterationStatement; second, when you generate C# source (using CSharpCodeGenerator) you always get a for statement from this class. See this code extract from CSharpCodeGenerator:
/// Generates code for the specified CodeDom based for loop statement
/// representation.
protected override void GenerateIterationStatement(CodeIterationStatement e) {
    forLoopHack = true;
    Output.Write("for (");
    GenerateStatement(e.InitStatement);
    Output.Write("; ");
    GenerateExpression(e.TestExpression);
    Output.Write("; ");
    GenerateStatement(e.IncrementStatement);
    Output.Write(")");
    OutputStartingBrace();
    forLoopHack = false;
    Indent++;
    GenerateStatements(e.Statements);
    Indent--;
    Output.WriteLine("}");
}
(one thing is what is written in the documentation, another thing is what happens when you want to use it)
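A minimal sketch of what this means in practice, assuming .NET Framework 2.0 or later (or the System.CodeDom package) for GenerateCodeFromStatement: the tree below is meant to stand for while (i < 10) { i = i + 1; }, with empty snippet statements used as placeholders for the init and increment parts. The class name WhileAsFor and the variable i are made up for illustration.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using Microsoft.CSharp;

// Hypothetical demo class; the loop is meant to represent
//     while (i < 10) { i = i + 1; }
public static class WhileAsFor
{
    public static void Main()
    {
        CodeVariableReferenceExpression i = new CodeVariableReferenceExpression("i");

        CodeIterationStatement loop = new CodeIterationStatement(
            new CodeSnippetStatement(""),            // placeholder: no init statement
            new CodeBinaryOperatorExpression(
                i, CodeBinaryOperatorType.LessThan, new CodePrimitiveExpression(10)),
            new CodeSnippetStatement(""),            // placeholder: no increment statement
            new CodeAssignStatement(
                i,
                new CodeBinaryOperatorExpression(
                    i, CodeBinaryOperatorType.Add, new CodePrimitiveExpression(1))));

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            // The generator method quoted above always writes a "for (" header,
            // so the printed loop is a for statement, never a while statement.
            provider.GenerateCodeFromStatement(loop, Console.Out, new CodeGeneratorOptions());
        }
    }
}
```

The generated loop should behave like the while loop, but a reader of the generated source sees a for statement with empty parts, which is the tidiness problem described above.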
“But you can parse C# code into CodeDom. And you did not even have to write a parser for it, since one was already provided with Microsoft.CSharp.CSharpProvider.CreateParser. By the way operator overloading is supported.”
”CodeDom, I think is fine as is. It's intended to make it easy for anyone to write their own compiler and also to allow cross-language code generation. “
”You might need to read up on it a little more and possibly examine the create the built-in parser produces to discover what's missing.”
I’ve indeed checked CSharpProvider.CreateParser a long time ago (I think it was with Beta2) and found what was already mentioned in this discussion: it returns null. There is no CodeDom C# parser in the framework, unless there is something new in 1.1. I think some of the reasons why it is not included are related to what I’ve already written: CodeDom has trouble representing all language features, so it may not parse all source files, or the parsed syntax tree will be different from the code.
I’ve spent some time with CodeDom, read the docs, looked into its source code and played with it. What I presented were my PRACTICAL experiences from when I wanted CodeDom to represent C# SOURCE CODE with an acceptable level of accuracy.
Ivan
Wesner Moise wrote: Ok... I haven't played with CodeDom much.
Much?! Have you played with it at all?
Wesner Moise wrote: I didn't realize it was very incomplete.
Very incomplete?! I'd say totally incomplete, but whatever...
You know Moise, if you want to look competent in the other guys' eyes, you should always double-check what you write, ok? (In my eyes, you look like a big balloon with a slogan "Made in Microsoft".)
Cheers,
Stoyan
P.S.
Active Channel, Active Desktop, Active Directory, ActiveStore, ActiveSync, ActiveX, Advisor FYI, Age of Empires, Age of Mythology, Allegiance, Amped, Asheron's Call, Ask Maxwell, Authenticode, Azurik, BackOffice, BackOffice logo, bCentral, BizTalk, Bookshelf, CarPoint, ClearLead, Computing Central, Crimson Skies, Developer Studio, DirectDraw, DirectMusic, DirectPlay, DirectSound, DirectX, Encarta, Entourage, Fighter Ace, FrontPage, HomeAdvisor, Home Essentials, Hotmail, Links, Links Extreme, MapPoint, MechCommander, MechWarrior, Microsoft, Microsoft Agent logo, Microsoft Internet Explorer logo, Microsoft Office Compatible logo, Microsoft Press, Microsoft TV logo, Midtown Madness, Mobile Explorer, MoneyCentral, Monster Truck Madness, Motocross Madness, MSDN, MSN, MSN logo (butterfly), .Net logo, NetMeeting, Nightcaster, Outlook, Outsmart, Passport logo, Picture It!, PowerPoint, Precision Racing, Project Gotham Racing, Revenge of Arcade, Rise of Perathia, SharePoint, Slate, Tex Murphy, The Age of Kings, The Everyday Web, Trekker, UltimateTV, UltimateTV logo, UltraCorps, UnderWire, Urban Assault, VGA, Virtual Golf Association, Visio, Visual Basic, Visual C++, Visual C#, Visual InterDev, Visual J++, Visual Studio, WebTV, Where do you want to go today?, Windows, Windows logo, Windows Media, Windows Media logo, Windows NT, Xbox, XBOX logo, Xbox "X" logo, ZoneFriends, ZoneLAN, ZoneMatch, ZoneMessage, Zoo Tycoon, and/or other Microsoft products referenced herein are either registered trademarks or trademarks of Microsoft Corporation in the U.S. and/or other countries. The names of actual companies and products mentioned herein may be the trademarks of their respective owners. The example companies, organizations, products, domain names, email addresses, logos, people and events depicted herein are fictitious. No association with any real company, organization, product, domain name, e-mail address, logo, person, or event is intended or should be inferred.
Science may never come up with a better office-communication system than the coffee-break (Earl Wilson)
The distribution as it stands is not compilable and will require you to look around for stuff. It also does not contain functioning binaries.
Please fix.
Ben, I've checked it, and the whole package compiles without any problem in VS.NET: I've just downloaded it, opened the solution in VS and built it. Could you please be more specific about the problems? A bunch of people have tried it, but nobody has complained about problems with compilation.
More recent versions are also available from http://ivanz.webpark.cz/csparser.html or from http://www.sweb.cz/ivan.zderadicka/csparser.html. (Some bugs were fixed and the code was reorganized a little.)
The most up-to-date version can be grabbed from CVS on SourceForge, where this project is also hosted: http://sourceforge.net/projects/cscodedomparser/
The package is distributed as source only, because in my opinion that is what most people are interested in anyway (and, by the way, debug builds are in the zip, in the standard VS location, bin/debug).
Regards
Ivan
I was also thinking along the same lines, but with the VB.NET language plus the EnvDTE namespace, which makes it a little easier to parse functions, classes, namespaces etc., so the only elaborate parsing we need to do is for expressions and statements. But even then I ran into a problem: VB.NET uses parentheses () for both function calls and arrays, so I am not able to distinguish between the two.
But it's a good start for you in C#! Keep it up!!
Krishna