Well-designed object models increase reliability and code reuse. We all know that. But building and maintaining them can be a pain, especially in a changing environment. In this three-part series, we'll look at how rolling your own code generators can speed your development and take the fear and bite out of rapid refactoring.
In the first part, we'll write a simple code generator using the .NET CodeDom classes. In parts two and three, we'll use our code generator to build entity classes from a database and demonstrate mid-cycle refactoring without major code rework.
In this, the first part in our three-part series, we'll be covering the basics of CodeDOM. If you've already spent lots of time reading from any of the many articles and references regarding CodeDOM and are fairly comfortable with it, you should skip ahead to Part 2 where we dig more into the meat of the matter at hand, although this article does contain a diatribe on the concept of code generation's place in application development.
CASE tools (Computer Aided Software Engineering) have long been the realm of structured, Process-oriented software development (and I do mean process with a capital P). After months of writing mind-numbingly boring Use Cases and long battles over the minutiae of various OO theories and cryptic class diagrams (we need a proxy in front of that data adapter!), the keepers of the Rational Rose license would deign to generate our skeleton classes, so that we could then spend more months reworking them into something actually useful.
But there's another way to approach code generation. An aspect of the emerging Agile development methodologies is to recognize that software is never designed right the first time. The corollary to this is that we must be constantly refactoring the code. I'm not talking next-release refactoring. I'm talking mid-cycle, per-iteration, full-blown, deep-tissue Refactoring. Oh, who's your daddy! I'm talking the kind of refactoring where on Tuesday morning you notice a botched table structure in the database and by mid-afternoon coffee break you've restructured the entire data model and rebuilt and confirmed 10,000 lines of code. This isn't your daddy waiting a week for the punch card machine, my friend.
Sounds like magic? Sounds original? Hardly. There is an increasing recognition that even what used to be considered tricky tasks largely consist of repetitious code, and others have begun floating similar ideas on code generation.
This series is an offshoot of my constant quest to evaluate and upgrade my own toolkit and code generators. And you, dear reader, are going to share in some of the journey.
Code generation approaches
There are two basic approaches to writing code generators: template based or inline building. My prior entity code generators were template based. Many of the Visual Studio code builders are template-based. For example, when you add a new Web Control, VS takes a copy of the file below and replaces a few tokens:
C:\Program Files\Microsoft Visual Studio .NET
Template based approaches tend to be easy to build and maintain since they are largely simple string parsing and token replacement once you have a base design.
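To make the token-replacement idea concrete, here is a minimal sketch. It is not the Visual Studio template mechanism; the $TOKEN$ marker convention, the template text, and the TemplateSketch class are all assumptions made up for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateSketch
{
    // Hypothetical template; the $TOKEN$ markers are an assumed convention.
    private const string Template =
        "public class $CLASS$\n" +
        "{\n" +
        "    private $TYPE$ $FIELD$;\n" +
        "}\n";

    // Replace every $KEY$ marker with its value using plain string substitution.
    public static string Fill(IDictionary<string, string> tokens)
    {
        string result = Template;
        foreach (KeyValuePair<string, string> pair in tokens)
        {
            result = result.Replace("$" + pair.Key + "$", pair.Value);
        }
        return result;
    }
}
```

Calling Fill with CLASS, TYPE, and FIELD entries yields a finished class skeleton. The appeal is that the template looks like the code you want; the cost is that nothing checks the result until you compile it.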
And then there's inline building of code. Microsoft created an entire namespace for this: System.CodeDom. I assume this is used by the various wizards, but have yet to confirm. The theory behind CodeDom is to abstract the various calls with interfaces and such so that, much like assemblies built on the framework, the generated code is neutral and can be modified to output in various languages. The basic theory is that many, if not most, code structures may be represented as a graph (similar to a tree structure). If you're interested, you can read a little more about it from Microsoft, but the theoretical documentation is fairly thin, instead focusing on practical implementation.
Nice theory, but a little crude as of now. But I suppose that's to be expected for a first try. And so much of your code has to be string literals that the whole theory of neutral code generation falls apart (as of this version). But don't be deterred, it is of worth and will likely be greatly improved in the coming versions.
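As a rough illustration of the language-neutral idea, the sketch below builds one small graph and hands it to two providers. It assumes the provider-level GenerateCodeFromNamespace method available from .NET 2.0 onward (or the System.CodeDom package), rather than the ICodeGenerator interface used elsewhere in this article; the Demo namespace, Greeter class, and NeutralGraphSketch helper are hypothetical names.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;
using Microsoft.VisualBasic;

public static class NeutralGraphSketch
{
    // Build a tiny graph: namespace Demo containing an empty class Greeter.
    public static CodeNamespace BuildGraph()
    {
        CodeNamespace ns = new CodeNamespace("Demo");
        CodeTypeDeclaration cls = new CodeTypeDeclaration("Greeter");
        cls.IsClass = true;
        ns.Types.Add(cls);
        return ns;
    }

    // Emit the same graph as source text in whatever language the provider targets.
    public static string Render(CodeDomProvider provider)
    {
        StringWriter writer = new StringWriter();
        provider.GenerateCodeFromNamespace(BuildGraph(), writer, new CodeGeneratorOptions());
        return writer.ToString();
    }
}
```

Render(new CSharpCodeProvider()) and Render(new VBCodeProvider()) walk the identical graph; only the provider changes. The neutrality holds as long as the graph contains no snippet objects with literal source text.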
Your first generated code
We'll start with everybody's favorite test program, Hello World. I think MSDN actually has a HelloWorld example, but we're going to do it a touch differently. We'll start by making a HelloWorld class which will then be driven by an external application (so you can confirm it works). This is because our code generator will be dealing with building entity classes by the end of this, so a standalone application isn't what we're after.
I always find it most beneficial when writing code generators to create my end product first, make sure it compiles and works, then work backwards to implement the generator. Below is the class code we're after:
public class HelloWorld
{
    private string name;
    public virtual string Name
    {
        get { return this.name; }
        set { this.name = value; }
    }
    public virtual string SayHello() { return "Hello, " + this.Name; }
}
So now that we know the end product, let's fill in the road to it. The basic information and structure for our first code builder is taken from the article Microsoft .NET CodeDom Technology - Part 1 by Brian J. Korzeniowski. The basics are that we're going to write our class to a file stream.
Building your code generator
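Start by creating a new Windows Application in Visual Studio and rename the generated form class to CodeBuilder. At the top of your CodeBuilder class, include the following namespaces (in addition to the default ones):
using System.CodeDom;
using System.CodeDom.Compiler;
using Microsoft.CSharp;
using System.IO;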
These are the basic namespaces that make up the built-in code generation classes (and the C# specific ones), plus one for IO. From them, we'll be using a relatively small subset.
Next, create a routine BuildClass to be the main driving routine. Later, we'll wire it to a button to trigger the action. In our routine, we'll open a file stream, create the various classes and methods we want, write them to the stream, then close the stream.
The listing below shows our file stream and the various code generation classes that form the basics of our tool. In this case I'm using C#, but you could just as easily use VB by including the Microsoft.VisualBasic namespace and the VBCodeProvider instead. The provider exposes methods for obtaining a language-specific ICodeGenerator to emit your code (and an ICodeCompiler to compile it). The CodeGeneratorOptions class provides a small level of formatting and stylistic control, but seems to be more a marker class for future features than anything else.
/// <summary>
/// Main driving routine for building a class
/// </summary>
private void BuildClass()
{
    // Write TestClass.cs into the folder that holds the running executable
    string fspec = Application.ExecutablePath.Substring(
        0, Application.ExecutablePath.LastIndexOf(@"\"));
    fspec += @"\TestClass.cs";
    Stream s = File.Open(fspec, FileMode.Create);
    StreamWriter sw = new StreamWriter(s);

    CSharpCodeProvider codeProvider = new CSharpCodeProvider();
    ICodeGenerator generator = codeProvider.CreateGenerator(sw);
    CodeGeneratorOptions codeOpts = new CodeGeneratorOptions();
    // ... BuildClass continues in the listings that follow
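For a feel of what CodeGeneratorOptions does control, the sketch below renders the same empty class with two bracing styles. BracingStyle, IndentString, and BlankLinesBetweenMembers are real properties of the class; the OptionsSketch helper, the Demo namespace, and the use of the .NET 2.0+ provider-level GenerateCodeFromNamespace method are assumptions for illustration.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class OptionsSketch
{
    // Render an empty class using the given bracing style: "Block" or "C".
    public static string Render(string bracingStyle)
    {
        CodeNamespace ns = new CodeNamespace("Demo");
        ns.Types.Add(new CodeTypeDeclaration("Sample"));

        CodeGeneratorOptions opts = new CodeGeneratorOptions();
        opts.BracingStyle = bracingStyle;       // "Block" keeps braces on the declaration line
        opts.IndentString = "  ";               // two spaces instead of the default four
        opts.BlankLinesBetweenMembers = false;  // pack members together

        StringWriter writer = new StringWriter();
        new CSharpCodeProvider().GenerateCodeFromNamespace(ns, writer, opts);
        return writer.ToString();
    }
}
```

Comparing Render("Block") with Render("C") shows the opening braces moving onto their own lines. That is about the extent of the stylistic control on offer.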
So now we're ready to start building code. From our target sample code, we need to start with our namespace definition and namespace imports.
CodeCompileUnit unit = new CodeCompileUnit();
CodeNamespace myNamespace = new CodeNamespace("SayHello");
Next we have to create a CodeTypeDeclaration for our class. All class-level variables and methods will be added to the Members collection of this object.
CodeTypeDeclaration ctd = new CodeTypeDeclaration();
ctd.IsClass = true;
ctd.Name = "HelloWorld";
ctd.Attributes = MemberAttributes.Public;
And then we create our member variable, default constructor, and a simple property. I've created two convenience methods for this task: FieldVar and SimpleProperty. This is done so that the member variable and property method can be created from a single call in the main BuildClass method. I wrote the inline TODO comment as a literal CodeSnippetStatement, which makes the code less portable as comment syntax varies; a CodeCommentStatement added to the Statements collection would avoid that.
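ctd.Members.Add(this.FieldVar("name", typeof(string), MemberAttributes.Private));
CodeConstructor ccon = new CodeConstructor();
ccon.Attributes = MemberAttributes.Public;
ccon.Comments.Add(new CodeCommentStatement("Default Constructor for class", true));
ccon.Statements.Add(new CodeSnippetStatement("//TODO: implement default constructor"));
ctd.Members.Add(ccon);
ctd.Members.Add(this.SimpleProperty("Name", "name", typeof(string)));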
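A more portable way to place a comment inside the constructor body is sketched below. It relies on CodeCommentStatement deriving from CodeStatement, so it can be added to the Statements collection directly. The CommentSketch helper is a hypothetical name, and the provider-level GenerateCodeFromType call assumes .NET 2.0 or later.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;

public static class CommentSketch
{
    // Emit a class whose constructor body holds only a language-neutral comment.
    public static string Render(CodeDomProvider provider)
    {
        CodeConstructor ctor = new CodeConstructor();
        ctor.Attributes = MemberAttributes.Public;
        // The provider picks the comment syntax; no "//" is hard-coded here.
        ctor.Statements.Add(new CodeCommentStatement("TODO: implement default constructor"));

        CodeTypeDeclaration cls = new CodeTypeDeclaration("HelloWorld");
        cls.Members.Add(ctor);

        StringWriter writer = new StringWriter();
        provider.GenerateCodeFromType(cls, writer, new CodeGeneratorOptions());
        return writer.ToString();
    }
}
```

Passing a CSharpCodeProvider produces a // comment, while a VBCodeProvider produces an apostrophe comment from the same graph.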
Our two convenience methods are:
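CodeMemberField FieldVar(string fldName,
    Type type, MemberAttributes accessLevel)
{
    CodeMemberField fld = new CodeMemberField(type, fldName);
    fld.Attributes = accessLevel;
    return fld;
}

CodeMemberProperty SimpleProperty(string propName,
    string internalName, Type type)
{
    CodeMemberProperty prop = new CodeMemberProperty();
    prop.Name = propName;
    prop.Comments.Add(new CodeCommentStatement("Property comment for " + propName));
    prop.Attributes = MemberAttributes.Public;
    prop.Type = new CodeTypeReference(type);
    prop.HasGet = true;
    prop.HasSet = true;
    // Both accessors work against this.<internalName>
    CodeFieldReferenceExpression fld = new CodeFieldReferenceExpression(
        new CodeThisReferenceExpression(), internalName);
    prop.GetStatements.Add(new CodeMethodReturnStatement(fld));
    prop.SetStatements.Add(new CodeAssignStatement(
        fld, new CodePropertySetValueReferenceExpression()));
    return prop;
}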
As you can see from the SimpleProperty routine, simple tasks can become incredibly complex when they are fully abstracted out. This routine in particular is one where I might be inclined to simply use the CodeSnippetCompileUnit objects to write out the code literally, but that prevents you from generating the code in other supported languages (like VB). But I digress.
Our final code is the method to return a cheery greeting.
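CodeMemberMethod myMethod = new CodeMemberMethod();
myMethod.Name = "SayHello";
myMethod.ReturnType = new CodeTypeReference(typeof(string));
myMethod.Comments.Add(new CodeCommentStatement("Returns a happy greeting", true));
myMethod.Attributes = MemberAttributes.Public;
// The concatenation is emitted as a literal C# snippet
myMethod.Statements.Add(new CodeMethodReturnStatement(
    new CodeSnippetExpression("\"Hello, \" + this.Name")));
ctd.Members.Add(myMethod);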
This simple method gave me more trouble than any of the others, if you can believe that. CodeDOM is a very simplistic, bare-bones framework for the basic operations a programmer will encounter. Unfortunately, string concatenation actually works by overloading the addition operator, making it somewhat of an oddball case, despite being one of the most common operations for many types of programming. I tried to overcome this by stringing a series of statements together into an assignment, but this is not currently supported. I could have used a StringBuilder, calling its Append method, but that was just too many contortions for my comfort.
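One snippet-free alternative worth trying is CodeBinaryOperatorExpression with CodeBinaryOperatorType.Add, sketched below. This is not the approach the listing above takes. The ConcatSketch helper is a hypothetical name, the class is rendered on its own without the Name property, and the provider-level GenerateCodeFromType call assumes .NET 2.0 or later.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;

public static class ConcatSketch
{
    // Emit a class with SayHello() whose return value is built as an expression tree.
    public static string Render(CodeDomProvider provider)
    {
        CodeMemberMethod method = new CodeMemberMethod();
        method.Name = "SayHello";
        method.ReturnType = new CodeTypeReference(typeof(string));
        method.Attributes = MemberAttributes.Public;

        // "Hello, " + this.Name, expressed as operands joined by an Add operator
        CodeBinaryOperatorExpression concat = new CodeBinaryOperatorExpression(
            new CodePrimitiveExpression("Hello, "),
            CodeBinaryOperatorType.Add,
            new CodePropertyReferenceExpression(new CodeThisReferenceExpression(), "Name"));
        method.Statements.Add(new CodeMethodReturnStatement(concat));

        CodeTypeDeclaration cls = new CodeTypeDeclaration("HelloWorld");
        cls.Members.Add(method);

        StringWriter writer = new StringWriter();
        provider.GenerateCodeFromType(cls, writer, new CodeGeneratorOptions());
        return writer.ToString();
    }
}
```

The C# provider typically wraps the binary expression in parentheses, which is harmless. Because no literal source text is involved, the same graph can also be handed to the VB provider.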
Our final task is to add our class to our namespace and generate our code, writing to the file stream (and, of course, clean up).
myNamespace.Types.Add(ctd);
generator.GenerateCodeFromNamespace(myNamespace, sw, codeOpts);
sw.Close();
Attach a call to your BuildClass method from a button and voila! Push button code building.
If you want to test your generated class, you'll need to create a new Windows or Web app, include a reference to your class and namespace in the form code, and utilize its public methods. Below is a very simple sample of usage:
private void button1_Click(object sender, System.EventArgs e)
{
    HelloWorld hw = new HelloWorld();
    hw.Name = this.textBox1.Text;
    textBox2.Text = hw.SayHello();
}
While CodeDOM is currently not complete enough to build all the tight and robust methods one might wish for, it is an interesting exploration into how to represent and generate repetitious code in a programmatic manner. Compound statements and string building seem to be beyond scope for this release of CodeDOM. I was forced to resort to simple snippet use to get what I wanted. I found its abstracted methods for representing common programming tasks to be much more difficult than simply writing templates and performing token replacement. It doesn't stop one from writing out literal code when needed or convenient, so the limitations are not insurmountable. It also does a fair job of code formatting and indentation, which I have found to be endlessly tedious and error prone in the templated systems I've built.
All that said, I am going to continue using CodeDOM for Part 2. I suspect support will improve with the .NET 2.0 Framework and Visual Studio 2005. A good reference source for CodeDOM objects is the CodeDOM Quick Reference in the MSDN library. Should you use CodeDOM? Read Part 2 to find out.
Preview Part II
In Part II, we'll use our code building knowledge to write a tool to auto-generate entity and factory classes by reading a SQL Server database. Have your Northwind sample installed!