Rapid development and refactoring using code generators - Part 1

Chris Cole

Jul 29, 2005

Using CodeDOM and SQL Server to build entity and factory classes - Part 1: CodeDOM basics.

Simple Code Generator and generated source

Introduction

Well-designed object models increase reliability and code reuse. We all know that. But building and maintaining them can be a pain, especially in a changing environment. In this three-part series, we'll look at how rolling your own code generators can speed your development and take the fear and bite out of rapid refactoring.

In the first part, we'll write a simple code generator using the .NET CodeDOM classes. In parts two and three, we'll use our code generator to build entity classes from a database and demonstrate mid-cycle refactoring without major code rework.

Part 1 - Generating classes with CodeDom
Part 2 - Using code generators to build entity and factory classes from a database
Part 3 - Subclassing to allow rapid refactoring

In this, the first part in our three-part series, we'll be covering the basics of CodeDOM. If you've already spent lots of time reading from any of the many articles and references regarding CodeDOM and are fairly comfortable with it, you should skip ahead to Part 2 where we dig more into the meat of the matter at hand, although this article does contain a diatribe on the concept of code generation's place in application development.

Background

CASE tools (Computer Aided Software Engineering) have long been the realm of structured, Process-oriented software development (and I do mean process with a capital P). After months of writing mind-numbingly boring Use Cases and long battles over the minutiae of various OO theories and cryptic class diagrams (we need a proxy in front of that data adapter!), the keepers of the Rational Rose license would deign to generate our skeleton classes, so that we could then spend more months reworking them into something actually useful.

But there's another way to approach code generation. An aspect of the emerging Agile development methodologies is to recognize that software is never designed right the first time. The corollary to this is that we must be constantly refactoring the code. I'm not talking next-release refactoring. I'm talking mid-cycle, per-iteration, full-blown, deep-tissue Refactoring. Oh, who's your daddy! I'm talking the kind of refactoring where on Tuesday morning you notice a botched table structure in the database and by mid-afternoon coffee break you've restructured the entire data model and rebuilt and confirmed 10,000 lines of code. This isn't your daddy waiting a week for the punch card machine, my friend.

Sounds like magic? Sounds original? Hardly. There is an increasing recognition that even what used to be considered tricky tasks largely consist of repetitious code, and others may have begun floating similar ideas on code generation.

This series is an offshoot of my constant quest to evaluate and upgrade my own toolkit and code generators. And you, dear reader, are going to share in some of the journey.

Code generation approaches

There are two basic approaches to writing code generators: template based or inline building. My prior entity code generators were template based. Many of the Visual Studio code builders are template-based. For example, when you add a new Web Control, VS takes a copy of the file below and replaces a few tokens:

C:\Program Files\Microsoft Visual Studio .NET 2003\VC#\VC#Wizards\CSharpAddWebControlWiz\Templates\1033

Template based approaches tend to be easy to build and maintain since they are largely simple string parsing and token replacement once you have a base design.
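To make the template-based approach concrete, here is a minimal sketch of token replacement. The `$TOKEN$` marker style, the `TemplateDemo` class, and the template text are all assumptions made for illustration; they are not the format the Visual Studio wizard templates use.

```csharp
using System;
using System.Collections.Generic;

public static class TemplateDemo
{
    // Hypothetical template text. The $NAME$ markers are an illustrative
    // convention only, not the Visual Studio wizard template syntax.
    private const string Template =
        "namespace $NAMESPACE$\n" +
        "{\n" +
        "    public class $CLASSNAME$\n" +
        "    {\n" +
        "    }\n" +
        "}\n";

    // Replaces every $KEY$ marker in the template with its value.
    public static string Render(string template, IDictionary<string, string> tokens)
    {
        string result = template;
        foreach (KeyValuePair<string, string> pair in tokens)
        {
            result = result.Replace("$" + pair.Key + "$", pair.Value);
        }
        return result;
    }

    // Fills the sample template with the names used later in this article.
    public static string RenderSample()
    {
        Dictionary<string, string> tokens = new Dictionary<string, string>();
        tokens["NAMESPACE"] = "SayHello";
        tokens["CLASSNAME"] = "HelloWorld";
        return Render(Template, tokens);
    }
}
```

The appeal is that the template looks almost exactly like the finished code. The cost is that indentation and formatting are entirely your responsibility.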

And then there's inline building of code. Microsoft created an entire namespace for this: System.CodeDom. I assume this is used by the various wizards, but have yet to confirm. The theory behind CodeDom is to abstract the various calls with interfaces and such so that, much like assemblies built on the framework, the generated code is neutral and can be modified to output in various languages. The basic theory is that many, if not most, code structures may be represented as a graph (similar to a tree structure). If you're interested, you can read a little more about it from Microsoft, but the theoretical documentation is fairly thin, instead focusing on practical implementation.

Nice theory, but a little crude as of now. But I suppose that's to be expected for a first try. And so much of your code has to be string literals that the whole theory of neutral code generation falls apart (as of this version). But don't be deterred, it is of worth and will likely be greatly improved in the coming versions.

Your first generated code

We'll start with everybody's favorite test program, Hello World. I think MSDN actually has a HelloWorld example, but we're going to do it a touch differently. We'll start by making a HelloWorld class which will then be driven by an external application (so you can confirm it works). This is because our code generator will be dealing with building entity classes by the end of this, so a standalone application isn't what we're after.

I always find it most beneficial when writing code generators to create my end product first, make sure it compiles and works, then work backwards to implement the generator. Below is the class code we're after:

using System;

namespace SayHello
{
    public class HelloWorld
    {
        private string name;
        public HelloWorld()
        {
        }

        public virtual string Name
        {
            get
            {
                return name;
            }
            set
            {
                this.name = value;
            }
        }

        public virtual string SayHello()
        {
            return "Hello, " + this.Name;
        }

    }
}

So now that we know the end product, let's fill in the road to it. The basic information and structure for our first code builder is taken from the article Microsoft .NET CodeDom Technology - Part 1 by Brian J. Korzeniowski. The basics are that we're going to write our class to a file stream.

Building your code generator

Start by creating a new Windows Application in Visual Studio and rename Form1 to CodeBuilder. At the top of your CodeBuilder class file, add using directives for the following namespaces (in addition to the default ones):

using System.CodeDom;
using Microsoft.CSharp;
using System.CodeDom.Compiler;
using System.IO;

These are the basic namespaces that make up the built-in code generation classes (and the C# specific ones), plus one for IO. From them, we'll be using a relatively small subset.

Next, create a routine BuildClass to be the main driving routine. Later, we'll wire it to a button to trigger the action. In our routine, we'll open a file stream, create the various classes and methods we want, write them to the stream, then close the stream.

The listing below shows our file stream and the various code generation classes that form the basics of our tool. In this case I'm using C#, but you could just as easily use VB by including the Microsoft.VisualBasic namespace and the VBCodeProvider instead. The provider exposes methods for obtaining a language-specific ICodeGenerator and ICodeCompiler to actually build your code snippets. The CodeGeneratorOptions class provides a small level of formatting and stylistic control, but seems to be more a marker class for future features than anything else.

/// <summary>
/// Main driving routine for building a class
/// </summary>
void BuildClass()
{
    string fspec = Application.ExecutablePath.Substring
        (0, Application.ExecutablePath.LastIndexOf(@"\"));

    fspec += @"\TestClass.cs";

    Stream s = File.Open(fspec, FileMode.Create );
    StreamWriter sw = new StreamWriter(s);

    CSharpCodeProvider codeProvider = new CSharpCodeProvider();
    ICodeGenerator generator = codeProvider.CreateGenerator(sw);
    CodeGeneratorOptions codeOpts = new CodeGeneratorOptions();

So now we're ready to start building code. From our target sample code, we need to start with our namespace definition and namespace imports.

    //Create a compile unit for the namespace and imports for your new code

    CodeCompileUnit unit = new CodeCompileUnit();
    CodeNamespace myNamespace = new CodeNamespace("SayHello");
    myNamespace.Imports.Add(new CodeNamespaceImport("System"));

Next we have to create a CodeTypeDeclaration for our class. All class-level variables and methods will be added to the Members collection of this object.

    //Build the class declaration and member variables

    CodeTypeDeclaration ctd = new CodeTypeDeclaration();
    ctd.IsClass = true;
    ctd.Name = "HelloWorld";
    ctd.Attributes = MemberAttributes.Public;

And then we create our member variable, default constructor, and a simple property. I've created two convenience methods for this task: FieldVar and SimpleProperty. This is done so that the member variable and property can each be created with a single call in the main BuildClass method. For the inline comment in the constructor body I used a literal CodeSnippetStatement, which makes the code less portable because comment syntax varies between languages. Since CodeCommentStatement derives from CodeStatement, adding one to the Statements collection should be a more portable option.

    ctd.Members.Add(this.FieldVar("name", 
               typeof(string), MemberAttributes.Private));

    //default constructor

    CodeConstructor ccon = new CodeConstructor();
    ccon.Attributes = MemberAttributes.Public;
    ccon.Comments.Add(new 
      CodeCommentStatement("Default Constructor for class", true));
    ccon.Statements.Add(new 
      CodeSnippetStatement("//TODO: implement default constructor"));
    ctd.Members.Add(ccon);

    //property

    ctd.Members.Add(this.SimpleProperty("Name", "name", typeof(string)));
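As a sketch of a language-neutral inline comment, the constructor body can hold a CodeCommentStatement instead of a snippet. The `CommentDemo` class is a hypothetical wrapper for illustration. It renders through `CodeDomProvider.GenerateCodeFromType`, which assumes .NET 2.0 or later (on .NET Core and newer, the System.CodeDom NuGet package).

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class CommentDemo
{
    // Builds a class with a constructor whose body holds only a comment,
    // then renders it as C# source text.
    public static string Generate()
    {
        CodeConstructor ccon = new CodeConstructor();
        ccon.Attributes = MemberAttributes.Public;

        // CodeCommentStatement is a CodeStatement, so it can be placed in a
        // statement collection and rendered in the target language's syntax.
        ccon.Statements.Add(
            new CodeCommentStatement("TODO: implement default constructor"));

        CodeTypeDeclaration ctd = new CodeTypeDeclaration("HelloWorld");
        ctd.Members.Add(ccon);

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromType(ctd, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```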

Our two convenience methods are:

    CodeMemberField FieldVar(string fldName, 
                       Type type, MemberAttributes accessLevel)
    {
        CodeMemberField fld = new CodeMemberField(type, fldName);
        fld.Attributes = accessLevel;
        return fld;
    }

    CodeMemberProperty SimpleProperty(string propName, 
                            string internalName, Type type)
    {
        CodeMemberProperty prop = new CodeMemberProperty();
        prop.Name = propName;
        prop.Comments.Add(new 
             CodeCommentStatement("Property comment for " + propName));
        prop.Attributes = MemberAttributes.Public;
        prop.Type = new CodeTypeReference(type);
        prop.HasGet = true;
        prop.GetStatements.Add(
            new CodeMethodReturnStatement(
                new CodeFieldReferenceExpression(new 
                    CodeThisReferenceExpression(), internalName)));

        prop.HasSet = true;
        prop.SetStatements.Add(
            new CodeAssignStatement(
                new CodeFieldReferenceExpression(new 
                    CodeThisReferenceExpression(), internalName),
                new CodePropertySetValueReferenceExpression()));

        return prop;
    }

As you can see from the SimpleProperty routine, simple tasks can become incredibly complex when they are fully abstracted out. This routine in particular is one where I might be inclined to simply use the CodeSnippetExpression or CodeSnippetCompileUnit objects to write out the code literally, but that disallows you from generating the code in other supported languages (like VB). But I digress.
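To illustrate what the abstraction buys you, the sketch below renders one property graph through both the C# and VB providers. `MultiLanguageDemo` is a hypothetical class, and the backing field is renamed to `_name` here as an assumption: VB is case-insensitive, so a field `name` next to a property `Name` would clash in the VB output. It assumes .NET 2.0 or later.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;
using Microsoft.VisualBasic;

public static class MultiLanguageDemo
{
    // Builds a class with one private field and a get/set property over it.
    private static CodeTypeDeclaration BuildType()
    {
        CodeTypeDeclaration ctd = new CodeTypeDeclaration("HelloWorld");

        CodeMemberField fld = new CodeMemberField(typeof(string), "_name");
        fld.Attributes = MemberAttributes.Private;
        ctd.Members.Add(fld);

        CodeFieldReferenceExpression fieldRef = new CodeFieldReferenceExpression(
            new CodeThisReferenceExpression(), "_name");

        CodeMemberProperty prop = new CodeMemberProperty();
        prop.Name = "Name";
        prop.Attributes = MemberAttributes.Public;
        prop.Type = new CodeTypeReference(typeof(string));
        prop.HasGet = true;
        prop.GetStatements.Add(new CodeMethodReturnStatement(fieldRef));
        prop.HasSet = true;
        prop.SetStatements.Add(new CodeAssignStatement(
            fieldRef, new CodePropertySetValueReferenceExpression()));
        ctd.Members.Add(prop);

        return ctd;
    }

    // Renders the same graph with whichever provider is passed in.
    public static string Render(CodeDomProvider provider)
    {
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromType(BuildType(), writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }

    public static string RenderCSharp()
    {
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            return Render(provider);
        }
    }

    public static string RenderVisualBasic()
    {
        using (VBCodeProvider provider = new VBCodeProvider())
        {
            return Render(provider);
        }
    }
}
```

Only the provider changes between the two calls. Any snippet objects in the graph would be copied verbatim into both outputs, which is where portability is lost.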

Our final code is the method to return a cheery greeting.

    //Our SayHello method

    CodeMemberMethod myMethod = new CodeMemberMethod();
    myMethod.Name = "SayHello";
    myMethod.ReturnType = new CodeTypeReference(typeof(string));
    myMethod.Comments.Add(new 
       CodeCommentStatement("Returns a happy greeting", true));
    myMethod.Attributes = MemberAttributes.Public;
    myMethod.Statements.Add(
        new CodeMethodReturnStatement(new 
            CodeSnippetExpression("\"Hello, \" + this.Name")));

    ctd.Members.Add(myMethod);

This simple method gave me more trouble than any of the others, if you can believe that. CodeDOM is a very simplistic, bare-bones framework for the basic operations a programmer will encounter. Unfortunately, string concatenation actually works by overloading the addition operator, making it something of an oddball case, despite being one of the most common operations in many types of programming. I tried to overcome this by stringing a series of statements together into an assignment, but this is not currently supported. I could have used a StringBuilder, calling its Append method, but that was just too many contortions for my comfort.
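One alternative worth trying before falling back to a snippet is CodeBinaryOperatorExpression with CodeBinaryOperatorType.Add, which keeps the concatenation inside the graph. The sketch below is a suggestion, not the approach used in this article. `ConcatDemo` is a hypothetical wrapper, and it renders through the .NET 2.0+ provider methods; I have not checked how earlier framework versions behave.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class ConcatDemo
{
    // Builds SayHello() with the concatenation expressed as a graph node
    // rather than as a literal C# snippet.
    public static string Generate()
    {
        CodeBinaryOperatorExpression greeting = new CodeBinaryOperatorExpression(
            new CodePrimitiveExpression("Hello, "),
            CodeBinaryOperatorType.Add,
            new CodePropertyReferenceExpression(
                new CodeThisReferenceExpression(), "Name"));

        CodeMemberMethod method = new CodeMemberMethod();
        method.Name = "SayHello";
        method.ReturnType = new CodeTypeReference(typeof(string));
        method.Attributes = MemberAttributes.Public;
        method.Statements.Add(new CodeMethodReturnStatement(greeting));

        CodeTypeDeclaration ctd = new CodeTypeDeclaration("HelloWorld");
        ctd.Members.Add(method);

        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        using (StringWriter writer = new StringWriter())
        {
            provider.GenerateCodeFromType(ctd, writer, new CodeGeneratorOptions());
            return writer.ToString();
        }
    }
}
```

The C# generator wraps binary expressions in parentheses, so the emitted return statement will not be character-for-character identical to the hand-written target.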

Our final task is to add our class to our namespace and generate our code, writing to the file stream (and, of course, clean up).

    //write code

    myNamespace.Types.Add(ctd);
    generator.GenerateCodeFromNamespace(myNamespace, sw, codeOpts);
    sw.Flush();

    sw.Close();
    s.Close();
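Two details in the listings above are worth a second look. The CodeCompileUnit is created but never handed to the generator. And from .NET 2.0 onward, CreateGenerator is marked obsolete in favor of calling the generation methods directly on the provider. A minimal sketch of the same write-out step under those assumptions follows. The `GeneratorOutputDemo` class and its `directory` parameter are hypothetical, and `using` blocks replace the explicit Close calls.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using Microsoft.CSharp;

public static class GeneratorOutputDemo
{
    // Writes a minimal SayHello.HelloWorld class to TestClass.cs in the
    // given directory and returns the full path of the file.
    public static string WriteUnit(string directory)
    {
        CodeNamespace myNamespace = new CodeNamespace("SayHello");
        myNamespace.Imports.Add(new CodeNamespaceImport("System"));
        myNamespace.Types.Add(new CodeTypeDeclaration("HelloWorld"));

        // The compile unit is the root of the graph; namespaces hang off it.
        CodeCompileUnit unit = new CodeCompileUnit();
        unit.Namespaces.Add(myNamespace);

        CodeGeneratorOptions codeOpts = new CodeGeneratorOptions();
        codeOpts.BracingStyle = "C"; // opening braces on their own line

        string fspec = Path.Combine(directory, "TestClass.cs");

        // The using blocks flush and close the writer even if generation throws.
        using (CSharpCodeProvider codeProvider = new CSharpCodeProvider())
        using (StreamWriter sw = new StreamWriter(fspec, false))
        {
            codeProvider.GenerateCodeFromCompileUnit(unit, sw, codeOpts);
        }

        return fspec;
    }
}
```

Calling `GeneratorOutputDemo.WriteUnit(Path.GetTempPath())` would write the file to the temp folder. Generating from the compile unit also emits an auto-generated header comment, which may or may not be what you want.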

Attach a call to your BuildClass method from a button and voila! Push button code building.

If you want to test your generated class, you'll need to create a new Windows or Web app, include a reference to your class and namespace in the form code, and utilize its public methods. Below is a very simple sample of usage:

    private void button1_Click(object sender, System.EventArgs e)
    {
        HelloWorld hw = new HelloWorld();
        hw.Name = this.textBox1.Text;
        textBox2.Text = hw.SayHello();
    }

Conclusion

While CodeDOM is currently not complete enough to build all the tight and robust methods one might wish for, it is an interesting exploration into how to represent and generate repetitious code in a programmatic manner. Compound statements and string building seem to be beyond scope for this release of CodeDOM. I was forced to resort to simple snippet use to get what I wanted. I found its abstracted methods for representing common programming tasks to be much more difficult than simply writing templates and performing token replacement. It doesn't stop one from writing out literal code when needed or convenient, so the limitations are not insurmountable. It also does a fair job of code formatting and indentation, which I have found to be endlessly tedious and error prone in the templated systems I've built.

All that said, I am going to continue using CodeDOM for Part 2. I suspect support will improve with the .NET 2.0 Framework and Visual Studio 2005. A good reference source for CodeDOM objects is the CodeDOM Quick Reference in the MSDN library. Should you use CodeDOM? Read Part 2 to find out.

Preview Part II

In Part II, we'll use our code building knowledge to write a tool to auto-generate entity and factory classes by reading a SQL Server database. Have your NorthWind sample installed!