
TransformXML - a command line utility to apply XSL transforms

27 Nov 2006
A command line utility wrapping the XslCompiledTransform class.

[Figure: TransformXml command line options]

Introduction

TransformXml is a command line utility which applies an XSL transform file to an XML input file to generate an output file.

It uses the XslCompiledTransform class (introduced in .NET 2.0) to do this.
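As a rough illustration of what the utility wraps, here is a minimal sketch of loading a stylesheet and applying it with XslCompiledTransform. It is not TransformXml's own code: the class name MinimalTransform, the Apply method, and the inline stylesheet are assumptions made up for this example.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of TransformXml.
public static class MinimalTransform
{
    // Applies the stylesheet text to the XML text and returns the output.
    public static string Apply(string xml, string xslt)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(styleReader);
        }

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        // OutputSettings carries over the stylesheet's <xsl:output> choices.
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, null, writer);
        }
        return output.ToString();
    }

    public static void Main()
    {
        string xslt =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "Hello, <xsl:value-of select='/greeting/@to'/>!"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

        Console.WriteLine(Apply("<greeting to='world'/>", xslt));
    }
}
```

TransformXml itself reads the input and stylesheet from files, URIs, or standard input rather than from strings.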

It includes support for:

  • passing parameters to the XSL transform file.
  • extending XSL with custom functions by passing assembly and class names in the command line (the instantiated class is known as an "extension object").
  • optionally running in XSL debug mode (to enable stepping into the XSLT in the Visual Studio 2005 IDE).
  • optionally disabling features which may introduce security threats, such as scripting, accessing external resources, and calling the document() function.
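The debug and security options in the list above correspond to settings that .NET exposes on XslCompiledTransform and XsltSettings. The sketch below shows one plausible way to wire them up; the TransformSetup class and its parameter names are assumptions for illustration, not the utility's actual code.

```csharp
using System.Xml;
using System.Xml.Xsl;

// Hypothetical helper, not part of TransformXml.
public static class TransformSetup
{
    // XsltSettings takes (enableDocumentFunction, enableScript), in that order.
    public static XsltSettings BuildSettings(
        bool allowDocumentFunction, bool allowScript)
    {
        return new XsltSettings(allowDocumentFunction, allowScript);
    }

    public static XslCompiledTransform Load(
        string stylesheetUri, bool debug,
        bool allowDocumentFunction, bool allowScript,
        bool allowExternalResources)
    {
        // Passing true enables debug information for stepping into the XSLT.
        XslCompiledTransform transform = new XslCompiledTransform(debug);

        // A null resolver stops <xsl:import> and <xsl:include>
        // from being resolved.
        XmlResolver resolver = allowExternalResources
            ? new XmlUrlResolver()
            : null;

        transform.Load(
            stylesheetUri,
            BuildSettings(allowDocumentFunction, allowScript),
            resolver);
        return transform;
    }
}
```

A call such as TransformSetup.Load("report.xslt", false, false, false, true) would mirror the defaults described for the -d, -m and -r switches (report.xslt is a made-up file name).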

The utility extends XslCompiledTransform's extension object functionality by also allowing static classes to be passed as extension objects. This is achieved by using CodeDOM to dynamically generate an instantiable class whose methods call the corresponding methods of the static class.

This allows methods of classes such as the static File class to be called from the XSL transform file.

A discussion of the uses for XSLT and TransformXml will take place in a follow-up article.

Background

Some time back, a colleague created a daily build process using FinalBuilder. As well as building the executable, this also tested the SQL deployment scripts on a recent copy of each customer's database.

His original intention was to do a simple check for SQL errors in each output file. All output files with errors would be e-mailed to all developers, who would need to open each file, search for the error messages, see if the errors were in any of their change scripts, and fix those that were.

This would have been a pain for developers and might have increased resistance to the daily build process. So I decided to tackle this problem as a "hobby project".

My solution was to write three general purpose command line utilities, then use them to generate and e-mail a personalized error report to each affected developer.

The three utilities were:

  • RegexToXml: to parse the SQL output files for errors and warnings and output the results as a separate XML file for each database.
  • TransformXml: XSL transforms were written to:
    • Generate various batch files piecing together the process (batch files with a command per customer or developer were generated from central Databases.xml and Developers.xml files.)
    • Concatenate the XML files for the various customer databases.
    • Re-order the XML nodes by developer, then by database, then by change script.
    • Generate an HTML file per developer with details of the errors in that developer's change scripts (grouped by database).
  • SendSMTP: to e-mail the HTML file (as the body of the e-mail) to each affected developer.

I will be posting separate articles on each of these utilities, as well as a final article which will demonstrate the entire process in action. This final article will contain a deeper discussion of XSLT, with tips and snippets.

Using the code

The structure of the application

The application consists of a single executable, TransformXml.exe.

I did not write the application in a multi-layered fashion. The utility is a small wrapper around .NET's XslCompiledTransform class. So in a sense, the .NET base class libraries are the "business logic layer" of the application!

Because this is such a small utility, I also didn't bother to create any unit tests for it. At times, I have regretted this decision!

Here's the class diagram for the project...

[Figure: Class diagram for TransformXml]

Command line switches

The command line utility uses a hyphen followed by a single character to specify a command line switch. The related setting can either be appended to the switch, or passed as the next command line argument. For example:

-ic:\temp\InputFile.xml

or:

-i c:\temp\InputFile.xml
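Both forms can be handled by one small loop. The following is a simplified sketch of that idea, not the parser TransformXml actually uses: SwitchParser is a hypothetical name, only the last value of a repeated switch is kept, and a following argument is treated as the value only if it does not itself start with a hyphen.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of TransformXml.
public static class SwitchParser
{
    public static Dictionary<char, string> Parse(string[] args)
    {
        Dictionary<char, string> result = new Dictionary<char, string>();

        for (int i = 0; i < args.Length; i++)
        {
            string arg = args[i];
            if (arg.Length < 2 || arg[0] != '-')
            {
                throw new ArgumentException(
                    "Expected a switch but found '" + arg + "'.");
            }

            char name = arg[1];
            string value = arg.Substring(2);   // "-ifile.xml" form

            // "-i file.xml" form: take the next argument as the value.
            if (value.Length == 0 && i + 1 < args.Length
                && !args[i + 1].StartsWith("-", StringComparison.Ordinal))
            {
                value = args[++i];
            }

            result[name] = value;
        }
        return result;
    }

    public static void Main()
    {
        Dictionary<char, string> switches = Parse(
            new string[] { @"-ic:\temp\InputFile.xml", "-o", "out.txt" });
        Console.WriteLine(switches['i']);
        Console.WriteLine(switches['o']);
    }
}
```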

To view the list of command line options, run TransformXml without any parameters or with the -? switch. Below is a list of all the command line switches:

| Switch | Value | Notes |
|--------|-------|-------|
| -? | | Display the command line switches. If used, it must be the only parameter. |
| -i | Input file name or URI containing the XML to transform | If omitted, the XML is read from the standard input stream. |
| -s | Stylesheet file name or URI | I.e., the .xsl or .xslt file. |
| -t | Transform file name or URI | Identical to the -s switch. |
| -o | Generated output file | If omitted, the text will be sent to the console's output stream. |
| -e | Error log output file | |
| -a | One or more parameters in the format: ParameterName=Value | Binds a value to the matching <xsl:param> parameter in the stylesheet. |
| -x | XML namespace; assembly name/path; class name; [optional additional assembly references] | Define extension objects. If the class has a default constructor, then it is instantiated, and its methods may be called from the stylesheet. If it is a static class, or a class with static methods but no default constructor, then an instantiable wrapper class will be generated dynamically with the corresponding static methods (which will call the wrapped class's static methods). Additional assembly references may be required for generating the wrapper class. |
| -d | + or - | Allow the stylesheet to call the document() function. Off by default, for security reasons. |
| -m | + or - | Allow script blocks within the stylesheet. Off by default, for security reasons. |
| -r | + or - | Allow access to external resources (e.g., using the <xsl:import> and <xsl:include> directives). Allowed by default. |
| -g | + or - | Enable XSLT debugging mode. This allows the developer to step into the stylesheet while debugging the code within the Visual Studio 2005 IDE. |
| -v | +, - or * | Verbose mode. On by default. Shows extra progress information. This is ignored if the XML output is being written to the console instead of to a file. The * option is a special case of verbose mode. It also displays the dynamically generated C# code which creates extension object wrappers for static classes. |
| -p | + or - | Prompt to exit. Off by default. This is useful when running the utility in debug mode, as it gives you a chance to see the results before the console window disappears. |
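To make the -a switch more concrete, the sketch below shows how a ParameterName=Value setting could be bound to a stylesheet parameter through XsltArgumentList.AddParam. The ParamBinding class is hypothetical, and it assumes parameters live in the default (empty) namespace; the utility's real handling may differ.

```csharp
using System;
using System.Xml.Xsl;

// Hypothetical helper, not part of TransformXml.
public static class ParamBinding
{
    // Splits "Name=Value" at the first '=' and registers the parameter.
    public static void AddParameter(XsltArgumentList args, string setting)
    {
        int equalsIndex = setting.IndexOf('=');
        if (equalsIndex <= 0)
        {
            throw new ArgumentException(
                "Expected ParameterName=Value but found '" + setting + "'.");
        }

        string name = setting.Substring(0, equalsIndex);
        string value = setting.Substring(equalsIndex + 1);

        // An empty namespace URI matches an unprefixed <xsl:param name="...">.
        args.AddParam(name, String.Empty, value);
    }

    public static void Main()
    {
        XsltArgumentList args = new XsltArgumentList();
        AddParameter(args, "Developer=Andrew");
        Console.WriteLine(args.GetParam("Developer", String.Empty));
    }
}
```

The resulting XsltArgumentList is then passed as the arguments parameter of Transform(), and a top-level <xsl:param name="Developer"/> in the stylesheet receives the value.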

Points of interest

Creating extension objects

XslCompiledTransform.Transform() has many overloads. TransformXml uses the overload with the following parameters:

  • XmlReader input
  • XsltArgumentList arguments
  • XmlWriter results
  • XmlResolver documentResolver

XsltArgumentList.AddExtensionObject(string namespaceUri, Object extension) is used to add any .NET object as an extension object. namespaceUri is an XML namespace which will be used within the stylesheet to identify the extension object to call. Any public method of that object can be called from within the stylesheet, with certain restrictions (for example, methods with a variable number of arguments are not supported).
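Here is a small, self-contained sketch of an extension object in use. The TextHelper class, its Shout method, and the urn:example:text-helper namespace are all invented for this example; only AddExtensionObject and the Transform overload come from the framework.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical extension class: any public instance method is callable.
public class TextHelper
{
    public string Shout(string text)
    {
        return text.ToUpperInvariant() + "!";
    }
}

public static class ExtensionDemo
{
    // The same URI must be used here and in the stylesheet's xmlns:ext.
    public const string Ns = "urn:example:text-helper";

    public static string Run(string xml)
    {
        string xslt =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
          + " xmlns:ext='" + Ns + "'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:value-of select='ext:Shout(string(/msg))'/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";

        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader styleReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(styleReader);
        }

        XsltArgumentList args = new XsltArgumentList();
        args.AddExtensionObject(Ns, new TextHelper());

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(xml)))
        {
            transform.Transform(input, args, output);
        }
        return output.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Run("<msg>hello</msg>"));
    }
}
```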

TransformXml uses the -x parameter to specify the namespace URI, as well as the assembly and class for the extension object. It uses this information to instantiate the object and add it as an extension object.

The code to do this is shown below. Note that extensionSetting is in the format namespaceUri;assemblyName;className[;assemblyRef1;assemblyRef2...].

C#
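// Requires: using System; using System.Collections.Generic;
// using System.IO; using System.Reflection;
// using System.Text.RegularExpressions;
private static void ParseExtensionObjectParameter(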
    string extensionSetting, out string namespaceUri,
    out object extensionObject, TextWriter dynamicCodeWriter)
{
    const string pattern
        = @"^(?:(?<namespace>[^;]*);)?(?<assembly>[^;]+);"
        + @"(?<class>[^;]+)(?:;(?<assemblyRef>[^;]+))*$";
    Regex extObjRegex = new Regex(pattern, RegexOptions.None);

    Match mat = extObjRegex.Match(extensionSetting);
    if (!mat.Success)
    {
        throw new Exception(String.Format(
            @"Invalid format for extension object "
            + @"parameter '{0}'.",
            extensionSetting));
    }

    if (mat.Groups["namespace"].Success)
    {
        namespaceUri = mat.Groups["namespace"].Value;
    }
    else
    {
        /* Use an empty string to denote the 
         * default namespace: 
         */
        namespaceUri = String.Empty;
    }

    string assemblyName = mat.Groups["assembly"].Value;
    string extensionClassName = mat.Groups["class"].Value;

    List<string> assemblyReferences = new List<string>();

    Group assemblyRefGroup = mat.Groups["assemblyRef"];

    if (assemblyRefGroup.Success)
    {
        foreach (Capture cap in assemblyRefGroup.Captures)
        {
            assemblyReferences.Add(cap.Value);
        }
    }

    Assembly extensionAssembly;

    /* There are 2 ways of loading assemblies.
     * The first uses the full path to the assembly.
     * The second also looks in the GAC.
     */
    if (File.Exists(assemblyName))
    {
        extensionAssembly = Assembly.LoadFrom(assemblyName);
    }
    else
    {
        extensionAssembly = Assembly.Load(assemblyName);
    }

    /* Check if the type is static, and if so, 
     * generate an object as a wrapper around it:
     */
    Type extensionClassType = extensionAssembly.GetType(
        extensionClassName, true /*throwOnError*/);

    /* Type.GetMethods() never returns constructors, so look up
     * the public parameterless constructor directly:
     */
    bool hasDefaultConstructor
        = extensionClassType.GetConstructor(Type.EmptyTypes) != null;

    bool hasPublicStaticMethods = false;

    /* Search for a public static method that can be wrapped
     * (GetMethods() only returns public methods):
     */
    foreach (MethodInfo methInfo 
      in extensionClassType.GetMethods())
    {
        if (methInfo.IsStatic)
        {
            hasPublicStaticMethods = true;
            break;
        }
    }

    if (!hasDefaultConstructor && hasPublicStaticMethods)
    {
        extensionObject = CreateObjectWrapperAroundStaticClass(
            extensionClassType, dynamicCodeWriter, 
            assemblyReferences);
    }
    else
    {
        extensionObject 
            = Activator.CreateInstance(extensionClassType);
    }
}

Note that the code can load assemblies from the GAC or from the file system. The assembly reference can include a path, or the assembly can be in the current folder, in which case Assembly.LoadFrom() is called. If the assembly file can't be found, then Assembly.Load() is called instead. In this case, it will either be loaded from the GAC, if it is there, or .NET will throw an exception because it can't be found.
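The optional namespace group in the regular expression has a consequence that is easy to miss, so the sketch below runs the same pattern against a few sample settings. The ExtensionSettingDemo class, the sample assembly and class names, and the output format are made up for illustration.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical demo class; only the pattern is taken from the article.
public static class ExtensionSettingDemo
{
    private const string Pattern
        = @"^(?:(?<namespace>[^;]*);)?(?<assembly>[^;]+);"
        + @"(?<class>[^;]+)(?:;(?<assemblyRef>[^;]+))*$";

    // Summarises how a -x setting is split into its named groups.
    public static string Describe(string setting)
    {
        Match mat = Regex.Match(setting, Pattern);
        if (!mat.Success)
        {
            return "no match";
        }
        return String.Format(
            "ns='{0}' assembly='{1}' class='{2}' refs={3}",
            mat.Groups["namespace"].Value,
            mat.Groups["assembly"].Value,
            mat.Groups["class"].Value,
            mat.Groups["assemblyRef"].Captures.Count);
    }

    public static void Main()
    {
        // Two parts: the namespace group is skipped.
        Console.WriteLine(Describe("MyLib.dll;My.Helper"));
        // Three parts: the first part is always read as the namespace.
        Console.WriteLine(Describe("urn:files;mscorlib;System.IO.File"));
        // A leading ';' gives an empty namespace plus an assembly reference.
        Console.WriteLine(Describe(";MyLib.dll;My.Helper;System.Xml.dll"));
    }
}
```

Because the regex engine tries the optional namespace group first, a three-part setting is always interpreted as namespace;assembly;class. Under this pattern, passing additional assembly references with the default namespace therefore needs a leading semicolon.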

[Note: This code highlights a limitation of TransformXml... non-static methods of classes which don't have a default constructor cannot be called! This is because TransformXml has no way of instantiating the class, because it doesn't know which constructor to use, and what parameters to pass to the constructor.

On the other hand, TransformXml provides a workaround for a restriction of XslCompiledTransform. This is discussed in the following section...]

Using CodeDOM to generate instantiable wrappers around static classes

The extension parameter in XsltArgumentList.AddExtensionObject(string namespaceUri, Object extension) must be an object instance. Ordinarily, this would mean that static classes, which cannot be instantiated, can't be used as extension objects. I was unhappy with this restriction, because I wanted to use the static File class in one of my XSL stylesheets.

My solution to this problem was to generate a wrapper class with a default constructor, and with an identical "wrapper method" for each static method of the wrapped class.

Initially, I experimented with using System.Reflection.Emit to create the wrapper class using MSIL (Microsoft Intermediate Language). I quickly found myself out of my depth, as I had no previous knowledge of MSIL.

I then switched to using System.CodeDom, which proved to be far simpler. You can find the code in StaticClassInstanceGenerator.cs.

Note that TransformXml will also generate a wrapper class around non-static classes which don't have a default constructor, but which do have static methods. These static methods can then be called from within the stylesheet.
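The wrapper idea can be sketched in a few dozen lines of CodeDOM. This is not the code from StaticClassInstanceGenerator.cs: the WrapperSketch and Greeter classes and the Example.DynamicAssemblies namespace are assumptions, the sketch only generates C# source rather than compiling it, and it skips generic methods and methods with ref, out, or pointer parameters.

```csharp
using System;
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using System.Reflection;
using Microsoft.CSharp;

// Hypothetical static class to wrap.
public static class Greeter
{
    public static string Greet(string name) { return "Hello " + name; }
}

public static class WrapperSketch
{
    // Returns C# source for an instantiable class whose methods
    // forward to the public static methods of staticType.
    public static string GenerateWrapperSource(Type staticType)
    {
        CodeTypeDeclaration wrapper
            = new CodeTypeDeclaration(staticType.Name + "Wrapper");
        wrapper.TypeAttributes = TypeAttributes.Public;

        CodeConstructor constructor = new CodeConstructor();
        constructor.Attributes = MemberAttributes.Public;
        wrapper.Members.Add(constructor);

        foreach (MethodInfo method in staticType.GetMethods(
            BindingFlags.Public | BindingFlags.Static))
        {
            if (method.IsGenericMethod || method.IsSpecialName) continue;

            CodeMemberMethod forwarder = new CodeMemberMethod();
            forwarder.Name = method.Name;
            // Final makes the generated method non-virtual.
            forwarder.Attributes
                = MemberAttributes.Public | MemberAttributes.Final;
            forwarder.ReturnType = new CodeTypeReference(method.ReturnType);

            CodeMethodInvokeExpression call = new CodeMethodInvokeExpression(
                new CodeTypeReferenceExpression(staticType), method.Name);

            bool supported = true;
            foreach (ParameterInfo parameter in method.GetParameters())
            {
                Type type = parameter.ParameterType;
                if (type.IsByRef || type.IsPointer) { supported = false; break; }
                forwarder.Parameters.Add(
                    new CodeParameterDeclarationExpression(type, parameter.Name));
                call.Parameters.Add(
                    new CodeArgumentReferenceExpression(parameter.Name));
            }
            if (!supported) continue;

            if (method.ReturnType == typeof(void))
                forwarder.Statements.Add(call);
            else
                forwarder.Statements.Add(new CodeMethodReturnStatement(call));

            wrapper.Members.Add(forwarder);
        }

        CodeNamespace ns = new CodeNamespace("Example.DynamicAssemblies");
        ns.Types.Add(wrapper);
        CodeCompileUnit unit = new CodeCompileUnit();
        unit.Namespaces.Add(ns);

        StringWriter source = new StringWriter();
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            provider.GenerateCodeFromCompileUnit(
                unit, source, new CodeGeneratorOptions());
        }
        return source.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(GenerateWrapperSource(typeof(Greeter)));
    }
}
```

On the .NET Framework, the same compile unit could then be compiled in memory (for example with CodeDomProvider.CompileAssemblyFromDom) and the resulting wrapper type instantiated and registered as the extension object.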

Tip: You can view the generated code by using the -v* command line switch.

Below is the code generated for the System.IO.File class (re-formatted to fit into the web page comfortably):

C#
namespace AndrewTweddle.Tools.DynamicAssemblies
{
    
    public class FileWrapper
    {
        
        public FileWrapper()
        {
        }
        
        public void Delete(string path)
        {
            System.IO.File.Delete(path);
        }
        
        public bool Exists(string path)
        {
            return System.IO.File.Exists(path);
        }
        
        public System.IO.StreamReader OpenText(string path)
        {
            return System.IO.File.OpenText(path);
        }
        
        public System.IO.StreamWriter CreateText(string path)
        {
            return System.IO.File.CreateText(path);
        }
        
        public System.IO.StreamWriter AppendText(string path)
        {
            return System.IO.File.AppendText(path);
        }
        
        public void Copy(string sourceFileName, string destFileName)
        {
            System.IO.File.Copy(sourceFileName, destFileName);
        }
        
        public void Copy(string sourceFileName, string destFileName, 
          bool overwrite)
        {
            System.IO.File.Copy(sourceFileName, destFileName, 
                overwrite);
        }
        
        public System.IO.FileStream Create(string path)
        {
            return System.IO.File.Create(path);
        }
        
        public System.IO.FileStream Create(string path, int bufferSize)
        {
            return System.IO.File.Create(path, bufferSize);
        }
        
        public System.IO.FileStream Create(string path, int bufferSize, 
            System.IO.FileOptions options)
        {
            return System.IO.File.Create(path, bufferSize, options);
        }
        
        public System.IO.FileStream Create(string path, 
            int bufferSize, System.IO.FileOptions options, 
            System.Security.AccessControl.FileSecurity fileSecurity)
        {
            return System.IO.File.Create(path, bufferSize, options, 
                fileSecurity);
        }
        
        public void Decrypt(string path)
        {
            System.IO.File.Decrypt(path);
        }
        
        public void Encrypt(string path)
        {
            System.IO.File.Encrypt(path);
        }
        
        public System.IO.FileStream Open(string path, 
            System.IO.FileMode mode)
        {
            return System.IO.File.Open(path, mode);
        }
        
        public System.IO.FileStream Open(string path, 
            System.IO.FileMode mode, System.IO.FileAccess access)
        {
            return System.IO.File.Open(path, mode, access);
        }
        
        public System.IO.FileStream Open(string path, 
            System.IO.FileMode mode, System.IO.FileAccess access, 
            System.IO.FileShare share)
        {
            return System.IO.File.Open(path, mode, access, share);
        }
        
        public void SetCreationTime(string path, 
            System.DateTime creationTime)
        {
            System.IO.File.SetCreationTime(path, creationTime);
        }
        
        public void SetCreationTimeUtc(string path, 
            System.DateTime creationTimeUtc)
        {
            System.IO.File.SetCreationTimeUtc(path, creationTimeUtc);
        }
        
        public System.DateTime GetCreationTime(string path)
        {
            return System.IO.File.GetCreationTime(path);
        }
        
        public System.DateTime GetCreationTimeUtc(string path)
        {
            return System.IO.File.GetCreationTimeUtc(path);
        }
        
        public void SetLastAccessTime(string path, 
            System.DateTime lastAccessTime)
        {
            System.IO.File.SetLastAccessTime(path, lastAccessTime);
        }
        
        public void SetLastAccessTimeUtc(string path, 
            System.DateTime lastAccessTimeUtc)
        {
            System.IO.File.SetLastAccessTimeUtc(path, 
                lastAccessTimeUtc);
        }
        
        public System.DateTime GetLastAccessTime(string path)
        {
            return System.IO.File.GetLastAccessTime(path);
        }
        
        public System.DateTime GetLastAccessTimeUtc(string path)
        {
            return System.IO.File.GetLastAccessTimeUtc(path);
        }
        
        public void SetLastWriteTime(string path, 
            System.DateTime lastWriteTime)
        {
            System.IO.File.SetLastWriteTime(path, lastWriteTime);
        }
        
        public void SetLastWriteTimeUtc(string path, 
            System.DateTime lastWriteTimeUtc)
        {
            System.IO.File.SetLastWriteTimeUtc(path, lastWriteTimeUtc);
        }
        
        public System.DateTime GetLastWriteTime(string path)
        {
            return System.IO.File.GetLastWriteTime(path);
        }
        
        public System.DateTime GetLastWriteTimeUtc(string path)
        {
            return System.IO.File.GetLastWriteTimeUtc(path);
        }
        
        public System.IO.FileAttributes GetAttributes(string path)
        {
            return System.IO.File.GetAttributes(path);
        }
        
        public void SetAttributes(string path, 
            System.IO.FileAttributes fileAttributes)
        {
            System.IO.File.SetAttributes(path, fileAttributes);
        }
        
        public System.Security.AccessControl.FileSecurity 
            GetAccessControl(string path)
        {
            return System.IO.File.GetAccessControl(path);
        }
        
        public System.Security.AccessControl.FileSecurity 
            GetAccessControl(string path, 
            System.Security.AccessControl.AccessControlSections 
                includeSections)
        {
            return System.IO.File.GetAccessControl(path, 
                includeSections);
        }
        
        public void SetAccessControl(string path, 
            System.Security.AccessControl.FileSecurity fileSecurity)
        {
            System.IO.File.SetAccessControl(path, fileSecurity);
        }
        
        public System.IO.FileStream OpenRead(string path)
        {
            return System.IO.File.OpenRead(path);
        }
        
        public System.IO.FileStream OpenWrite(string path)
        {
            return System.IO.File.OpenWrite(path);
        }
        
        public string ReadAllText(string path)
        {
            return System.IO.File.ReadAllText(path);
        }
        
        public string ReadAllText(string path, 
            System.Text.Encoding encoding)
        {
            return System.IO.File.ReadAllText(path, encoding);
        }
        
        public void WriteAllText(string path, string contents)
        {
            System.IO.File.WriteAllText(path, contents);
        }
        
        public void WriteAllText(string path, string contents, 
            System.Text.Encoding encoding)
        {
            System.IO.File.WriteAllText(path, contents, encoding);
        }
        
        public byte[] ReadAllBytes(string path)
        {
            return System.IO.File.ReadAllBytes(path);
        }
        
        public void WriteAllBytes(string path, byte[] bytes)
        {
            System.IO.File.WriteAllBytes(path, bytes);
        }
        
        public string[] ReadAllLines(string path)
        {
            return System.IO.File.ReadAllLines(path);
        }
        
        public string[] ReadAllLines(string path, 
            System.Text.Encoding encoding)
        {
            return System.IO.File.ReadAllLines(path, encoding);
        }
        
        public void WriteAllLines(string path, string[] contents)
        {
            System.IO.File.WriteAllLines(path, contents);
        }
        
        public void WriteAllLines(string path, string[] contents, 
            System.Text.Encoding encoding)
        {
            System.IO.File.WriteAllLines(path, contents, encoding);
        }
        
        public void AppendAllText(string path, string contents)
        {
            System.IO.File.AppendAllText(path, contents);
        }
        
        public void AppendAllText(string path, string contents, 
            System.Text.Encoding encoding)
        {
            System.IO.File.AppendAllText(path, contents, encoding);
        }
        
        public void Move(string sourceFileName, string destFileName)
        {
            System.IO.File.Move(sourceFileName, destFileName);
        }
        
        public void Replace(string sourceFileName, 
            string destinationFileName, 
            string destinationBackupFileName)
        {
            System.IO.File.Replace(sourceFileName, 
                destinationFileName, destinationBackupFileName);
        }
        
        public void Replace(string sourceFileName, 
            string destinationFileName, 
            string destinationBackupFileName, bool ignoreMetadataErrors)
        {
            System.IO.File.Replace(sourceFileName, destinationFileName, 
                destinationBackupFileName, ignoreMetadataErrors);
        }
    }
}

To be continued...

I will be posting follow-up articles on SendSmtp and the SQL daily build error reporting utility. The latter article will contain a lot more details on XSLT and TransformXml. Stay tuned...

History

  • 23 November 2006
    • Initial version submitted.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.



Written By
Architect, Dariel Solutions
South Africa
Andrew Tweddle started his career as an Operations Researcher, but made the switch to programming in 1997. His current programming passions are Powershell and WPF.

He has worked for one of the "big 4" banks in South Africa as a software team lead and an architect, at a Dynamics CRM consultancy and is currently an architect at Dariel Solutions working on software for a leading private hospital network.

Before that he spent 7 years at SQR Software in Pietermaritzburg, where he was responsible for the resource planning and budgeting module in CanePro, their flagship product for the sugar industry.

He enjoys writing utilities to streamline the software development and deployment process. He believes Powershell is a killer app for doing this.

Andrew is a board game geek (see www.boardgamegeek.com) with a collection of over 190 games! He also enjoys digital photography, camping and solving puzzles - especially Mathematics problems.

His Myers-Briggs personality profile is INTJ.

He lives with his wife, Claire and his daughters Lauren and Catherine in Johannesburg, South Africa.
