Remove Space in .NET Serialization of Empty XML Element

9 Nov 2012
Shows how to get the .NET XmlWriter class to write empty element tags without the extra space before "/>"

Introduction

When serializing an XML document in .NET, the XmlWriter class always inserts a space in empty element tags: instead of the more compact <elem/>, it emits <elem />. There is no setting to switch this off. I did not like that superfluous space and looked for a way to work around it.
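As a minimal sketch of the behavior described above, the following writes a single empty element with the stock XmlWriter. The names EmptyElementDemo and WriteEmptyElement are made up for this illustration; only the XmlWriter calls come from the framework.

```csharp
using System.Text;
using System.Xml;

// Hypothetical helper, not part of the article's code.
static class EmptyElementDemo
{
    // Serializes one empty element with the stock XmlWriter and returns the markup.
    public static string WriteEmptyElement(string name)
    {
        var sb = new StringBuilder();
        // Leave out the XML declaration so that only the element is returned.
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement(name);
            writer.WriteEndElement(); // no content was written, so this emits an empty tag
        }
        return sb.ToString();
    }
}
```

Calling EmptyElementDemo.WriteEmptyElement("elem") returns <elem />, including the space.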

Background

Why does Microsoft think there should be a space before the closing "/>"? After a bit of searching, I found that XHTML 1.0 recommended this in its HTML Compatibility Guidelines. XHTML is designed to be more or less backwards compatible with HTML. Older HTML parsers read "<br/>" not as the tag br but as a tag named br/. Putting a space before the "/>" lets them recognize br and ignore the "/>".

But I see no reason to always insert this space when I write XML that is not XHTML.

Some people suggested serializing the XML to a string and then replacing all occurrences of " />" with "/>". But this is dangerous: this character sequence may also appear inside CDATA sections.
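To make the danger concrete, here is a small sketch in which a CDATA section happens to contain the text "<br />". The class NaiveReplaceDemo and its methods are hypothetical names for this example only.

```csharp
using System.Text;
using System.Xml;

// Hypothetical demo class, not part of the article's code.
static class NaiveReplaceDemo
{
    // Builds a document with a CDATA section followed by an empty element.
    public static string Serialize()
    {
        var sb = new StringBuilder();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("root");
            writer.WriteCData("<br />");       // character data that must survive unchanged
            writer.WriteStartElement("empty");
            writer.WriteEndElement();           // the empty element we want to compact
            writer.WriteEndElement();           // closes root
        }
        return sb.ToString();
    }

    // The naive post-processing step: it cannot tell markup from character data.
    public static string NaiveTrim(string xml)
    {
        return xml.Replace(" />", "/>");
    }
}
```

Serialize() returns <root><![CDATA[<br />]]><empty /></root>. After NaiveTrim() the result is <root><![CDATA[<br/>]]><empty/></root>: the empty element is compacted as intended, but the content of the CDATA section has been altered as well.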

Having no better idea, I let the problem rest for a while. Recently, I saw a different approach that dives deep into the .NET runtime to manipulate the character buffer after the empty element tag has been written. Whoa! This worked, but it required calling a new method TrimEndElement() after each call to WriteEndElement(). So it works only if you call XmlWriter directly from your own code, but not if you want to use it with, e.g., XmlDocument.

Using the Code

So I wrote a wrapper class, MtxXmlWriter, that derives from XmlWriter, wraps the original XmlWriter returned by XmlWriter.Create(), and does all the necessary tricks.

Instead of XmlWriter.Create(), you just call one of the MtxXmlWriter.Create() methods; that's all. All other methods are handed straight to the encapsulated original XmlWriter, except for WriteEndElement(). After calling WriteEndElement() on the encapsulated XmlWriter, " />" is replaced with "/>" in the buffer:

C#
this.xw.WriteEndElement();
// trim the end of element
if (this.bufferType == BufferType.Chars)
{
    int bufPos = get_bufPos(internalWriter);
    char[] bufChars = get_bufChars(internalWriter);
    if (bufPos > 3 && bufChars[bufPos - 3] == ' ' && 
           bufChars[bufPos - 2] == '/' && bufChars[bufPos - 1] == '>')
    {
        bufChars[bufPos - 3] = '/';
        bufChars[bufPos - 2] = '>';
        bufPos--;
        set_bufPos(internalWriter, bufPos);
    }
}
else if (this.bufferType == BufferType.Bytes)
{
    int bufPos = get_bufPos(internalWriter);
    byte[] bufBytes = get_bufBytes(internalWriter);
    if (bufPos > 3 && bufBytes[bufPos - 3] == ' ' && 
          bufBytes[bufPos - 2] == '/' && bufBytes[bufPos - 1] == '>')
    {
        bufBytes[bufPos - 3] = (byte)'/';
        bufBytes[bufPos - 2] = (byte)'>';
        bufPos--;
        set_bufPos(internalWriter, bufPos);
    }
}

Notice that there are two types of buffers, depending on which internal XmlWriter class is used. The encoded text writers work on a character buffer, while the ones dealing with UTF-8 work on a byte buffer.
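The trimming step itself does not depend on the writer and can be looked at in isolation. The following sketch applies it to a plain array; BufferTrim and TrimEmptyElementEnd are hypothetical names, and the array is assumed to be filled from index 0, which may differ from the internal writers' own buffer layout.

```csharp
// Hypothetical helper showing the trimming step on a standalone buffer.
static class BufferTrim
{
    // Rewrites a trailing " />" as "/>" in place.
    // pos is the index one past the last written character; the new position is returned.
    public static int TrimEmptyElementEnd(char[] buf, int pos)
    {
        if (pos >= 3 && buf[pos - 3] == ' ' && buf[pos - 2] == '/' && buf[pos - 1] == '>')
        {
            buf[pos - 3] = '/';
            buf[pos - 2] = '>';
            return pos - 1; // the content is now one character shorter
        }
        return pos; // anything else, e.g. "</elem>", is left untouched
    }

    // Same logic for a byte buffer, as used by the UTF-8 writers.
    public static int TrimEmptyElementEnd(byte[] buf, int pos)
    {
        if (pos >= 3 && buf[pos - 3] == (byte)' ' && buf[pos - 2] == (byte)'/' && buf[pos - 1] == (byte)'>')
        {
            buf[pos - 3] = (byte)'/';
            buf[pos - 2] = (byte)'>';
            return pos - 1;
        }
        return pos;
    }
}
```

Only the last three positions are inspected, so earlier content such as CDATA sections is never touched. Both overloads can share the comparison with character literals because ' ', '/' and '>' are single bytes in UTF-8.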

But what if the buffer gets full during WriteEndElement(), so that it is flushed and we can't modify the contents afterwards? To prevent this, we flush the buffer before calling WriteEndElement() on the encapsulated XmlWriter when there is not enough space left:

C#
// ensure that there is enough space in the buffer for the element end
if (this.bufferType != BufferType.Unknown)
{
    int bufPos = get_bufPos(internalWriter);
    int bufLen = get_bufLen(internalWriter);
    if ((bufPos + 3) >= bufLen)
    {
        this.FlushBuffer(internalWriter);
    }
} 
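Why the early flush matters can be tried out with a toy writer. MiniBufferedWriter below is a hypothetical stand-in written for this sketch, not the internal .NET class; it flushes to a StringBuilder whenever its small buffer is full, and its guard uses an exact fit check rather than the slightly more cautious comparison in the code above.

```csharp
using System.Text;

// Hypothetical stand-in for a buffered writer; capacity must be at least 3.
class MiniBufferedWriter
{
    private readonly char[] buf;
    private int pos;
    private readonly StringBuilder sink = new StringBuilder();

    public MiniBufferedWriter(int capacity) { buf = new char[capacity]; }

    // Moves the buffered characters to the sink; they can no longer be edited.
    public void Flush()
    {
        sink.Append(buf, 0, pos);
        pos = 0;
    }

    // Appends text, flushing whenever the buffer is full.
    public void Write(string text)
    {
        foreach (char c in text)
        {
            if (pos == buf.Length) Flush();
            buf[pos++] = c;
        }
    }

    // Writes " />" and trims it to "/>" if all three characters are still buffered.
    // With guard set, the buffer is flushed first when " />" would not fit.
    // Returns true if the trim could be applied.
    public bool WriteTrimmedEnd(bool guard)
    {
        if (guard && pos + 3 > buf.Length) Flush();
        Write(" />");
        if (pos >= 3 && buf[pos - 3] == ' ' && buf[pos - 2] == '/' && buf[pos - 1] == '>')
        {
            buf[pos - 3] = '/';
            buf[pos - 2] = '>';
            pos--;
            return true;
        }
        return false;
    }

    // Flushes the rest and returns everything written so far.
    public string Result()
    {
        Flush();
        return sink.ToString();
    }
}
```

With a capacity of 7, writing "<elem" and then calling WriteTrimmedEnd(false) splits " />" across a flush, so the trim is skipped and the result stays <elem />. With WriteTrimmedEnd(true) the buffer is emptied first, the whole " />" lands in the buffer, and the result is <elem/>.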

All the internal fields and methods used in the above code are dug out using System.Reflection in the constructor, which every Create() overload calls.
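The reflection pattern can be shown on a class of our own. In this sketch, Hidden is a hypothetical class with a private field and a private method, and ReflectionAccess wraps them in delegates in the same style as the get_bufPos, set_bufPos and FlushBuffer delegates above. Names of private members are implementation details and may differ between runtime versions, so the helpers throw if a lookup fails.

```csharp
using System;
using System.Reflection;

// Hypothetical class standing in for an internal writer.
class Hidden
{
    private int bufPos = 5;
    private void FlushBuffer() { bufPos = 0; }
    public override string ToString() { return "bufPos=" + bufPos; }
}

// Hypothetical helpers that turn private members into delegates.
static class ReflectionAccess
{
    private const BindingFlags Flags = BindingFlags.NonPublic | BindingFlags.Instance;

    // Returns a delegate that reads a private int field.
    public static Func<object, int> Getter(Type type, string fieldName)
    {
        FieldInfo field = type.GetField(fieldName, Flags);
        if (field == null) throw new MissingFieldException(type.FullName, fieldName);
        return target => (int)field.GetValue(target);
    }

    // Returns a delegate that writes a private int field.
    public static Action<object, int> Setter(Type type, string fieldName)
    {
        FieldInfo field = type.GetField(fieldName, Flags);
        if (field == null) throw new MissingFieldException(type.FullName, fieldName);
        return (target, value) => field.SetValue(target, value);
    }

    // Returns a delegate that calls a private parameterless method.
    public static Action<object> Invoker(Type type, string methodName)
    {
        MethodInfo method = type.GetMethod(methodName, Flags);
        if (method == null) throw new MissingMethodException(type.FullName, methodName);
        return target => method.Invoke(target, null);
    }
}
```

The lookups are done once and the resulting delegates are reused for every call, which keeps the per-element work down to GetValue, SetValue and Invoke.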

The full code is as follows:

C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Reflection;
using System.Text;
using System.Xml;
using System.Xml.XPath;
using System.Xml.Schema;

namespace Mtx
{
    // This class wraps an XmlWriter and changes serialization of 
    // empty elements from <elemname /> to <elemname/>
    class MtxXmlWriter : XmlWriter
    {
        // The original XmlWriter that we are wrapping in this class
        private XmlWriter xw;

        // Access to the char / byte buffer of the internal XmlRawWriter
        object internalWriter;
        private Func<object, char[]> get_bufChars;
        private Func<object, byte[]> get_bufBytes;
        private Func<object, int> get_bufPos;
        private Action<object, int> set_bufPos;
        private Func<object, int> get_bufLen;
        private Action<object> FlushBuffer;

        // The type of the internal XmlRawWriter's buffer
        private enum BufferType
        {
            Unknown,
            Chars,
            Bytes
        }
        private BufferType bufferType = BufferType.Unknown;

        public override XmlWriterSettings Settings
        {
            get { return this.xw.Settings; }
        }

        public override WriteState WriteState
        {
            get { return this.xw.WriteState; }
        }

        public override string XmlLang
        {
            get { return this.xw.XmlLang; }
        }

        public override XmlSpace XmlSpace
        {
            get { return this.xw.XmlSpace; }
        }

        public MtxXmlWriter(XmlWriter xw)
        {
            this.xw = xw;

            // Get at the XmlRawWriter inside the XmlWriter
            Assembly asm = Assembly.GetAssembly(typeof(XmlWriter));
            Type xmlWellFormedWriterType = asm.GetType("System.Xml.XmlWellFormedWriter");
            BindingFlags flags = BindingFlags.NonPublic | BindingFlags.Instance;
            FieldInfo writerField = xmlWellFormedWriterType.GetField("writer", flags);
            Func<XmlWriter, object> get_writer = w => writerField.GetValue(w);
            this.internalWriter = get_writer(this.xw);

            // Get at the char / byte buffer of the internal XmlWriter
            Type internalWriterType = this.internalWriter.GetType();
            Type xmlEncodedRawTextWriterType       = asm.GetType("System.Xml.XmlEncodedRawTextWriter");
            Type xmlEncodedRawTextWriterIndentType = 
                          asm.GetType("System.Xml.XmlEncodedRawTextWriterIndent");
            Type xmlUtf8RawTextWriterType          = asm.GetType("System.Xml.XmlUtf8RawTextWriter");
            Type xmlUtf8RawTextWriterIndentType    = 
                          asm.GetType("System.Xml.XmlUtf8RawTextWriterIndent");
            FieldInfo bufCharsBytesField;
            FieldInfo bufPosField;
            FieldInfo bufLenField;
            MethodInfo flushBufferMethod;
            if (internalWriterType == xmlEncodedRawTextWriterType)
            {
                this.bufferType = BufferType.Chars;
                bufCharsBytesField = xmlEncodedRawTextWriterType.GetField("bufChars", flags);
                bufPosField        = xmlEncodedRawTextWriterType.GetField("bufPos", flags);
                bufLenField        = xmlEncodedRawTextWriterType.GetField("bufLen", flags);
                flushBufferMethod  = xmlEncodedRawTextWriterType.GetMethod("FlushBuffer", flags);
            }
            else if (internalWriterType == xmlEncodedRawTextWriterIndentType)
            {
                this.bufferType = BufferType.Chars;
                bufCharsBytesField = xmlEncodedRawTextWriterIndentType.GetField("bufChars", flags);
                bufPosField        = xmlEncodedRawTextWriterIndentType.GetField("bufPos", flags);
                bufLenField        = xmlEncodedRawTextWriterIndentType.GetField("bufLen", flags);
                flushBufferMethod  = xmlEncodedRawTextWriterIndentType.GetMethod("FlushBuffer", flags);
            }
            else if (internalWriterType == xmlUtf8RawTextWriterType)
            {
                this.bufferType = BufferType.Bytes;
                bufCharsBytesField = xmlUtf8RawTextWriterType.GetField("bufBytes", flags);
                bufPosField        = xmlUtf8RawTextWriterType.GetField("bufPos", flags);
                bufLenField        = xmlUtf8RawTextWriterType.GetField("bufLen", flags);
                flushBufferMethod  = xmlUtf8RawTextWriterType.GetMethod("FlushBuffer", flags);
            }
            else if (internalWriterType == xmlUtf8RawTextWriterIndentType)
            {
                this.bufferType = BufferType.Bytes;
                bufCharsBytesField = xmlUtf8RawTextWriterIndentType.GetField("bufBytes", flags);
                bufPosField        = xmlUtf8RawTextWriterIndentType.GetField("bufPos", flags);
                bufLenField        = xmlUtf8RawTextWriterIndentType.GetField("bufLen", flags);
                flushBufferMethod  = xmlUtf8RawTextWriterIndentType.GetMethod("FlushBuffer", flags);
            }
            else
            {
                this.bufferType = BufferType.Unknown;
                bufCharsBytesField = null;
                bufPosField        = null;
                bufLenField        = null;
                flushBufferMethod  = null;
                System.Diagnostics.Debug.WriteLine("Unknown internal XmlWriter class");
            }
            switch (this.bufferType)
            {
                case BufferType.Unknown:
                    break;
                case BufferType.Chars:
                    this.get_bufChars = w => (char[])bufCharsBytesField.GetValue(w);
                    this.get_bufPos = w => (int)bufPosField.GetValue(w);
                    this.set_bufPos = (w, i) => bufPosField.SetValue(w, i);
                    this.get_bufLen = w => (int)bufLenField.GetValue(w);
                    this.FlushBuffer = w => flushBufferMethod.Invoke(w, new object[0]);
                    break;
                case BufferType.Bytes:
                    this.get_bufBytes = w => (byte[])bufCharsBytesField.GetValue(w);
                    this.get_bufPos = w => (int)bufPosField.GetValue(w);
                    this.set_bufPos = (w, i) => bufPosField.SetValue(w, i);
                    this.get_bufLen = w => (int)bufLenField.GetValue(w);
                    this.FlushBuffer = w => flushBufferMethod.Invoke(w, new object[0]);
                    break;
            }
        }

        public override void Close()
        {
            this.xw.Close();
        }

        public static new MtxXmlWriter Create(Stream output)
        {
            return new MtxXmlWriter(XmlWriter.Create(output));
        }

        public static new MtxXmlWriter Create(string outputFileName)
        {
            return new MtxXmlWriter(XmlWriter.Create(outputFileName));
        }

        public static new MtxXmlWriter Create(StringBuilder output)
        {
            return new MtxXmlWriter(XmlWriter.Create(output));
        }

        public static new MtxXmlWriter Create(TextWriter output)
        {
            return new MtxXmlWriter(XmlWriter.Create(output));
        }

        public static new MtxXmlWriter Create(XmlWriter output)
        {
            return new MtxXmlWriter(XmlWriter.Create(output));
        }

        public static new MtxXmlWriter Create(string outputFileName, XmlWriterSettings settings)
        {
            return new MtxXmlWriter(XmlWriter.Create(outputFileName, settings));
        }

        public static new MtxXmlWriter Create(Stream output, XmlWriterSettings settings)
        {
            return new MtxXmlWriter(XmlWriter.Create(output, settings));
        }

        public static new MtxXmlWriter Create(StringBuilder output, XmlWriterSettings settings)
        {
            return new MtxXmlWriter(XmlWriter.Create(output, settings));
        }

        public static new MtxXmlWriter Create(TextWriter output, XmlWriterSettings settings)
        {
            return new MtxXmlWriter(XmlWriter.Create(output, settings));
        }

        public static new MtxXmlWriter Create(XmlWriter output, XmlWriterSettings settings)
        {
            return new MtxXmlWriter(XmlWriter.Create(output, settings));
        }

        public override void Flush()
        {
            this.xw.Flush();
        }

        public override string LookupPrefix(string ns)
        {
            return this.xw.LookupPrefix(ns);
        }

        public override void WriteAttributes(XmlReader reader, bool defattr)
        {
            this.xw.WriteAttributes(reader, defattr);
        }

        public new void WriteAttributeString(string localName, string value)
        {
            this.xw.WriteAttributeString(localName, value);
        }

        public new void WriteAttributeString(string localName, string ns, string value)
        {
            this.xw.WriteAttributeString(localName, ns, value);
        }

        public new void WriteAttributeString(string prefix, string localName, string ns, string value)
        {
            this.xw.WriteAttributeString(prefix, localName, ns, value);
        }

        public override void WriteBase64(byte[] buffer, int index, int count)
        {
            this.xw.WriteBase64(buffer, index, count);
        }

        public override void WriteBinHex(byte[] buffer, int index, int count)
        {
            this.xw.WriteBinHex(buffer, index, count);
        }

        public override void WriteCData(string text)
        {
            this.xw.WriteCData(text);
        }

        public override void WriteCharEntity(char ch)
        {
            this.xw.WriteCharEntity(ch);
        }

        public override void WriteChars(char[] buffer, int index, int count)
        {
            this.xw.WriteChars(buffer, index, count);
        }

        public override void WriteComment(string text)
        {
            this.xw.WriteComment(text);
        }

        public override void WriteDocType(string name, string pubid, string sysid, string subset)
        {
            this.xw.WriteDocType(name, pubid, sysid, subset);
        }

        public new void WriteElementString(string localName, string value)
        {
            this.xw.WriteElementString(localName, value);
        }

        public new void WriteElementString(string localName, string ns, string value)
        {
            this.xw.WriteElementString(localName, ns, value);
        }

        public new void WriteElementString(string prefix, string localName, string ns, string value)
        {
            this.xw.WriteElementString(prefix, localName, ns, value);
        }

        public override void WriteEndAttribute()
        {
            this.xw.WriteEndAttribute();
        }

        public override void WriteEndDocument()
        {
            this.xw.WriteEndDocument();
        }

        public override void WriteEndElement()
        {
            // ensure that there is enough space in the buffer for the element end
            if (this.bufferType != BufferType.Unknown)
            {
                int bufPos = get_bufPos(internalWriter);
                int bufLen = get_bufLen(internalWriter);
                if ((bufPos + 3) >= bufLen)
                {
                    this.FlushBuffer(internalWriter);
                }
            }
            this.xw.WriteEndElement();
            // trim the end of element
            if (this.bufferType == BufferType.Chars)
            {
                int bufPos = get_bufPos(internalWriter);
                char[] bufChars = get_bufChars(internalWriter);
                if (bufPos > 3 && bufChars[bufPos - 3] == ' ' &&
                    bufChars[bufPos - 2] == '/' && bufChars[bufPos - 1] == '>')
                {
                    bufChars[bufPos - 3] = '/';
                    bufChars[bufPos - 2] = '>';
                    bufPos--;
                    set_bufPos(internalWriter, bufPos);
                }
            }
            else if (this.bufferType == BufferType.Bytes)
            {
                int bufPos = get_bufPos(internalWriter);
                byte[] bufBytes = get_bufBytes(internalWriter);
                if (bufPos > 3 && bufBytes[bufPos - 3] == ' ' &&
                    bufBytes[bufPos - 2] == '/' && bufBytes[bufPos - 1] == '>')
                {
                    bufBytes[bufPos - 3] = (byte)'/';
                    bufBytes[bufPos - 2] = (byte)'>';
                    bufPos--;
                    set_bufPos(internalWriter, bufPos);
                }
            }
        }

        public override void WriteEntityRef(string name)
        {
            this.xw.WriteEntityRef(name);
        }

        public override void WriteFullEndElement()
        {
            this.xw.WriteFullEndElement();
        }

        public override void WriteName(string name)
        {
            this.xw.WriteName(name);
        }

        public override void WriteNmToken(string name)
        {
            this.xw.WriteNmToken(name);
        }

        public override void WriteNode(XmlReader reader, bool defattr)
        {
            this.xw.WriteNode(reader, defattr);
        }

        public override void WriteNode(XPathNavigator navigator, bool defattr)
        {
            this.xw.WriteNode(navigator, defattr);
        }

        public override void WriteProcessingInstruction(string name, string text)
        {
            this.xw.WriteProcessingInstruction(name, text);
        }

        public override void WriteQualifiedName(string localName, string ns)
        {
            this.xw.WriteQualifiedName(localName, ns);
        }

        public override void WriteRaw(string data)
        {
            this.xw.WriteRaw(data);
        }

        public override void WriteRaw(char[] buffer, int index, int count)
        {
            this.xw.WriteRaw(buffer, index, count);
        }

        public new void WriteStartAttribute(string localName)
        {
            this.xw.WriteStartAttribute(localName);
        }

        public new void WriteStartAttribute(string localName, string ns)
        {
            this.xw.WriteStartAttribute(localName, ns);
        }

        public override void WriteStartAttribute(string prefix, string localName, string ns)
        {
            this.xw.WriteStartAttribute(prefix, localName, ns);
        }

        public override void WriteStartDocument()
        {
            this.xw.WriteStartDocument();
        }

        public override void WriteStartDocument(bool standalone)
        {
            this.xw.WriteStartDocument(standalone);
        }

        public new void WriteStartElement(string localName)
        {
            this.xw.WriteStartElement(localName);
        }

        public new void WriteStartElement(string localName, string ns)
        {
            this.xw.WriteStartElement(localName, ns);
        }

        public override void WriteStartElement(string prefix, string localName, string ns)
        {
            this.xw.WriteStartElement(prefix, localName, ns);
        }

        public override void WriteString(string text)
        {
            this.xw.WriteString(text);
        }

        public override void WriteSurrogateCharEntity(char lowChar, char highChar)
        {
            this.xw.WriteSurrogateCharEntity(lowChar, highChar);
        }

        public override void WriteValue(bool value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(DateTime value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(decimal value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(double value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(int value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(long value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(Object value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(float value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteValue(string value)
        {
            this.xw.WriteValue(value);
        }

        public override void WriteWhitespace(string value)
        {
            this.xw.WriteWhitespace(value);
        }
    }
}

History

  • 2012-11-09: Initial release

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
Germany
I work in the automation industry, on specifications targeting the engineering of automation systems.
