XML Serialization – Tips & Tricks

Ali BaderEddin, 2 Apr 2010, CPOL
This article shows solutions to some of the common problems related to working with XML Serialization.

Let’s say we have an XSD representing a library with a list of books and employees in it.

<?xml version="1.0" encoding="utf-8"?>

<xsd:schema id="Lib"
           targetNamespace="http://schemas.ali.com/lib/"
           elementFormDefault="qualified"
           xmlns="http://schemas.ali.com/lib/"
           xmlns:mstns="http://schemas.ali.com/lib/"
           xmlns:xsd="http://www.w3.org/2001/XMLSchema"
           version="1.0"
           attributeFormDefault="unqualified">

  <xsd:element name="Library" type="LibraryType" />

  <xsd:complexType name="LibraryType">
    <xsd:all>
      <xsd:element name="Books" type="BooksType" minOccurs="0" maxOccurs="1" />
      <xsd:element name="Employees" type="EmployeesType" minOccurs="0" maxOccurs="1" />
    </xsd:all>
  </xsd:complexType>

  <xsd:complexType name="BooksType">
    <xsd:sequence minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="Book" type="BookType" />
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="EmployeesType">
    <xsd:sequence minOccurs="0" maxOccurs="unbounded">
      <xsd:element name="Employee" type="EmployeeType" />
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="BookType">
    <xsd:attribute name="Title" type="xsd:string" use="required" />
    <xsd:attribute name="Author" type="xsd:string" use="optional" />
  </xsd:complexType>

  <xsd:complexType name="EmployeeType">
    <xsd:attribute name="Name" type="xsd:string" use="required" />
  </xsd:complexType>

</xsd:schema>

Here is a sample XML using this XSD:

<?xml version="1.0" encoding="utf-8" ?>

<Library xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Ali"/>
    <Book Title="Book 2" Author="Sara"/>
  </Books>
  <Employees>
    <Employee Name="Ali"/>
    <Employee Name="Sara"/>
  </Employees>
</Library>

Tip 1 – Generating Code from XSD

We’d like to have an object representation of this XML. Thus, we’ll use the XML Schema Definition tool to generate .NET C# code from the XSD, as follows:

  • Start Visual Studio Command Prompt
  • Run this command: xsd "path to XSD file" /language:CS /classes /outputdir:"path to output directory"

As a result, a .cs file will be generated and written to the output directory. Take a look at the file GeneratedLibrary.cs.

Tip 2 – Using List<T> instead of array

The <Books> and <Employees> elements are generated as arrays. I would like to change those arrays to List<T> objects, which makes it easier to add items because there is no need to track the size of the array and grow it. Take a look at the modified class LibraryWithLists.cs. However, this is still not good enough: if I want to create a library with one book, I’ll have to write the following code:

LibraryType library = new LibraryType();
library.Books = new BooksType();
library.Books.Book = new List<BookType>();
BookType newBook = new BookType();
newBook.Title = "Book 1";
newBook.Author = "Author 1";
library.Books.Book.Add(newBook);

But that doesn’t seem neat enough. I want to write something like:

LibraryType library = new LibraryType();
library.Books = new List<BookType>();
BookType newBook = new BookType();
newBook.Title = "Book 1";
newBook.Author = "Author 1";
library.Books.Add(newBook);

Thus, I’ll need to get rid of the class BooksType and change the field declaration from "private BooksType booksField;" to "private List<BookType> booksField;". However, making that change alone is not enough. We also need to tell the serializer that the property is now an array of items (XmlArray and XmlArrayItem attributes) rather than a single element (XmlElement attribute). Take a look at the resulting code in Library.cs.
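For readers without the download at hand, here is a minimal sketch of what such a class could look like. It is not the actual Library.cs: the use of auto-properties and the exact attribute arguments are my assumptions, chosen so that the list items come out as <Book> and <Employee> elements in the schema's namespace.

```csharp
using System.Collections.Generic;
using System.Xml.Serialization;

// Sketch only (assumed layout, not the article's Library.cs).
[XmlType(Namespace = "http://schemas.ali.com/lib/")]
[XmlRoot("Library", Namespace = "http://schemas.ali.com/lib/", IsNullable = false)]
public class LibraryType
{
    // XmlArray names the wrapper element, XmlArrayItem names each entry.
    [XmlArray("Books")]
    [XmlArrayItem("Book")]
    public List<BookType> Books { get; set; }

    [XmlArray("Employees")]
    [XmlArrayItem("Employee")]
    public List<EmployeeType> Employees { get; set; }
}

[XmlType(Namespace = "http://schemas.ali.com/lib/")]
public class BookType
{
    [XmlAttribute]
    public string Title { get; set; }

    [XmlAttribute]
    public string Author { get; set; }
}

[XmlType(Namespace = "http://schemas.ali.com/lib/")]
public class EmployeeType
{
    [XmlAttribute]
    public string Name { get; set; }
}
```

If the element name is left out of XmlArrayItem, the serializer falls back to the type name, so the items would be written as <BookType> rather than <Book>.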

Tip 3 – Serializing Object to XML

Now, we should be able to write code and generate XML from the object created. For example, writing the code below:

//  Create a library
LibraryType library = new LibraryType();

//  Create Books tag
library.Books = new List<BookType>();

//  Add 5 books to the library
for (int i = 1; i <= 5; i++)
{
    BookType book = new BookType();
    book.Title = string.Format("Book {0}", i);
    book.Author = string.Format("Author {0}", i);
    library.Books.Add(book);
}

//  Create employees tag
library.Employees = new List<EmployeeType>();

//  Add 3 employees to the library
for (int i = 1; i <= 3; i++)
{
    EmployeeType employee = new EmployeeType();
    employee.Name = string.Format("Employee {0}", i);
    library.Employees.Add(employee);
}

//  Now that the object is created, serialize it and print out resulting Xml
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
StringWriter sw = new StringWriter();
serializer.Serialize(sw, library);
Console.WriteLine("Object serialized to Xml:\n\n{0}", sw.ToString());

would result in the following XML:

<?xml version="1.0" encoding="utf-16"?>
<Library xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
   xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Author 1" />
    <Book Title="Book 2" Author="Author 2" />
    <Book Title="Book 3" Author="Author 3" />
    <Book Title="Book 4" Author="Author 4" />
    <Book Title="Book 5" Author="Author 5" />
  </Books>
  <Employees>
    <Employee Name="Employee 1" />
    <Employee Name="Employee 2" />
    <Employee Name="Employee 3" />
  </Employees>
</Library>

Tip 4 – Serializing without Namespace

If you look at the serialized XML above, you’ll notice the extra xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance", xmlns:xsd="http://www.w3.org/2001/XMLSchema" and xmlns="http://schemas.ali.com/lib/" declarations. These namespaces are added by default. In my case, I only care about xmlns="http://schemas.ali.com/lib/", which is the target namespace of the XSD for my XML file. To get rid of the other two and keep the one referring to the XSD, we’ll need to pass our custom XmlSerializerNamespaces object to the Serialize() method.

//  Create our own xml serializer namespace
//  Avoiding default xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
//  and xmlns:xsd="http://www.w3.org/2001/XMLSchema"
XmlSerializerNamespaces ns = new XmlSerializerNamespaces(); 

//  Add lib namespace with empty prefix
ns.Add("", "http://schemas.ali.com/lib/"); 

//  Now serialize by passing the XmlSerializerNamespaces object
//  as a parameter to the Serialize() method
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
StringWriter sw = new StringWriter();
serializer.Serialize(sw, library, ns);
Console.WriteLine("Object serialized to Xml:\n\n{0}", sw.ToString());

If you want to get rid of the namespaces altogether, you can simply write ns.Add("", "") instead of ns.Add("", "http://schemas.ali.com/lib/").

Tip 5 – Changing Encoding

If you look at the generated XML above, you’ll notice in the XML declaration that the encoding is set to utf-16. That is because StringWriter builds a .NET string, and .NET strings are UTF-16. To get UTF-8 instead, we’ll need to serialize to a stream through a writer that is configured with that encoding. To do this, replace the code below...

StringWriter sw = new StringWriter();
serializer.Serialize(sw, library, ns);

... with:

//  Serialize the object to Xml with UTF8 encoding
MemoryStream ms = new MemoryStream();
XmlTextWriter xmlTextWriter = new XmlTextWriter(ms, Encoding.UTF8);
xmlTextWriter.Formatting = Formatting.Indented;
serializer.Serialize(xmlTextWriter, library, ns);
ms = (MemoryStream)xmlTextWriter.BaseStream;
string xml = Encoding.UTF8.GetString(ms.ToArray());

To make this more generic, we can create a static method that can do serialization with any encoding.

/// <summary>
/// Serializes the object to Xml based on encoding and name spaces.
/// </summary>
/// <param name="serializer"></param>
/// <param name="encoding"></param>
/// <param name="ns"></param>
/// <param name="objectToSerialize"></param>
/// <returns></returns>
public static string Serialize(XmlSerializer serializer,
                               Encoding encoding,
                               XmlSerializerNamespaces ns,
                               object objectToSerialize)
{
    MemoryStream ms = new MemoryStream();
    XmlTextWriter xmlTextWriter = new XmlTextWriter(ms, encoding);
    xmlTextWriter.Formatting = Formatting.Indented;
    serializer.Serialize(xmlTextWriter, objectToSerialize, ns);
    ms = (MemoryStream)xmlTextWriter.BaseStream;
    return encoding.GetString(ms.ToArray());
}

Now we can write something like the below:

string xml = Serialize(serializer, Encoding.UTF8, ns, library);
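One thing to watch for with the call above: Encoding.UTF8 writes a byte order mark at the start of the stream, and decoding those bytes back with GetString keeps it as an invisible first character in the returned string. Here is a sketch of a variant that avoids this by using a UTF8Encoding created without the byte order mark. The class and method names are made up for this example.

```csharp
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

public static class Utf8Xml
{
    // Hypothetical helper: serializes to UTF-8 text without a byte order mark.
    public static string Serialize(XmlSerializer serializer,
                                   XmlSerializerNamespaces ns,
                                   object objectToSerialize)
    {
        // false = do not emit the UTF-8 byte order mark
        Encoding utf8NoBom = new UTF8Encoding(false);

        using (MemoryStream ms = new MemoryStream())
        {
            XmlTextWriter xmlTextWriter = new XmlTextWriter(ms, utf8NoBom);
            xmlTextWriter.Formatting = Formatting.Indented;
            serializer.Serialize(xmlTextWriter, objectToSerialize, ns);

            // Make sure everything has reached the stream before reading it back
            xmlTextWriter.Flush();
            return utf8NoBom.GetString(ms.ToArray());
        }
    }
}
```

The XML declaration still says utf-8; only the leading marker is gone, which matters if the string is later compared, logged or fed to a parser that does not expect it.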

Tip 6 – Removing XML Declaration

Let’s say you want to completely remove the XML declaration <?xml version="1.0" encoding="utf-8"?> from your serialized XML. You can do so neatly by using an XmlWriterSettings object and setting its OmitXmlDeclaration property to true. Here is how the above Serialize method would change to support this:

/// <summary>
/// Serializes the object to XML based on encoding and name spaces.
/// </summary>
/// <param name="serializer">XmlSerializer object
/// (passing as param to avoid creating one every time)</param>
/// <param name="encoding">The encoding of the serialized Xml</param>
/// <param name="ns">The namespaces to be used by the serializer</param>
/// <param name="omitDeclaration">Whether to omit Xml declaration or not</param>
/// <param name="objectToSerialize">The object we want to serialize to Xml</param>
/// <returns></returns>
public static string Serialize(XmlSerializer serializer,
                               Encoding encoding,
                               XmlSerializerNamespaces ns,
                               bool omitDeclaration,
                               object objectToSerialize)
{
    MemoryStream ms = new MemoryStream();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.OmitXmlDeclaration = omitDeclaration;
    settings.Encoding = encoding;
    XmlWriter writer = XmlWriter.Create(ms, settings);
    serializer.Serialize(writer, objectToSerialize, ns);
    return encoding.GetString(ms.ToArray());
}

Thus, we can do something like the below to omit the XML declaration:

string xml = Serialize(serializer, Encoding.Default, ns, true, library);

Tip 7 – Deserializing XML to Object

Deserialization is really cool: it loads XML into an object I can understand and easily interact with. For example, you’d write the following code to read the contents of the XML file “Sample1.xml” and convert it to a LibraryType object.

//  Read the first XML file
TextReader tr = new StreamReader("Sample1.xml");

//  Deserialize the XML file into a LibraryType object
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
LibraryType lib1 = (LibraryType)serializer.Deserialize(tr);

If you look at the object lib1 using the debugger, you’ll see that it’s properly loaded:

[Screenshot: LibraryType object in the debugger]

We can add a new book to this object and serialize it back to XML using the following code:

if (lib1.Books == null)
{
    lib1.Books = new List<BookType>();
}

BookType newBook = new BookType();
newBook.Title = "Book 3";
lib1.Books.Add(newBook);

//  Serialize back the library type object and output Xml
StringWriter sw = new StringWriter();
serializer.Serialize(sw, lib1);
Console.WriteLine("{0}:\n\n{1}", "Sample1.xml", sw.ToString());

The resulting XML would look something like:

<?xml version="1.0" encoding="utf-16"?>
<Library xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Ali" />
    <Book Title="Book 2" Author="Sara" />
    <Book Title="Book 3" />
  </Books>
  <Employees>
    <Employee Name="Ali" />
    <Employee Name="Sara" />
  </Employees>
</Library>
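The deserialization code earlier in this tip never closes the StreamReader. Below is a small sketch of a wrapper that does, so the file is released as soon as the object is loaded. The class and method names are invented for this example, and the serializer is passed in to follow the same convention as the Serialize method above.

```csharp
using System.IO;
using System.Xml.Serialization;

public static class XmlFiles
{
    // Hypothetical helper: reads the XML file at 'path' and deserializes it to T.
    public static T Load<T>(XmlSerializer serializer, string path)
    {
        // 'using' disposes the reader (and closes the file) even if Deserialize throws
        using (TextReader reader = new StreamReader(path))
        {
            return (T)serializer.Deserialize(reader);
        }
    }
}
```

With this in place, loading the sample would read something like: LibraryType lib1 = XmlFiles.Load<LibraryType>(serializer, "Sample1.xml");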

Tip 8 – Resolving Empty Lists Issue

Now, let’s try the same deserialization code above, but on XML file “Sample2.xml” which has no <Employees> tag. When we deserialize XML into an object then serialize back into XML, we get the following:

<?xml version="1.0" encoding="utf-16"?>
<Library xmlns="http://schemas.ali.com/lib/">
  <Books>
    <Book Title="Book 1" Author="Ali" />
    <Book Title="Book 2" Author="Sara" />
  </Books>
  <Employees />
</Library>

Notice the extra <Employees/> which we really didn’t intend to have in our XML. The likely reason is that XmlSerializer initializes all List<T> members of the object on deserialization (I verified this by watching the object in the debugger after deserialization), so we get the empty <Employees/> tag when we serialize back. There are two workarounds for this issue. The first approach is definitely better, but you might find the other approach helpful based on your needs.

Approach 1

Don’t bother with the empty lists. Just clean up their corresponding empty XML tags on serialization. The method below takes a string representation of the XML and removes all empty tags.

/// <summary>
/// Deletes empty Xml tags from the passed xml
/// </summary>
/// <param name="xml"></param>
/// <returns></returns>
public static string CleanEmptyTags(String xml)
{
    //  Matches self-closing tags without attributes, e.g. <Employees />,
    //  together with any whitespace in front of them
    Regex regex = new Regex(@"\s*<\w+\s*/>");
    return regex.Replace(xml, string.Empty);
}
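To make it clearer what this pattern does and does not touch, here is a standalone sketch with the same idea. The class name is made up for the example. It removes self-closing tags that have no attributes and leaves tags with attributes alone. Keep in mind that it cannot tell an unwanted empty list from an element that is meant to be empty, and that \w does not match a colon, so prefixed names such as <lib:Employees /> are not matched.

```csharp
using System.Text.RegularExpressions;

public static class EmptyTagCleaner
{
    // Optional leading whitespace, then a self-closing tag made of
    // a plain name and nothing else (no attributes, no prefix).
    private static readonly Regex EmptyTag = new Regex(@"\s*<\w+\s*/>");

    public static string CleanEmptyTags(string xml)
    {
        return EmptyTag.Replace(xml, string.Empty);
    }
}
```

For example, given a document that contains both <Book Title="Book 1" /> and <Employees />, only the <Employees /> line is removed.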

With the method above in mind, our Serialize method would change as follows:

public static string Serialize(XmlSerializer serializer,
                               Encoding encoding,
                               XmlSerializerNamespaces ns,
                               bool omitDeclaration,
                               object objectToSerialize)
{
    MemoryStream ms = new MemoryStream();
    XmlWriterSettings settings = new XmlWriterSettings();
    settings.Indent = true;
    settings.OmitXmlDeclaration = omitDeclaration;
    settings.Encoding = encoding;
    XmlWriter writer = XmlWriter.Create(ms, settings);
    serializer.Serialize(writer, objectToSerialize, ns);
    string xml = encoding.GetString(ms.ToArray());
    xml = CleanEmptyTags(xml);
    return xml;
}

Approach 2

Deserialize as usual, then set any instantiated but empty lists (Count == 0) to null. Here are the static method and its helper method that do the job.

/// <summary>
/// Deserializes the passed Xml then deallocates any instantiated and empty lists.
/// </summary>
/// <param name="serializer"></param>
/// <param name="tr"></param>
/// <param name="objectNamespace"></param>
/// <returns></returns>
public static object Deserialize(XmlSerializer serializer,
                                 TextReader tr, string objectNamespace)
{
    //  Deserialize Xml into object
    object objectToReturn = serializer.Deserialize(tr);

    //  Clean up empty lists
    CleanUpEmptyLists(objectToReturn, objectNamespace);

    return objectToReturn;
}

/// <summary>
/// Sets any empty lists in the passed object to null.
/// If the passed object itself is a list,
/// the method returns true if it's empty and false otherwise.
/// </summary>
/// <param name="o"></param>
/// <param name="objectNamespace"></param>
/// <returns></returns>
public static bool CleanUpEmptyLists(object o, string objectNamespace)
{
    //  Skip if the object is already null
    if (o == null)
    {
        return false;
    }

    //  Get the types of the object
    Type type = o.GetType();

    //  If this is an empty list, set it to null
    if (o is IList)
    {
        IList list = (IList)o;

        if (list.Count == 0)
        {
            return true;
        }
        else
        {
            foreach (object obj in list)
            {
                CleanUpEmptyLists(obj, objectNamespace);
            }
        }

        return false;
    }
    //  Ignore any objects that aren't in our namespace for perf reasons
    //  and to avoid getting errors on trying to get into every little detail
    else if (type.Namespace != objectNamespace)
    {
        return false;
    }

    //  Loop over all properties and handle them
    foreach (PropertyInfo property in type.GetProperties())
    {
        //  Get the property value and clean up any empty lists it contains
        object propertyValue = property.GetValue(o, null);
        if (CleanUpEmptyLists(propertyValue, objectNamespace))
        {
            property.SetValue(o, null, null);
        }
    }

    return false;
}

Using the above static method, you can now deserialize as follows:

//  Deserialize the Xml file into a LibraryType object
XmlSerializer serializer = new XmlSerializer(typeof(LibraryType));
LibraryType lib =
    (LibraryType)Deserialize(serializer, tr, typeof(LibraryType).Namespace);

If you know of a better solution, please let me know.

Source

Download full source code here. You can use it to try out the above tips one by one.


License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Ali BaderEddin
Software Developer Microsoft
United States
http://mycodelog.com/about/

Last Updated 3 Apr 2010
Article Copyright 2010 by Ali BaderEddin