Introduction
This article describes a problem I came across with XmlSerializer, and the solution I eventually found.
The ultimate solution is actually relatively simple but it took me nearly a full Thursday to figure out, so the article is more about how I got to the solution and the dead-ends on the way there.
Background
I wanted to be able to save a strongly-typed collection class to an XML file. Because of the way XmlSerializer works (briefly described later) and the fact that one of the properties being serialized was typed as a base class, I couldn't do this out of the box: what I wanted to store was not the base class itself but instances of its derived classes.
I wasn't the first to find this limitation (it has come up a number of times on various forums), and most people seemed to have worked around this by writing custom code to read and write an XML file, but I wanted a simpler solution.
The Problem
First, I'll show you the three original classes I was working with:-
ViewInfoCollection - a collection of ViewInfo objects (no surprises there!). It is derived from ViewInfoCollectionBase (an automatically generated collection) and adds methods to save itself to and load itself from a file. (I also have a static property that holds the XmlSerializer object so that it is only created on first use. Now that I know more about how XmlSerializer works, I believe .NET caches the generated serializer internally when the simple XmlSerializer(Type) constructor is used, so the property may be redundant here.)
using System;
using System.IO;
using System.Xml.Serialization;

namespace Dashboard {
    [Serializable]
    public class ViewInfoCollection: ViewInfoCollectionBase {
        #region Constructors
        public ViewInfoCollection() {}
        public ViewInfoCollection(int capacity): base(capacity) {}
        public ViewInfoCollection(ViewInfoCollectionBase c): base(c) {}
        public ViewInfoCollection(ViewInfo[] a): base(a) {}
        #endregion Constructors

        #region Static
        // Created on first use and then reused for every save/load.
        private static XmlSerializer Serializer {
            get {
                if (serializer == null) {
                    serializer = new XmlSerializer(typeof(ViewInfoCollection));
                }
                return serializer;
            }
        } static XmlSerializer serializer;

        public static ViewInfoCollection FromXmlFile(string filename) {
            ViewInfoCollection @new = new ViewInfoCollection();
            @new.ReadFromXml(filename);
            return @new;
        }
        #endregion Static

        #region Methods
        public void WriteToXml(string filename) {
            using (StreamWriter writer = new StreamWriter(filename)) {
                Serializer.Serialize(writer, this);
            }
        }

        public void ReadFromXml(string filename) { ReadFromXml(filename, false); }

        // When preserveItems is false, the existing items are cleared first.
        public void ReadFromXml(string filename, bool preserveItems) {
            if (preserveItems == false) Clear();
            using (StreamReader reader = new StreamReader(filename)) {
                AddRange((ViewInfoCollection) Serializer.Deserialize(reader));
            }
        }
        #endregion Methods
    }
}
ViewInfo - this is the class that is contained in the collection. Nothing special here, but watch out for the last property, Parameters: although it looks innocent enough, it is the cause of all my problems.
using System;
using System.Xml.Serialization;

namespace Dashboard {
    [Serializable]
    public class ViewInfo {
        public string Name {
            get { return name; }
            set { name = value; }
        } string name;

        public string Category {
            get { return category; }
            set { category = value; }
        } string category;

        public string ServiceProvider {
            get { return serviceProvider; }
            set { serviceProvider = value; }
        } string serviceProvider;

        public bool IsWellKnown {
            get { return isWellKnown; }
            set { isWellKnown = value; }
        } bool isWellKnown;

        public string FormType {
            get { return formType; }
            set { formType = value; }
        } string formType;

        public string[] AlternativeFormTypes {
            get { return alternativeFormTypes; }
            set { alternativeFormTypes = value; }
        } string[] alternativeFormTypes;

        // Falls back to Name when no explicit ID has been set.
        public object UniqueID {
            get {
                if (uniqueID == null)
                    return Name;
                else {
                    return uniqueID;
                }
            }
            set { uniqueID = value; }
        } object uniqueID;

        // Declared as the base type; holds instances of derived classes.
        public DashboardParams Parameters {
            get { return parameters; }
            set { parameters = value; }
        } DashboardParams parameters;

        public override string ToString() {
            return string.Format("Name={0}, IsWellKnown={1}, " +
                "UniqueID={2}, FormType={3}, AlternativeFormTypes={4}",
                Name, IsWellKnown, UniqueID, FormType,
                AlternativeFormTypes == null ? "(none)" : string.Join("; ",
                AlternativeFormTypes));
        }
    }
}
DashboardParams - this class happens to be abstract, although the problems I had would be the same if it wasn't. It is a base class intended to be derived from by any number of classes that hold parameter information. It is marked as Serializable (as are the other two classes) because it will be passed to other processes on other machines via remoting, but this is not relevant for this article. It is the type of the Parameters property of ViewInfo and provides a base for concrete classes such as DataWatcherParams (not listed here), which adds a few more properties and is mentioned later in the article.
using System;
using System.Xml.Serialization;

namespace Dashboard {
    [Serializable]
    public abstract class DashboardParams: IDashboardParams {
        #region Properties
        [XmlIgnore]
        public ClientToken Token {
            get { return token; }
        } ClientToken token = ClientToken.Instance;

        [XmlIgnore]
        public DashboardMessageHandler MessageHandler {
            get { return messageHandler; }
            set { messageHandler = value; }
        } DashboardMessageHandler messageHandler;
        #endregion Properties

        public string DisplayName {
            get { return displayName; }
            set { displayName = value; }
        } string displayName;

        public virtual bool IsValid {
            get { return messageHandler != null; }
        }

        public virtual string UniqueID {
            get {
                return Guid.NewGuid().ToString();
            }
        }
    }
}
I tried saving a test collection containing two ViewInfo objects, one with an instance of DataWatcherParams as its Parameters property and the other with a null Parameters property, and got the following exception:-
Unhandled Exception: System.InvalidOperationException:
There was an error generating the XML document. --->
System.InvalidOperationException:
The type Dashboard.DataWatchServices.DataWatcherParams was not expected.
Use the XmlInclude or SoapInclude attribute to specify types
that are not known statically.
The Solution
Not knowing much about XmlSerializer apart from the basics, I Googled. And then I Googled some more. And I came up with the following observations.
.NET has two independent serialization paradigms designed for completely different serialization scenarios:-
- The first uses BinaryFormatter or SoapFormatter (or any class implementing IFormatter) and is intended to put the contents of any serializable class into a stream which can be saved or transported and then used to recreate the original object in its entirety. It uses attributes such as Serializable to control serialization and, most importantly, serializes the private fields within the class.
- The second, and the one we're interested in, XmlSerializer, is intended simply to map fields and properties of a class to an XML document and vice versa. It is completely independent of the 'other' serializer and has its own attributes for control. The most important difference is that it only looks at public fields and public read/write properties. It also has special support for types that implement ICollection and serializes the contents of the collection as nested elements.
XmlSerializer works by generating an on-the-fly assembly (with a random name) that knows how to serialize/deserialize the type passed to it in its constructor. During the generation of this assembly, it looks for public fields and public read/write properties within the type, checks that other types involved have a constructor that takes no parameters (otherwise, it wouldn't be able to recreate the object during deserialization!), and builds a list of types that it needs to know about to perform the serialization/deserialization.
And herein lies the problem. If during serialization, it finds a class that is not part of the type list it built whilst generating the serializer, it throws the exception listed above.
In my scenario, the generated serializer knew about the DashboardParams type of the Parameters property, and knew that it could serialize/deserialize it, but when it actually came to serialize my test collection, the DataWatcherParams type was not in the list and so the exception was thrown.
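The failure described above can be reproduced in a few lines. This is only a minimal sketch: BaseParams, WatcherParams and Holder are hypothetical stand-ins for DashboardParams, DataWatcherParams and ViewInfo, not the real classes.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for DashboardParams, DataWatcherParams and ViewInfo.
public abstract class BaseParams {
    public string DisplayName;
}

public class WatcherParams : BaseParams {
    public int IntervalSeconds;
}

public class Holder {
    public BaseParams Parameters;   // declared as the base type only
}

public static class UnexpectedTypeDemo {
    // Returns true if serializing a derived instance fails with
    // InvalidOperationException, as described in the text.
    public static bool SerializationFails() {
        Holder holder = new Holder();
        holder.Parameters = new WatcherParams();

        // The serializer is built from Holder alone, so its type list
        // contains BaseParams but not WatcherParams.
        XmlSerializer serializer = new XmlSerializer(typeof(Holder));
        try {
            using (StringWriter writer = new StringWriter()) {
                serializer.Serialize(writer, holder);
            }
            return false;
        }
        catch (InvalidOperationException) {
            return true;
        }
    }
}
```

Calling UnexpectedTypeDemo.SerializationFails() should report the failure; constructing the serializer itself succeeds, and the problem only surfaces when a derived instance is actually written.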
Using the XmlInclude attribute (as recommended in the exception description) does work (basically, it manually adds a Type to the list discovered during the generation phase), but it means that I need to know at compile-time all of the classes derived from DashboardParams. Not a viable solution for my scenario.
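For comparison, here is roughly what the XmlInclude route looks like. It is a sketch with hypothetical class names (BaseParams, WatcherParams, Holder), and it shows the drawback directly: the base class has to name every derived class.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for DashboardParams and DataWatcherParams.
// The attribute registers the derived type with the serializer, so the
// base class must list each derived class at compile time.
[XmlInclude(typeof(WatcherParams))]
public abstract class BaseParams {
    public string DisplayName;
}

public class WatcherParams : BaseParams {
    public int IntervalSeconds;
}

public class Holder {
    public BaseParams Parameters;
}

public static class XmlIncludeDemo {
    // Serializes a Holder whose Parameters field holds a derived
    // instance and returns the resulting XML text.
    public static string Serialize() {
        WatcherParams p = new WatcherParams();
        p.DisplayName = "watch";
        p.IntervalSeconds = 30;

        Holder holder = new Holder();
        holder.Parameters = p;

        XmlSerializer serializer = new XmlSerializer(typeof(Holder));
        using (StringWriter writer = new StringWriter()) {
            serializer.Serialize(writer, holder);
            return writer.ToString();
        }
    }
}
```

With the attribute in place the derived members (here IntervalSeconds) are written out, and the serializer marks the element with the concrete type so that it can be recreated on load.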
The next step to investigate was the options available in the constructor. It is possible to specify a list of types there, but although that eliminates the problem of knowing the derived types at compile-time, I would need to maintain a list and get any new class to 'register' with that list. Definitely over the top.
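A sketch of the constructor option, again with hypothetical class names, using the XmlSerializer(Type, Type[]) overload. The array passed in is exactly the list that would have to be maintained somewhere at runtime. As far as I know, serializers built with this overload are not cached by the framework, so the sketch keeps one in a static field.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical stand-ins for DashboardParams, DataWatcherParams and ViewInfo.
public abstract class BaseParams { public string DisplayName; }
public class WatcherParams : BaseParams { public int IntervalSeconds; }
public class Holder { public BaseParams Parameters; }

public static class ExtraTypesDemo {
    // The extra-types array plays the role of the 'registration' list:
    // every new derived class would have to be added to it.
    static readonly XmlSerializer serializer =
        new XmlSerializer(typeof(Holder), new Type[] { typeof(WatcherParams) });

    // Serializes to a string and deserializes it again.
    public static Holder RoundTrip(Holder input) {
        using (StringWriter writer = new StringWriter()) {
            serializer.Serialize(writer, input);
            using (StringReader reader = new StringReader(writer.ToString())) {
                return (Holder) serializer.Deserialize(reader);
            }
        }
    }
}
```

A Holder whose Parameters field holds a WatcherParams should come back from RoundTrip with a WatcherParams again, provided the type is in the list.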
I then looked at the other XML attributes available for controlling XML serialization. A promising option was the Type property on the XmlElement attribute that allows a derived type to be specified. Exactly what I wanted to do, but again, it relies on knowing the possible types at compile-time or maintaining a list to use at runtime.
Then I discovered IXmlSerializable!!!
This allows full control of the XML serialization process and allows a type to put any information it likes into the XML document, an example being DataSet.
Strictly speaking, it is for internal use only in v1.1 but is documented in v2.0. Anyway, if it's good enough for a DataSet, it's good enough for my ViewInfo class!
The IXmlSerializable interface has three methods:
public XmlSchema GetSchema();
public void ReadXml(XmlReader reader);
public void WriteXml(XmlWriter writer);
Since we don't need a schema, I guessed (correctly as it turned out) that returning null from GetSchema() would be acceptable. That just left ReadXml and WriteXml needing to be written. The methods supply an XmlReader object and an XmlWriter object respectively, and all I needed to do was insert/extract the elements for my ViewInfo class.
Then I realized that there is actually a lot of complicated reflection work going on inside the generated serializer. Although I could serialize simple properties such as string Name and bool IsWellKnown by using their ToString() methods and then use Parse() to recreate the value, object UniqueID would be somewhat more complicated, since I would have to interrogate its type using reflection and then add an attribute. Parsing on deserialization would then get very complicated! Another problem was that if I (or another developer) later decided to derive from ViewInfo, I would also need to re-implement these methods for any new properties.
All I really wanted to do was get control of serializing Parameters and let normal serialization take care of the rest, but I couldn't do this simply - IXmlSerializable is an all or nothing solution.
Then I remembered that during my Googling session, I saw a solution to a custom XML serialization problem that, although not applicable to my problem, gave me another idea. (I can't find the original source now, but thanks to that guy anyway!) His problem was that Color didn't serialize correctly, and he got around the problem by putting an XmlIgnore attribute on his Color property and creating a new property called XmlColor which contained a new class called SerializeColor. So XmlSerializer, instead of serializing Color, serialized an instance of SerializeColor, which was a class over which he had complete control.
I came up with this:-
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

namespace Dashboard {
    public class DashboardParamsSerializer: IXmlSerializable {
        #region Constructors
        public DashboardParamsSerializer() {}

        public DashboardParamsSerializer(DashboardParams parameters) {
            this.parameters = parameters;
        }
        #endregion Constructors

        #region Properties
        public DashboardParams Parameters {
            get { return parameters; }
        } DashboardParams parameters;
        #endregion Properties

        #region IXmlSerializable Implementation
        public XmlSchema GetSchema() {
            return null;
        }

        public void ReadXml(XmlReader reader) {
            // Recover the concrete type from the attribute written by WriteXml.
            Type type = Type.GetType(reader.GetAttribute("type"));
            reader.ReadStartElement();
            this.parameters = (DashboardParams) new
                XmlSerializer(type).Deserialize(reader);
            reader.ReadEndElement();
        }

        public void WriteXml(XmlWriter writer) {
            // Store the concrete type, then let a serializer for that type do the work.
            writer.WriteAttributeString("type", parameters.GetType().ToString());
            new XmlSerializer(parameters.GetType()).Serialize(writer, parameters);
        }
        #endregion IXmlSerializable Implementation
    }
}
and I modified my ViewInfo class as follows:-
[XmlIgnore]
public DashboardParams Parameters {
    get { return parameters; }
    set { parameters = value; }
} DashboardParams parameters;

[XmlElement("Parameters")]
public DashboardParamsSerializer XmlParameters {
    get {
        if (Parameters == null)
            return null;
        else {
            return new DashboardParamsSerializer(Parameters);
        }
    }
    set {
        parameters = value.Parameters;
    }
}
(The XmlElement("Parameters") is just sugar so that the correct element name is written into the XML file rather than "XmlParameters".)
What is happening here is that the Parameters property is now ignored by the serializer but XmlParameters is serialized instead. The serializer comes along, sees that XmlParameters is a read/write property of type DashboardParamsSerializer, and asks for its value. The XmlParameters property getter creates a temporary DashboardParamsSerializer object, passing the original Parameters value to the constructor. (Null values are ignored by XmlSerializer anyway, so there is no need for special handling code.)
Because DashboardParamsSerializer implements IXmlSerializable, the serializer calls its WriteXml method, and this gives us the opportunity to add an attribute to the current element and store the actual Type in it. It then passes this type to a new XmlSerializer object, which can then serialize the object as it would normally, straight into the XmlWriter object - no need for any reflection on my part.
Deserialization works in reverse. The serializer calls DashboardParamsSerializer.ReadXml with the XmlReader located at the correct place in the XML file. The method then reads the attribute it placed there originally and creates a new XmlSerializer to create the new object. XmlSerializer then passes the new DashboardParamsSerializer object to the XmlParameters property setter, and the real object is stored in the private parameters field.
BINGO! It worked a treat!
The only fly in the ointment was that I now had an extra public property in ViewInfo that I didn't really want (it had to be public, otherwise XmlSerializer would ignore it).
Then I had an epiphany. Remember that I mentioned that the XmlElement attribute had a Type property to specify a derived type? Well, I wondered whether it really needed to be a derived type or whether XmlSerializer was just casting to it.
I added this section to DashboardParamsSerializer:-
#region Static
// Implicit conversions in both directions between the wrapper and the wrapped type.
public static implicit operator DashboardParamsSerializer(DashboardParams p) {
    return p == null ? null : new DashboardParamsSerializer(p);
}

public static implicit operator DashboardParams(DashboardParamsSerializer p) {
    return p == null ? null : p.Parameters;
}
#endregion Static
changed the attribute on the Parameters property to:-
[XmlElement(Type=typeof(DashboardParamsSerializer))]
and deleted the XmlParameters property completely.
If XmlSerializer was simply casting to the new type rather than explicitly checking that it was a derived type, then the implicit conversion operators would silently convert between DashboardParamsSerializer and DashboardParams (and any derivation of it). It worked!!
So, I now had a single extra class and a single attribute that would allow me to serialize any class derived from DashboardParams. This was the solution for my needs, but remember the constructor options I mentioned earlier? XmlSerializer also has constructor overloads that allow attribute overrides to be specified. Microsoft did this with a view to being able to put XML attributes on classes for which the source code is not available.
So, I tested my solution to its logical conclusion and removed all customization from ViewInfo - leaving it exactly as it was originally. Instead, I made the attributes 'virtual' attributes, and told the XmlSerializer to use them as though they were on the target object.
I changed the Serializer property in ViewInfoCollection to do this as follows:-
private static XmlSerializer Serializer {
    get {
        if (serializer == null) {
            XmlAttributeOverrides attributeOverrides =
                new XmlAttributeOverrides();
            XmlAttributes attributes = new XmlAttributes();
            XmlElementAttribute attribute = new
                XmlElementAttribute(typeof(DashboardParamsSerializer));
            attributes.XmlElements.Add(attribute);
            // Apply the attribute to ViewInfo.Parameters without touching ViewInfo.
            attributeOverrides.Add(typeof(ViewInfo),
                "Parameters", attributes);
            serializer = new XmlSerializer(typeof(ViewInfoCollection),
                attributeOverrides);
        }
        return serializer;
    }
} static XmlSerializer serializer;
Again, this worked as expected!
What we are doing here is telling the serializer that if it should come across a property or field called "Parameters" in a ViewInfo type, then it should pretend it had an XmlElement attribute on it, created with its Type property set to typeof(DashboardParamsSerializer).
Summary
So, now we have a way of being able to XmlSerialize an object that contains a read/write property (or public field) which holds a derived class, and we only know the base type at compile time.
Here is the final DashboardParamsSerializer class:-
using System;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

namespace Dashboard {
    public class DashboardParamsSerializer: IXmlSerializable {
        #region Static
        // Implicit conversions in both directions between the wrapper and the wrapped type.
        public static implicit operator
            DashboardParamsSerializer(DashboardParams p) {
            return p == null ? null : new DashboardParamsSerializer(p);
        }

        public static implicit operator
            DashboardParams(DashboardParamsSerializer p) {
            return p == null ? null : p.Parameters;
        }
        #endregion Static

        #region Constructors
        public DashboardParamsSerializer() {}

        public DashboardParamsSerializer(DashboardParams parameters) {
            this.parameters = parameters;
        }
        #endregion Constructors

        #region Properties
        public DashboardParams Parameters {
            get { return parameters; }
        } DashboardParams parameters;
        #endregion Properties

        #region IXmlSerializable Implementation
        public XmlSchema GetSchema() {
            return null;
        }

        public void ReadXml(XmlReader reader) {
            // Recover the concrete type from the attribute written by WriteXml.
            Type type = Type.GetType(reader.GetAttribute("type"));
            reader.ReadStartElement();
            this.parameters = (DashboardParams) new
                XmlSerializer(type).Deserialize(reader);
            reader.ReadEndElement();
        }

        public void WriteXml(XmlWriter writer) {
            // Store the concrete type, then let a serializer for that type do the work.
            writer.WriteAttributeString("type", parameters.GetType().ToString());
            new XmlSerializer(parameters.GetType()).Serialize(writer, parameters);
        }
        #endregion IXmlSerializable Implementation
    }
}
To reuse this for your own base class:
- Copy the above code into a new class.
- Search and replace in the new class: replace "DashboardParams" with "<newClassName>" and replace "DashboardParamsSerializer" with "<newClassName>Serializer".
- Do one of the following to supply an attribute so that XmlSerializer will know to use your custom serializer class:-
  - Either add an attribute directly onto the property that holds the base type, if you have access to the source code:-
    [XmlElement(Type=typeof(<newClassName>Serializer))]
  - or pass a 'virtual attribute' to the constructor of XmlSerializer if you don't have the source code:-
    XmlAttributeOverrides attributeOverrides = new XmlAttributeOverrides();
    XmlAttributes attributes = new XmlAttributes();
    attributes.XmlElements.Add(
        new XmlElementAttribute(typeof(<newClassName>Serializer)));
    attributeOverrides.Add(typeof(<typeWithAPropertyHoldingABaseType>),
        "<nameOfPropertyHoldingABaseType>", attributes);
    serializer = new XmlSerializer(
        typeof(<anyTypeThatIndirectlyReferencesTypeWithAPropertyHoldingABaseType>),
        attributeOverrides);
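One thing to watch when reusing the class: Type.GetType, when given only a namespace-qualified name, looks in the calling assembly and the core library, so a derived class that lives in another assembly may come back as null on load. The sketch below is a variant under that assumption, not the code I actually used. It stores Type.AssemblyQualifiedName instead and fails with a clear message when the type cannot be resolved. BaseParams, WatcherParams and Holder are hypothetical stand-ins, and the sketch uses the extra-property form of the solution to keep it short.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;

// Hypothetical stand-ins for DashboardParams and a derived class.
public abstract class BaseParams { public string DisplayName; }
public class WatcherParams : BaseParams { public int IntervalSeconds; }

public class BaseParamsSerializer : IXmlSerializable {
    BaseParams parameters;

    public BaseParamsSerializer() {}
    public BaseParamsSerializer(BaseParams parameters) { this.parameters = parameters; }
    public BaseParams Parameters { get { return parameters; } }

    public XmlSchema GetSchema() { return null; }

    public void ReadXml(XmlReader reader) {
        string typeName = reader.GetAttribute("type");
        Type type = Type.GetType(typeName);
        if (type == null)
            throw new InvalidOperationException(
                "Cannot resolve type '" + typeName + "'.");
        reader.ReadStartElement();
        parameters = (BaseParams) new XmlSerializer(type).Deserialize(reader);
        reader.ReadEndElement();
    }

    public void WriteXml(XmlWriter writer) {
        // The assembly-qualified name also identifies the defining assembly.
        writer.WriteAttributeString("type",
            parameters.GetType().AssemblyQualifiedName);
        new XmlSerializer(parameters.GetType()).Serialize(writer, parameters);
    }
}

// Hypothetical stand-in for ViewInfo.
public class Holder {
    [XmlIgnore]
    public BaseParams Parameters;

    [XmlElement("Parameters")]
    public BaseParamsSerializer XmlParameters {
        get { return Parameters == null ? null : new BaseParamsSerializer(Parameters); }
        set { Parameters = value == null ? null : value.Parameters; }
    }
}

public static class RoundTripDemo {
    // Serializes a Holder to a string and reads it back.
    public static Holder RoundTrip(Holder input) {
        XmlSerializer serializer = new XmlSerializer(typeof(Holder));
        using (StringWriter writer = new StringWriter()) {
            serializer.Serialize(writer, input);
            using (StringReader reader = new StringReader(writer.ToString())) {
                return (Holder) serializer.Deserialize(reader);
            }
        }
    }
}
```

The trade-off is that an assembly-qualified name includes the assembly version, so files written by one build may need care when read by a later one if the assemblies are strongly named.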
Addendum
Another buglet that I spotted during later testing was that the UniqueID property returns the value of the Name property if no specific UniqueID had been set. Standard stuff and not a problem for 'normal' serialization, since that stores the private field, but it is a problem for XML serialization, since it will serialize whatever UniqueID returns and not its underlying value. Luckily, XmlSerializer honours the same conventions as the component model: it checks any DefaultValue attribute and calls a public ShouldSerialize<target> method, if one exists, for a final determination on whether to serialize or not. The following line fixes the problem:-
public bool ShouldSerializeUniqueID() { return uniqueID != null; }
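To see the effect in isolation, here is a small sketch with a hypothetical Item class that mimics the UniqueID fallback in ViewInfo. The property is only written when an explicit value has been set.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical cut-down version of ViewInfo.
public class Item {
    public string Name;

    // Falls back to Name when no explicit ID has been set.
    public object UniqueID {
        get { return uniqueID == null ? Name : uniqueID; }
        set { uniqueID = value; }
    } object uniqueID;

    // Consulted by XmlSerializer before it writes UniqueID.
    public bool ShouldSerializeUniqueID() { return uniqueID != null; }
}

public static class ShouldSerializeDemo {
    // Returns the XML text produced for the given item.
    public static string Serialize(Item item) {
        XmlSerializer serializer = new XmlSerializer(typeof(Item));
        using (StringWriter writer = new StringWriter()) {
            serializer.Serialize(writer, item);
            return writer.ToString();
        }
    }
}
```

An Item with only Name set should produce XML with no UniqueID element, while one with an explicit UniqueID (an int, say) should include it.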