Introduction
In moving from Microsoft Foundation Classes (MFC) to .NET, Microsoft greatly simplified the process of serialization, that is, storing object properties to a file or other stream destination. In .NET, making your objects serializable requires little more than adding the [Serializable] attribute to the class for binary files, and XML serialization is just as easy. In MFC, making your objects serializable required actually implementing the function that writes values to the stream, and XML was not supported at all. Versioning of objects is also simplified in .NET. Versioning is done automatically, and works as long as you don't change property names or remove properties. MFC had a built-in versioning mechanism, but it was not trustworthy, and experienced MFC programmers learned to implement their own versioning mechanism.
If .NET Serialization is so easy, then why go through the trouble to implement MFC-style serialization?
- One reason is speed. Serialization in .NET relies on reflection, which reduces performance. This project includes a test application which shows that, for large numbers of objects in a complex hierarchy, the MFC-style serialization implemented here can be as much as ten times faster than standard .NET serialization.
- The second reason is control. MFC-style serialization gives you much greater control over the serialization process. (Actually, .NET also gives you full control over the process, but in a different way).
- The third reason, and a much more compelling one, is that this project forms the basis of Part 2, where we use .NET to read objects serialized using MFC, including proper conversion of CString, COleDateTime, and COleCurrency to .NET types. This is important functionality for anyone planning to migrate an MFC application to .NET, and perhaps something Microsoft should have provided from the days of .NET 1.1.
MFC serialization itself is a fairly complex topic - at least if you want to make it work reliably in a production environment. I imagine that the only programmers interested in this method are those already familiar with MFC serialization. Therefore, I will assume such knowledge as prerequisite.
The Archive Class
The heart of MFC serialization is the CArchive class. Any object that is to be serialized has some function that looks like:
void CSomeObject::Serialize(CArchive& ar)
This function uses the CArchive parameter to actually write member values to the stream.
For the .NET implementation, the logical place to start is to create a similar class called Archive. This class accepts a System.IO.Stream object as a parameter to the constructor. Then, using Read and Write functions for the different types, it reads each datum from the stream or writes it to the stream.
Here is the code for the Archive class.
using System;
using System.IO;

// Direction of the archive: loading from the stream or storing to it.
public enum ArchiveOp
{
    load = 0,
    store = 1
}

public class Archive
{
    // Only one of these is created, depending on the ArchiveOp passed in.
    protected BinaryWriter writer = null;
    protected BinaryReader reader = null;
    protected ArchiveOp op = ArchiveOp.load;

    public Archive(Stream _stream, ArchiveOp _op)
    {
        op = _op;
        if (_op == ArchiveOp.load)
        {
            reader = new BinaryReader(_stream);
        }
        else
        {
            writer = new BinaryWriter(_stream);
        }
    }

    public bool IsStoring()
    {
        return op == ArchiveOp.store;
    }

    public void Serialize(IArchiveSerialization obj)
    {
        obj.Serialize(this);
    }

    // A Char is stored as its 16-bit code unit. Casting to UInt16 avoids the
    // OverflowException that Convert.ToInt16 throws for characters above 0x7FFF.
    public void Write(Char ch)
    {
        writer.Write((UInt16)ch);
    }

    public void Write(UInt16 n) { writer.Write(n); }
    public void Write(Int16 n) { writer.Write(n); }
    public void Write(UInt32 n) { writer.Write(n); }
    public void Write(Int32 n) { writer.Write(n); }
    public void Write(UInt64 n) { writer.Write(n); }
    public void Write(Int64 n) { writer.Write(n); }
    public void Write(Single d) { writer.Write(d); }
    public void Write(Double d) { writer.Write(d); }
    public void Write(Boolean b) { writer.Write(b); }

    // A Decimal is stored as a 64-bit OLE Automation currency value.
    public void Write(Decimal d)
    {
        Int64 n = Decimal.ToOACurrency(d);
        writer.Write(n);
    }

    public void Write(DateTime dt)
    {
        writer.Write(dt.ToBinary());
    }

    // A string is stored as its character count followed by the characters.
    public void Write(string s)
    {
        writer.Write(s.Length);
        writer.Write(s.ToCharArray());
    }

    public void Write(Guid guid)
    {
        byte[] bytes = guid.ToByteArray();
        Write(bytes);
    }

    // Writes the raw bytes only; the reader must know the length.
    public void Write(Byte[] buffer)
    {
        writer.Write(buffer);
    }

    public void Read(out string s)
    {
        Int32 length;
        Read(out length);
        s = new string(reader.ReadChars(length));
    }

    // The typed BinaryReader methods throw EndOfStreamException on a short
    // read instead of silently returning a partially filled buffer.
    public void Read(out UInt16 n) { n = reader.ReadUInt16(); }
    public void Read(out Int16 n) { n = reader.ReadInt16(); }
    public void Read(out UInt32 n) { n = reader.ReadUInt32(); }
    public void Read(out Int32 n) { n = reader.ReadInt32(); }
    public void Read(out UInt64 n) { n = reader.ReadUInt64(); }
    public void Read(out Int64 n) { n = reader.ReadInt64(); }
    public void Read(out float d) { d = reader.ReadSingle(); }
    public void Read(out double d) { d = reader.ReadDouble(); }
    public void Read(out Boolean b) { b = reader.ReadBoolean(); }

    public void Read(out Char ch)
    {
        UInt16 n;
        Read(out n);
        ch = (Char)n;
    }

    public void Read(out Decimal d)
    {
        Int64 n;
        Read(out n);
        d = Decimal.FromOACurrency(n);
    }

    public void Read(out DateTime dt)
    {
        Int64 l;
        Read(out l);
        dt = DateTime.FromBinary(l);
    }

    public void Read(out Guid guid)
    {
        byte[] bytes;
        Read(out bytes, 16);
        guid = new Guid(bytes);
    }

    public void Read(out byte[] buffer, int bufferSize)
    {
        buffer = reader.ReadBytes(bufferSize);
    }
}
In MFC, the CArchive class overloads the << and >> operators, providing a convenient shorthand for reading and writing to the stream. C# does let you overload << and >>, but not in that way: in the C# versions this article was written against, the right-hand operand of an overloaded shift operator must be an int, and operator parameters cannot be ref or out, so >> could not assign to the variable on its right. Instead, we have to implement functions called Read and Write to do the processing. Note that there is a Read function and a Write function for each data type that can be processed.
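As a rough illustration of how the Read and Write overloads stand in for the MFC operators, here is a minimal sketch. MiniArchive and ReadWriteDemo are hypothetical names, and MiniArchive is a cut-down stand-in for the Archive class, included only so the example compiles on its own.

```csharp
using System;
using System.IO;

// Hypothetical, cut-down stand-in for the article's Archive class,
// included only so that this example compiles on its own.
public class MiniArchive
{
    private readonly BinaryWriter writer;
    private readonly BinaryReader reader;

    public MiniArchive(Stream stream, bool storing)
    {
        if (storing) writer = new BinaryWriter(stream);
        else reader = new BinaryReader(stream);
    }

    public void Write(Int32 n) { writer.Write(n); }
    public void Write(Double d) { writer.Write(d); }

    // The compiler picks the overload from the type of the out variable.
    public void Read(out Int32 n) { n = reader.ReadInt32(); }
    public void Read(out Double d) { d = reader.ReadDouble(); }
}

public static class ReadWriteDemo
{
    // Writes an int and a double to memory, then reads them back in the same order.
    public static void RoundTrip(int nIn, double dIn, out int nOut, out double dOut)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            MiniArchive store = new MiniArchive(stream, true);
            store.Write(nIn);    // MFC: ar << nIn;
            store.Write(dIn);    // MFC: ar << dIn;

            stream.Position = 0; // rewind before loading
            MiniArchive load = new MiniArchive(stream, false);
            load.Read(out nOut); // MFC: ar >> nOut;
            load.Read(out dOut); // MFC: ar >> dOut;
        }
    }
}
```

Because the overload is chosen from the declared type of the out variable, reading into a variable of the wrong type silently consumes the wrong number of bytes, so the variable types on the load side need to match the store side exactly.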
The IArchiveSerialization Interface
In addition to the Archive class, I implemented an IArchiveSerialization interface. You can implement this interface on any class for which you want to support this style of serialization. Although it is not required, it is convenient for hierarchical object relationships, as demonstrated in the example project.
public interface IArchiveSerialization
{
    void Serialize(Archive ar);
}
The VersionException Class
In this type of serialization, it is up to the programmer to control versioning. To assist in this, I created the VersionException class. This exception is thrown if the application tries to load an object that was stored by a later version of the program, much like trying to open an Excel 2003 workbook in Excel 95.
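The listing for VersionException is not shown here, so the following is only a minimal sketch of what such a class might look like, assuming it simply derives from Exception. The VersionCheck helper and its EnsureSupported method are hypothetical additions for illustration.

```csharp
using System;

// Sketch only: assumes VersionException is a plain Exception subclass.
public class VersionException : Exception
{
    public VersionException()
        : base("The data was stored by a newer version of the program.") { }

    public VersionException(string message)
        : base(message) { }

    public VersionException(string message, Exception inner)
        : base(message, inner) { }
}

// Hypothetical helper, not part of the article's project.
public static class VersionCheck
{
    // Throws if the stored version is newer than this build understands.
    public static void EnsureSupported(UInt32 storedVersion, UInt32 maxSupported)
    {
        if (storedVersion > maxSupported)
        {
            throw new VersionException(string.Format(
                "Stored version {0} is newer than supported version {1}.",
                storedVersion, maxSupported));
        }
    }
}
```

A Serialize method could call VersionCheck.EnsureSupported(version, 1) right after reading the version flag, which keeps the check in one place if many classes share the same pattern.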
Using the Archive Class
To demonstrate how MFC-style serialization works, I created a class Person with various properties (name, age, weight, and so on) to illustrate and test the different data types. The Person class also has, as a property, a list of Person objects: the children of the person. This demonstrates how to use MFC-style serialization in a hierarchy of objects, something that .NET does automatically. My Person example also shows how serialization works when there are two different versions of the object.
using System;
using System.Collections.Generic;

public class Person : IArchiveSerialization
{
    public string Name;
    public int Age;
    public double Weight;
    public float Height;
    public DateTime Birthday;
    public Char Sex;
    public bool Deceased;
    public Guid guid;
    // Note: AccountBalance is not written or read by Serialize below.
    public decimal AccountBalance;
    public List<Person> Children;

    public Person()
    {
        Children = new List<Person>();
        guid = Guid.NewGuid();
    }

    public void WriteToConsole()
    {
        Console.WriteLine("Name: " + Name);
        Console.WriteLine("Guid: " + guid.ToString());
        Console.WriteLine("Age: " + Age);
        Console.WriteLine("Weight: " + Weight);
        Console.WriteLine("Height: " + Height);
        Console.WriteLine("Birthday: " + Birthday);
        Console.WriteLine("Sex: " + Sex);
        Console.WriteLine("Deceased: " + Deceased);
        Console.WriteLine("Account Balance: " + AccountBalance);
        Console.WriteLine("{0} has {1} children.", Name,
            Children.Count);
        foreach (Person child in Children)
        {
            child.WriteToConsole();
            Console.WriteLine();
        }
    }

    public virtual void Serialize(Archive ar)
    {
        // Current version of the serialized layout.
        UInt32 version = 0x00000001;
        if (ar.IsStoring())
        {
            // The version flag always goes first.
            ar.Write(version);
            ar.Write(guid);
            ar.Write(Sex);
            ar.Write(Age);
            ar.Write(Name);
            ar.Write(Weight);
            // Write the child count, then each child recursively.
            int nChildren = Children.Count;
            ar.Write(nChildren);
            foreach (Person p in Children)
            {
                ar.Serialize(p);
            }
            // Members added in version 0x00000001.
            ar.Write(Height);
            ar.Write(Birthday);
            ar.Write(Deceased);
        }
        else
        {
            ar.Read(out version);
            // Refuse data written by a newer version of the program.
            if (version > 0x00000001)
            {
                throw new VersionException();
            }
            // Members must be read in exactly the order they were written.
            ar.Read(out guid);
            ar.Read(out Sex);
            ar.Read(out Age);
            ar.Read(out Name);
            ar.Read(out Weight);
            Children.Clear();
            int nMaxChildren;
            ar.Read(out nMaxChildren);
            for (int n = 0; n < nMaxChildren; ++n)
            {
                Person child = new Person();
                ar.Serialize(child);
                Children.Add(child);
            }
            // These members exist only in data written by version 0x00000001 or later.
            if (version > 0x00000000)
            {
                ar.Read(out Height);
                ar.Read(out Birthday);
                ar.Read(out Deceased);
            }
        }
    }
}
Important Points and Differences from .NET
- Note that the versioning mechanism is completely manual. For every class that will be serialized, you should first serialize a version flag. With each change in the version, the flag is increased by 1. Properties Height, Birthday, and Deceased are in the second version (version 0x00000001) of the object.
- Unlike with .NET, order of serialization is important. Properties must be loaded in the same order they are written. Once the order is established, it cannot be changed without creating a new version.
- In .NET, you cannot change a property name once you have begun using the application in production (well, you can, but it requires more complex code changes). In MFC style serialization, you can change property names without changing the serialization version (remember that MFC does not think of properties the way .NET thinks of properties). However, changing their type requires changes to the serialization version.
- .NET serialization decides for you which members are written. Binary serialization stores all fields of a [Serializable] class, public or private, unless they are marked [NonSerialized], while XML serialization stores only public fields and public read/write properties. Using MFC-style serialization, you explicitly code which members get serialized, so it does not matter whether they are fields or .NET properties (with get/set).
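The versioning and ordering rules in the list above can be sketched in isolation. In this minimal example, Gadget and VersionDemo are hypothetical and not part of the project. The sketch uses BinaryWriter and BinaryReader directly rather than the Archive class, and throws InvalidDataException where the project would throw VersionException, purely to keep it self-contained.

```csharp
using System;
using System.IO;

// Hypothetical example type, not part of the article's project.
public class Gadget
{
    public const UInt32 CurrentVersion = 1;

    public string Name = "";
    public int Count;
    public double Price; // added in version 1

    // asVersion lets the demo produce data in the older layout.
    public void Store(BinaryWriter w, UInt32 asVersion)
    {
        w.Write(asVersion);  // version flag always comes first
        w.Write(Name);
        w.Write(Count);
        if (asVersion >= 1)
        {
            w.Write(Price);  // newer members are appended at the end
        }
    }

    public void Load(BinaryReader r)
    {
        UInt32 version = r.ReadUInt32();
        if (version > CurrentVersion)
        {
            throw new InvalidDataException("Stored by a newer version: " + version);
        }
        // Same order as Store.
        Name = r.ReadString();
        Count = r.ReadInt32();
        // Older data has no Price, so fall back to a default.
        Price = version >= 1 ? r.ReadDouble() : 0.0;
    }
}

public static class VersionDemo
{
    // Stores g using the given layout version, then loads it into a new object.
    public static Gadget RoundTrip(Gadget g, UInt32 asVersion)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            BinaryWriter w = new BinaryWriter(stream);
            g.Store(w, asVersion);
            w.Flush();
            stream.Position = 0;

            Gadget loaded = new Gadget();
            loaded.Load(new BinaryReader(stream));
            return loaded;
        }
    }
}
```

Renaming Price to UnitPrice would not affect the stored data at all, whereas changing its type to float or moving it ahead of Count would require a new version number.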
Nullable Types
In this particular class, I did not support nullable types. Nullable types are a problem for this type of serialization. The best solution is to include, with each nullable value, a flag indicating whether the value is null. This should be built into the Archive class. This is how MFC serializes objects such as COleDateTime: a serialized COleDateTime includes a 4-byte status value, which indicates null or invalid, and 8 bytes for the actual date.
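One way the flag approach might look is sketched below for Int32? only. NullableIO and its methods are hypothetical and are not part of the Archive class. Note that this sketch uses a one-byte Boolean flag and omits the payload when the value is null, which is a different layout from the 4-byte status plus 8-byte date that MFC uses for COleDateTime.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the article's Archive class.
public static class NullableIO
{
    // Writes a "has value" flag, followed by the value only when present.
    public static void WriteNullable(BinaryWriter w, Int32? value)
    {
        w.Write(value.HasValue);
        if (value.HasValue)
        {
            w.Write(value.Value);
        }
    }

    // Reads the flag first and only then decides whether a value follows.
    public static Int32? ReadNullableInt32(BinaryReader r)
    {
        if (r.ReadBoolean())
        {
            return r.ReadInt32();
        }
        return null;
    }

    // Convenience round trip through memory, used to check the pair above.
    public static Int32? RoundTrip(Int32? value)
    {
        using (MemoryStream stream = new MemoryStream())
        {
            BinaryWriter w = new BinaryWriter(stream);
            WriteNullable(w, value);
            w.Flush();
            stream.Position = 0;
            return ReadNullableInt32(new BinaryReader(stream));
        }
    }
}
```

The same pattern would need one Write and Read pair per nullable type, in the same way the Archive class has one pair per non-nullable type.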
The Example Project
The example project includes a simple console application that writes a hierarchy of Person objects to a file and reads them back again. It also demonstrates how to use this type of serialization with a memory stream rather than a file stream.
The File System Hierarchy Test Application
Because MFC-style serialization writes properties directly to the stream without using reflection, one would expect it to perform better than .NET serialization - particularly with large hierarchies of many objects. But how much better? I wanted to find out. So what is the easiest way to build a complex hierarchy with many objects? From your computer's file system of course.
To test performance, I wrote a WinForms application. The app reads the file system and creates a hierarchy of file objects. The FileObject class stores information such as the file name, extension, and date created. Any file object that is a directory contains a list of subdirectories and files. This allows me to quickly build a hierarchical list with thousands of objects. The application is included with the project and you can use it on your own file system to test performance.
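The FileObject listing is not reproduced here, so the following is only a sketch of how such a hierarchy could be built. FileNode, Build, and Count are hypothetical names, and the set of fields is an assumption based on the description above.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the article's FileObject class.
public class FileNode
{
    public string Name;
    public string Extension;
    public DateTime Created;
    public bool IsDirectory;
    public List<FileNode> Children = new List<FileNode>();

    // Recursively builds a node for dir and everything beneath it.
    public static FileNode Build(DirectoryInfo dir)
    {
        FileNode node = new FileNode
        {
            Name = dir.Name,
            Extension = "",
            Created = dir.CreationTime,
            IsDirectory = true
        };
        try
        {
            foreach (DirectoryInfo sub in dir.GetDirectories())
            {
                node.Children.Add(Build(sub));
            }
            foreach (FileInfo file in dir.GetFiles())
            {
                node.Children.Add(new FileNode
                {
                    Name = file.Name,
                    Extension = file.Extension,
                    Created = file.CreationTime
                });
            }
        }
        catch (UnauthorizedAccessException)
        {
            // Skip folders the current user is not allowed to read.
        }
        return node;
    }

    // Total number of nodes in this subtree, including this one.
    public int Count()
    {
        int total = 1;
        foreach (FileNode child in Children)
        {
            total += child.Count();
        }
        return total;
    }
}
```

Calling FileNode.Build(new DirectoryInfo(@"C:\")) on a typical drive produces a tree with many thousands of nodes, which is what makes the file system a convenient source of test data.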
Once you have built the hierarchy of the file system, you can serialize the hierarchy using .NET binary serialization, .NET XML serialization, and MFC-style serialization. The application will tell you how long each operation took.
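The timing code of the test application is not shown, but one common way to take such measurements is System.Diagnostics.Stopwatch. The sketch below assumes that approach. Timing and MeasureMilliseconds are hypothetical names.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not taken from the article's test application.
public static class Timing
{
    // Runs the operation once and returns the elapsed wall-clock time in milliseconds.
    public static long MeasureMilliseconds(Action operation)
    {
        Stopwatch stopwatch = Stopwatch.StartNew();
        operation();
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}
```

A store operation could then be timed with something like Timing.MeasureMilliseconds(() => root.Serialize(archive)), where root and archive are whatever objects the caller has set up. A single run like this includes one-time costs such as JIT compilation, so repeated runs give a fairer comparison.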
Performance Results
I used the test application to time serialization of file objects created from my C Drive. There were a total of 196897 objects in the hierarchy. The performance results are shown below (time in milliseconds):
Serialization Type | Store | Load | Serialized File Size
.NET Binary Serialization | 2444 ms | 6191 ms | 88,422 KB
.NET XML Serialization | 2381 ms | 1940 ms | 17,042 KB
MFC Style Serialization (in .NET) | 233 ms | 603 ms | 9,885 KB
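As I expected, MFC-style serialization performed much better than .NET serialization: it was roughly ten times faster than .NET binary serialization for both storing and loading. With .NET binary serialization, an object graph of this size took more than six seconds to load, which is a long time for an application to be unresponsive. Surprisingly, XML serialization loaded about three times faster than .NET binary serialization (1940 ms versus 6191 ms), although the two took about the same time to store.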
Also, as expected, the XML file was much larger than either of the binary files. The MFC style binary file was about half the size of the .NET binary file.
Conclusions
Implementing MFC-style serialization allows you to take control of your object serialization and provides a notable performance boost for very large object graphs. For those of you comfortable with the MFC style of serialization, it may be useful in your projects.
However, the real benefit is that this method serves as the basis for reading, in your .NET application, data files that were actually serialized using MFC. That is described in Part 2.