Implementing MFC-Style Serialization in .NET - Part 1

Robert Pittenger, MCPD-EAD

Jan 21, 2009

CPOL

This article shows how to implement MFC-style object serialization in .NET.

Download .NET Demo Project (VS 2005) - 214.97 KB

Introduction

Moving from Microsoft Foundation Classes (MFC) to .NET, Microsoft greatly simplified the process of serialization, that is, storing object properties to a file or other stream destination. In .NET, making your objects serializable to a binary file requires no more than adding the [Serializable] attribute to the class, and XML serialization of a public class with a default constructor needs no attribute at all. In MFC, making your objects serializable required you to actually implement the function that writes values to the stream, and XML was not supported at all. Versioning of objects is also simplified in .NET. Versioning is done automatically, and works as long as you don't change property names or remove properties. MFC had a built-in versioning mechanism, but it was not trustworthy, and experienced MFC programmers learned to implement their own.

If .NET Serialization is so easy, then why go through the trouble to implement MFC-style serialization?

One reason is speed. Serialization in .NET relies on reflection, which reduces performance. This project includes a test application which shows that, for large numbers of objects in a complex hierarchy, the MFC-style serialization implemented here can be as much as ten times faster than standard .NET serialization.
The second reason is control. MFC-style serialization gives you much greater control over the serialization process. (Actually, .NET also gives you full control over the process, but in a different way.)
The third reason, and a much more compelling one, is that this project forms the basis of Part 2, where we use .NET to read objects serialized using MFC, including proper conversion of CString, COleDateTime, and COleCurrency to .NET types. This is important functionality for anyone planning to migrate an MFC application to .NET, and perhaps something Microsoft should have provided from the days of .NET 1.1.

MFC serialization itself is a fairly complex topic, at least if you want to make it work reliably in a production environment. I imagine that the only programmers interested in this method are those already familiar with MFC serialization. Therefore, I will assume such knowledge as a prerequisite.

The Archive Class

The heart of MFC serialization is the CArchive class. Any object that is to be serialized has some function that looks like:

void CSomeObject::Serialize(CArchive& ar)

This function uses the CArchive parameter to actually write member values to the stream.

For the .NET implementation, the logical place to start is to create a similar class called Archive. This class accepts a System.IO.Stream object as a parameter to the constructor. It then uses a typed Read or Write function for each data type to read a datum from the stream or write it to the stream.

Here is code for the Archive class.

	using System;
	using System.Collections.Generic;
	using System.IO;
	using System.Text;

	public enum ArchiveOp
	{
		load = 0,
		store = 1
	}
	
	public class Archive
	{
		protected BinaryWriter writer = null;
		protected BinaryReader reader = null;
		protected ArchiveOp op = ArchiveOp.load;
		private const int m_Index = 0; // actually never changes
		
		public Archive(Stream _stream, ArchiveOp _op)
		{
			op = _op;
			
			if (_op == ArchiveOp.load)
			{
				reader = new BinaryReader(_stream);
			}
			else
			{
				writer = new BinaryWriter(_stream);
			}
		}

		public bool IsStoring()
		{
			return op == ArchiveOp.store;
		}

		public void Serialize(IArchiveSerialization obj)
		{
			obj.Serialize(this);
		}

		//////////////////////////////////////////////////////
		// write functions

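		public void Write(Char ch)
		{
			// Store the char as its 2-byte UTF-16 code unit.
			// writer.Write(ch) would use the writer's text encoding
			// (UTF-8 by default) and take a variable number of bytes;
			// Convert.ToInt16(ch) would throw for characters above 0x7FFF.
			writer.Write((UInt16)ch);
		}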

		public void Write(UInt16 n)
		{
			writer.Write(n);
		}

		public void Write(Int16 n)
		{
			writer.Write(n);
		}

		public void Write(UInt32 n)
		{
			writer.Write(n);
		}

		public void Write(Int32 n)
		{
			writer.Write(n);
		}

		public void Write(UInt64 n)
		{
			writer.Write(n);
		}

		public void Write(Int64 n)
		{
			writer.Write(n);
		}

		public void Write(Single d)
		{
			writer.Write(d);
		}

		public void Write(Double d)
		{
			writer.Write(d);
		}

		public void Write(Decimal d)
		{
			// store decimals as Int64
			Int64 n = Decimal.ToOACurrency(d);
			writer.Write(n);
		}

		public void Write(DateTime dt)
		{
			writer.Write(dt.ToBinary());
		}

		public void Write(Boolean b)
		{
			writer.Write(b);
		}

		public void Write(string s)
		{
			writer.Write(Convert.ToInt32(s.Length));
			writer.Write(s.ToCharArray());
		}

		public void Write(Guid guid)
		{
			byte[] bytes = guid.ToByteArray();
			Write(bytes);
		}

		public void Write(Byte[] buffer)
		{
			writer.Write(buffer);
		}

		///////////////////////////////////////////////////
		// Read functions

		public void Read(out string s)
		{
			Int32 length = 0;
			Read(out length);

			// ReadChars decodes 'length' characters (fewer only at the
			// end of the stream), matching what Write(string) stored
			s = new string(reader.ReadChars(length));
		}

		public void Read(out UInt16 n)
		{
			// The typed ReadXxx methods throw EndOfStreamException on a
			// truncated stream instead of returning a partly filled buffer
			n = reader.ReadUInt16();
		}

		public void Read(out Int16 n)
		{
			n = reader.ReadInt16();
		}

		public void Read(out UInt32 n)
		{
			n = reader.ReadUInt32();
		}

		public void Read(out Int32 n)
		{
			n = reader.ReadInt32();
		}

		public void Read(out UInt64 n)
		{
			n = reader.ReadUInt64();
		}

		public void Read(out Int64 n)
		{
			n = reader.ReadInt64();
		}

		public void Read(out Char ch)
		{
			// Chars are stored as 2-byte UTF-16 code units (see Write(Char)).
			// Reading through UInt16 also handles characters above 0x7FFF,
			// which Convert.ToChar(Int16) rejects as negative values.
			ch = (Char)reader.ReadUInt16();

			// Reading or writing the value directly as a char does not give
			// a fixed size: BinaryReader and BinaryWriter run chars through
			// their text encoding (UTF-8 by default), so a character can
			// take one byte or several depending on its value.
		}

		public void Read(out float d)
		{
			d = reader.ReadSingle();
		}

		public void Read(out double d)
		{
			d = reader.ReadDouble();
		}

		public void Read(out Decimal d)
		{
			// decimals are stored as an OLE Automation currency value
			// (an Int64 scaled by 10,000), so read the Int64 and convert
			d = Decimal.FromOACurrency(reader.ReadInt64());
		}

		public void Read(out DateTime dt)
		{
			Int64 l;
			Read(out l);
			dt = DateTime.FromBinary(l);
		}

		public void Read(out Boolean b)
		{
			b = reader.ReadBoolean();
		}

		public void Read(out Guid guid)
		{
			byte[] bytes;
			Read(out bytes, 16);
			guid = new Guid(bytes);
		}

		public void Read(out byte[] buffer, int bufferSize)
		{
			// ReadBytes returns a shorter array at the end of the stream,
			// so treat a short read as a truncated file
			buffer = reader.ReadBytes(bufferSize);
			if (buffer.Length != bufferSize)
			{
				throw new EndOfStreamException();
			}
		}
	} // end of class

In MFC, the CArchive class overloads the << and >> operators, providing a convenient shorthand for writing to and reading from the stream. However, C# does not allow overloading those operators in that way (operator parameters cannot be passed with out or ref, so >> could not store a value into a variable), so instead, we have to implement functions called Read and Write to do the processing. Note that there is a Read function and a Write function for each data type that can be processed.

The IArchiveSerialization Interface

In addition to the Archive class, I implemented an IArchiveSerialization interface. You can implement this interface on any class for which you want to support this style of serialization. Although it is not required, it is convenient for hierarchical object relationships, as demonstrated in the example project.

public interface IArchiveSerialization
{
	void Serialize(Archive ar);
}

The VersionException Class

In this type of serialization, it is up to the programmer to control versioning. To assist in this, I created the VersionException class. This exception is thrown if the application tries to load an object that was stored by a later version of the program, much like trying to open an Excel 2003 workbook in Excel 95.
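
The VersionException class is not listed in this article. A minimal sketch of what such an exception could look like follows; the message text and the second constructor are assumptions, not the demo project's code.

```csharp
using System;

// Hypothetical sketch: thrown when a stream holds an object written by a
// newer version of the program than the one reading it.
public class VersionException : Exception
{
	public VersionException()
		: base("The data was written by a later version of this application.")
	{
	}

	// Assumed convenience overload that records both version numbers.
	public VersionException(uint found, uint supported)
		: base(String.Format(
			"Found object version {0}, but only versions up to {1} are supported.",
			found, supported))
	{
	}
}
```

The parameterless constructor matches the way the exception is thrown in the Person class below.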

Using the Archive Class

To demonstrate how MFC-style serialization works, I created a class Person with various properties (name, age, weight, and so on) to illustrate and test the different data types. The Person class also has a list of Person objects as a property: the person's children. This demonstrates how to use MFC-style serialization in a hierarchy of objects, something that .NET does automatically. My Person example also shows how serialization works when there are two different versions of the object.

	public class Person : IArchiveSerialization
	{
		public string Name;
		public int Age;
		public double Weight;
		public float Height;
		public DateTime Birthday;
		public Char Sex;
		public bool Deceased;
		public Guid guid;
		public decimal AccountBalance;

		// list of children to show serialization of a hierarchy
		public List<Person> Children;

		public Person()
		{
			Children = new List<Person>();
			guid = Guid.NewGuid();

		}

		public void WriteToConsole()
		{
			Console.WriteLine("Name: " + Name);
			Console.WriteLine("Guid: " + guid.ToString());
			Console.WriteLine("Age: " + Age);
			Console.WriteLine("Weight: " + Weight);
			Console.WriteLine("Height: " + Height);
			Console.WriteLine("Birthday: " + Birthday);
			Console.WriteLine("Sex: " + Sex);
			Console.WriteLine("Deceased: " + Deceased);
			Console.WriteLine("Account Balance: " + AccountBalance);

			Console.WriteLine("{0} has {1} children.", Name, Children.Count);
			foreach (Person child in Children)
			{
				child.WriteToConsole();
				Console.WriteLine();
			}
		}

		virtual public void Serialize(Archive ar)
		{
			// with each change in version, increment this value
			UInt32 version = 0x00000001;
			
			if (ar.IsStoring())
			{
				// write properties to the file
				ar.Write(version);
				ar.Write(guid);
				ar.Write(Sex);
				ar.Write(Age);
				ar.Write(Name);
				ar.Write(Weight);

				int nChildren = Children.Count;
				ar.Write(nChildren);

				foreach (Person p in Children)
				{
					ar.Serialize(p);
				}

				// version 1 objects
				ar.Write(Height);
				ar.Write(Birthday);
				ar.Write(Deceased);

				// as new versions are created, add them here
			}
			else
			{
				ar.Read(out version);

				// with each change in version, increment this value
				if (version > 0x00000001)
				{
					// in this situation, we are trying to read
					// an object of a later version than what
					// we can support
					throw new VersionException();
				}

				ar.Read(out guid);
				ar.Read(out Sex);
				ar.Read(out Age);
				ar.Read(out Name);
				ar.Read(out Weight);

				Children.Clear();
				int nMaxChildren;
				ar.Read(out nMaxChildren);

				for (int n = 0; n < nMaxChildren; ++n)
				{
					Person child = new Person();
					ar.Serialize(child);
					Children.Add(child);
				}


				// version 1 objects
				if (version > 0x00000000)
				{
					ar.Read(out Height);
					ar.Read(out Birthday);
					ar.Read(out Deceased);
				}

				// as new versions are created, read
				// their properties here
			}
		} // end of serialize
	} // end of class

Important Points and Differences from .NET

Note that the versioning mechanism is completely manual. For every class that will be serialized, you should first serialize a version flag. With each change in the version, the flag is increased by 1. The properties Height, Birthday, and Deceased were added in the second version of the object (version 0x00000001), which is why they are written last and read only when the stored version is greater than 0.
Unlike with .NET, the order of serialization is important. Properties must be loaded in the same order they are written. Once the order is established, it cannot be changed without creating a new version.
In .NET, you cannot change a property name once you have begun using the application in production (well, you can, but it requires more complex code changes). In MFC-style serialization, you can change property names without changing the serialization version (remember that MFC does not think of properties the way .NET thinks of properties). However, changing their type requires a change to the serialization version.
.NET decides for you which members get serialized: binary serialization of a [Serializable] class stores all of its fields, public or private, unless they are marked [NonSerialized], while XML serialization stores public fields and public read/write properties. Using MFC-style serialization, you explicitly code which members get serialized, so it does not matter whether they are fields or .NET properties (with get/set).
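
To make the first two points concrete, here is a sketch of how a later member can be added without breaking old files. It is not from the demo project: the Pet class and its members are hypothetical, and it uses BinaryWriter and BinaryReader directly in place of Archive so that it stands alone.

```csharp
using System;
using System.IO;

// Hypothetical class with two format versions: version 0 stored only Name,
// version 1 appends Age after it.
public class Pet
{
	public const uint CurrentVersion = 1;

	public string Name = "";
	public int Age;

	// versionToWrite lets the sketch produce an old-format stream for testing.
	public void Store(BinaryWriter w, uint versionToWrite)
	{
		w.Write(versionToWrite);   // the version flag always goes first
		w.Write(Name);             // version 0 members, in a fixed order
		if (versionToWrite > 0)
		{
			w.Write(Age);          // version 1 members are appended at the end
		}
	}

	public void Load(BinaryReader r)
	{
		uint version = r.ReadUInt32();
		if (version > CurrentVersion)
		{
			// written by a newer program than this one
			throw new InvalidDataException("Unsupported version " + version);
		}

		Name = r.ReadString();     // same order as Store
		// older files lack the newer members, so fall back to a default
		Age = (version > 0) ? r.ReadInt32() : 0;
	}

	// Writes a Pet in the requested format version and reads it back.
	public static Pet RoundTrip(Pet source, uint versionToWrite)
	{
		MemoryStream ms = new MemoryStream();
		BinaryWriter w = new BinaryWriter(ms);
		source.Store(w, versionToWrite);
		w.Flush();
		ms.Position = 0;

		Pet copy = new Pet();
		copy.Load(new BinaryReader(ms));
		return copy;
	}
}
```

Because new members are only ever appended, a reader that knows version 1 can still load a version 0 stream; reordering or removing the earlier members would break that.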

Nullable Types

In this particular class, I did not support nullable types. Nullable types are a problem for this type of serialization. The best solution is to include with each nullable type, a flag indicating if the object is null. This should be built into the Archive class. This is how MFC serializes objects such as COleDateTime. COleDateTime includes a 4-byte status value - which indicates null or invalid, and 8 bytes for the actual date.
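
A minimal sketch of the flag approach for one type, DateTime?, is shown below. It is written as a standalone helper over BinaryWriter and BinaryReader rather than as part of Archive; the class name, the method names, and the choice of a one-byte flag are assumptions (COleDateTime uses a 4-byte status instead).

```csharp
using System;
using System.IO;

// Hypothetical helper: stores a one-byte "has value" flag, followed by the
// 8-byte DateTime only when a value is present.
public static class NullableDateTimeIO
{
	public static void Write(BinaryWriter writer, DateTime? value)
	{
		writer.Write(value.HasValue);
		if (value.HasValue)
		{
			writer.Write(value.Value.ToBinary());
		}
	}

	public static DateTime? Read(BinaryReader reader)
	{
		if (!reader.ReadBoolean())
		{
			return null;
		}
		return DateTime.FromBinary(reader.ReadInt64());
	}
}
```

The same pattern would apply to any other nullable value type: write the flag, then write the value only when the flag is true.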

The Example Project

The example project includes a simple console application that writes a hierarchy of Person objects to a file and reads them back again. The application also demonstrates how to use this type of serialization with a memory stream rather than a file stream.
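
The demo code is not reproduced here, but a memory-stream round trip could look like the sketch below. To keep it self-contained, it declares a cut-down MiniArchive that handles only Int32 and a hypothetical Point class; with the full Archive class and Person, the calls would have the same shape.

```csharp
using System;
using System.IO;

// Cut-down stand-in for the article's Archive class (Int32 only).
public class MiniArchive
{
	private readonly BinaryWriter writer;
	private readonly BinaryReader reader;

	public MiniArchive(Stream stream, bool storing)
	{
		if (storing) writer = new BinaryWriter(stream);
		else reader = new BinaryReader(stream);
	}

	public bool IsStoring() { return writer != null; }
	public void Write(int n) { writer.Write(n); }
	public void Read(out int n) { n = reader.ReadInt32(); }
}

// Hypothetical serializable class.
public class Point
{
	public int X;
	public int Y;

	public void Serialize(MiniArchive ar)
	{
		if (ar.IsStoring())
		{
			ar.Write(X);
			ar.Write(Y);
		}
		else
		{
			ar.Read(out X);
			ar.Read(out Y);
		}
	}
}

public static class MemoryStreamDemo
{
	// Stores a Point into a memory buffer, then loads a copy from that buffer.
	public static Point Clone(Point source)
	{
		using (MemoryStream ms = new MemoryStream())
		{
			source.Serialize(new MiniArchive(ms, true));

			ms.Position = 0;   // rewind before reading
			Point copy = new Point();
			copy.Serialize(new MiniArchive(ms, false));
			return copy;
		}
	}
}
```

Since the archive only depends on Stream, switching to a file means replacing the MemoryStream with a FileStream; the Serialize code does not change.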

The File System Hierarchy Test Application

Because MFC-style serialization writes properties directly to the stream without using reflection, one would expect it to perform better than .NET serialization - particularly with large hierarchies of many objects. But how much better? I wanted to find out. So what is the easiest way to build a complex hierarchy with many objects? From your computer's file system of course.

To test performance, I wrote a WinForms application. The app reads the file system and creates a hierarchy of file objects. Each FileObject stores information such as the file name, extension, and date created. Any file object that is a directory contains a list of its subdirectories and files. This allows me to quickly build a hierarchical list with thousands of objects. The application is included with the project, and you can use it on your own file system to test performance.
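
The FileObject class itself is not listed in the article. A rough sketch of how such a hierarchy could be built with System.IO follows; the class name FileNode, its fields, and the decision to leave unreadable directories empty are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical stand-in for the article's FileObject.
public class FileNode
{
	public string Name;
	public string Extension;
	public DateTime Created;
	public bool IsDirectory;
	public List<FileNode> Children = new List<FileNode>();

	// Recursively builds a node for the given file or directory.
	public static FileNode Build(FileSystemInfo info)
	{
		FileNode node = new FileNode();
		node.Name = info.Name;
		node.Extension = info.Extension;
		node.Created = info.CreationTime;

		DirectoryInfo dir = info as DirectoryInfo;
		node.IsDirectory = (dir != null);
		if (dir != null)
		{
			try
			{
				foreach (FileSystemInfo child in dir.GetFileSystemInfos())
				{
					node.Children.Add(Build(child));
				}
			}
			catch (UnauthorizedAccessException)
			{
				// assumed policy: leave protected directories empty
			}
		}
		return node;
	}

	// Total number of nodes in this subtree, including this one.
	public int Count()
	{
		int total = 1;
		foreach (FileNode child in Children)
		{
			total += child.Count();
		}
		return total;
	}
}
```

Calling FileNode.Build(new DirectoryInfo(path)) on a large directory yields a tree with many thousands of nodes, which is the kind of object graph the timings below were taken on.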

Once you have built the hierarchy of the file system, you can serialize the hierarchy using .NET binary serialization, .NET XML serialization, and MFC-style serialization. The application will tell you how long each operation took.
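
The timing itself can be done with System.Diagnostics.Stopwatch. The helper below is a sketch, not the test application's code; the names Timing, Operation, and MeasureMs are made up for the example.

```csharp
using System;
using System.Diagnostics;

public static class Timing
{
	// Any parameterless operation, for example "serialize the tree to a file".
	public delegate void Operation();

	// Runs the operation once and returns the elapsed wall-clock time in ms.
	public static long MeasureMs(Operation op)
	{
		Stopwatch sw = Stopwatch.StartNew();
		op();
		sw.Stop();
		return sw.ElapsedMilliseconds;
	}
}
```

A call might look like Timing.MeasureMs(delegate { root.Serialize(ar); }), where root and ar stand for your own object tree and archive. A single run includes one-time costs such as JIT compilation, so repeated runs give a fairer comparison.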

Performance Results

I used the test application to time serialization of file objects created from my C Drive. There were a total of 196897 objects in the hierarchy. The performance results are shown below (time in milliseconds):

Serialization Type	Store (ms)	Load (ms)	Serialized File Size (KB)
.NET Binary Serialization	2444	6191	88,422
.NET XML Serialization	2381	1940	17,042
MFC-Style Serialization (in .NET)	233	603	9,885

As I expected, MFC-style serialization performed much better than .NET serialization: it was roughly ten times faster than .NET binary serialization for both storing and loading. With .NET binary serialization, an object graph of this size took more than six seconds to load, a long time for an application to be unresponsive. Surprisingly, XML serialization loaded about three times faster than .NET binary serialization, although the two took about the same time to store.

Also, as expected, the XML file was much larger than either of the binary files. The MFC style binary file was about half the size of the .NET binary file.

Conclusions

Implementing MFC-style serialization allows you to take control over your object serialization and does provide a notable performance boost for very large object graphs. For those of you comfortable with the MFC style of serialization, it may be useful in your projects.

However, the real benefit is that this method serves as the basis for reading data files that were actually serialized by MFC into your .NET application. That is described in Part 2.