All About MFC Serialization

steveb

Mar 16, 2017

MIT

Describes all aspects of the MFC serialization mechanism

Download demo - 785.2 KB

Background
What is Serialization
How does it work
Word of Caution
Why can’t I call ar.GetObjectSchema() multiple times?
Serializing Base and Derived Classes
1st Solution: Do all serialization in the derived class
2nd Solution: Pop the schema back into the CArchive
3rd Solution: Consider Overhaul of Serialize function with Don’t Call Us, We Will Call You design pattern
4th Solution: Store your base class schema as the 1st member of your class
Serializing Pure Base Class
Serializing with Document/View
Serializing without Document/View
Serializing plain old data types
Serializing CArray template collection
Serializing to and from the process memory
Serializing to and from the shared process memory
Serializing to and from the sockets
Serializing arbitrary byte stream
Serializing Windows SDK data structures
Serializing STL collections
Serializing STL data types
Serializing flat C style arrays
Serializing enumerated types
Serialization versioning for CObject derived classes
Serialization versioning for non CObject classes
Caveats
Using the code
History

Introduction

The world of data structures is a vast one. When we need to write those enormous blobs of data to, or read them from, the disk, memory, or sockets, MFC serialization is a powerful tool in every programmer’s toolbox.

Background

Serialization has been part of the MFC (Microsoft Foundation Classes) library since its very first release, but I feel it has never received its proper due because it was largely undocumented. The SDK samples that demonstrated serialization were very limited and covered only plain old data, CObject derived classes, and MFC collections. However, with the right extensions we can serialize any data structure in existence: STL collections, user defined collections, any collections (including flat C style arrays). It is undoubtedly the most powerful, efficient, and blazingly fast way to store and retrieve hierarchical data. MFC serialization supports reading from and writing to the disk, memory, and sockets. Writing to memory is very useful for inter process communication such as clipboard cut/copy/paste operations, and writing to sockets is useful when networking with remote machines.

In this article I will cover plain old MFC serialization with the MFC provided classes, how to serialize STL collections, plain Windows SDK data structures, and C style arrays, how to serialize to process and shared memory, and how to serialize to and from sockets. I will also demonstrate how to use MFC serialization with or without the Document/View architecture, for example inside console applications and TCP/IP servers.

What is Serialization

MSDN documentation gives us the best description:

Serialization is the process of converting an object into a stream of bytes in order to store the object or transmit it to memory, a database, or a file. Its main purpose is to save the state of an object in order to be able to recreate it when needed. The reverse process is called deserialization.

MFC serialization implements both binary and text serialization. Binary serialization is handled via the shift operators (<<, >>) and the WriteObject / ReadObject functions. Text serialization is handled with the ReadString / WriteString functions.

MFC serialization provides serialization of C++ CObject derived classes with versioning. With the right extensions it can also serialize non CObject derived classes; however, the versioning in those cases needs to be handled manually.

How does it work

At the heart of MFC serialization lies the CArchive object. CArchive has no base class and is tightly coupled to CFile and CFile derived classes such as CSocketFile, CSharedFile, or CMemFile. Internally, CArchive encapsulates a byte buffer that is flushed to, or refilled from, the CFile or CFile derived object as needed.

CFile – provides serialization to or from the disk
CMemFile – provides serialization to or from process memory
CSharedFile – provides serialization to or from shared memory that is accessible by other processes
CSocketFile – provides serialization to or from a CSocket for network communications
You can also serialize over named pipes, RPC, and other Windows inter process communication mechanisms

CArchive provides serialization of plain old data and C++ CObject derived classes with versioning. To make a CObject derived class serializable, all you need is to add two macros:

// In the class declaration
DECLARE_SERIAL(CRoot)
 
// In the class implementation
IMPLEMENT_SERIAL(CRoot, CObject, VERSIONABLE_SCHEMA | 1)

These two macros add a global extraction operator >> (which calls CArchive::ReadObject), a static CreateObject function, and a static CRuntimeClass member variable to your class. The CRuntimeClass structure has an m_lpszClassName member, which stores the text representation of your class name, and an m_wSchema member, which holds the version information of your class.

These macros internally expand to the following code:

//
// DECLARE_SERIAL(CRoot) expands to
//
public:
         static CRuntimeClass classCRoot;
         virtual CRuntimeClass* GetRuntimeClass() const;
         static CObject* PASCAL CreateObject();
         AFX_API friend CArchive& AFXAPI operator >> (CArchive& ar, CRoot* &pOb);
 
 
 
//
// IMPLEMENT_SERIAL(CRoot, CObject, VERSIONABLE_SCHEMA | 1) expands to
//
CObject* PASCAL CRoot::CreateObject()
{
         return new CRoot;
}
 
extern AFX_CLASSINIT _init_CRoot;
 
AFX_CLASSINIT _init_CRoot (RUNTIME_CLASS(CRoot));
 
CArchive& AFXAPI operator >> (CArchive& ar, CRoot * &pOb)
{
         pOb = (CRoot *)ar.ReadObject(RUNTIME_CLASS(CRoot));
         return ar;
}
 
AFX_COMDAT CRuntimeClass CRoot::classCRoot =
{
         "CRoot", // Name of the class
         sizeof(class CRoot), // size
         VERSIONABLE_SCHEMA | 1, // schema
         CRoot::CreateObject, // pointer to CreateObject function used to instantiate object
         RUNTIME_CLASS(CObject), // Base class runtime information
         NULL, // linked list of the next class always NULL
         &_init_CRoot // pointer to AFX_CLASSINIT structure
};
 
CRuntimeClass* CRoot::GetRuntimeClass() const
{
         return RUNTIME_CLASS(CRoot);
}

The macros do not add an insertion operator << because CArchive stores CObject derived classes through a base class pointer, using the following operator declared in the global namespace:

CArchive& AFXAPI operator<<(CArchive& ar, const CObject* pOb);

Plain old data is handled in a rather straightforward way. Here is an example of reading and writing the float data type:

//
// Storing
//
CArchive& CArchive::operator<<(float f)
{ 
         if(!IsStoring())
                 AfxThrowArchiveException(CArchiveException::readOnly,m_strFileName);
         if (m_lpBufCur + sizeof(float) > m_lpBufMax) 
                 Flush();
         *(UNALIGNED float*)m_lpBufCur = f; // Write float into the byte array 
         m_lpBufCur += sizeof(float);       // Increment buffer pointer by the size of the float
         return *this;
}

The following is the loading code for the float data type:

//
// Loading
//
CArchive& CArchive::operator>>(float& f)
{ 
         if(!IsLoading())
                 AfxThrowArchiveException(CArchiveException::writeOnly,m_strFileName);
         if (m_lpBufCur + sizeof(float) > m_lpBufMax)
                 FillBuffer(UINT(sizeof(float) - (m_lpBufMax - m_lpBufCur)));
         f = *(UNALIGNED float*)m_lpBufCur; // Read the float from the byte array
         m_lpBufCur += sizeof(float);       // Increment buffer pointer by the size of the float
         return *this; 
}

Reading and writing CObject derived classes is a bit more complex and will be covered in the next sections.

Word of Caution

Because all data is stored in a contiguous byte buffer, it must be read in the exact same order as it was stored. Failure to do so will result in corrupted data or a CArchiveException thrown during load.
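The ordering rule can be illustrated without MFC. The following is a minimal, portable sketch, not MFC code: ByteArchive, RoundTripInOrder, and RoundTripOutOfOrder are hypothetical names invented for this illustration, and the class only imitates the "flat byte buffer plus cursor" idea behind CArchive.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for CArchive: a flat byte buffer with a read cursor.
class ByteArchive {
public:
    // Append the raw bytes of a trivially copyable value.
    template <typename T>
    ByteArchive& operator<<(const T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "plain old data only");
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
        m_bytes.insert(m_bytes.end(), p, p + sizeof(T));
        return *this;
    }

    // Copy sizeof(T) bytes from the cursor into value and advance the cursor.
    template <typename T>
    ByteArchive& operator>>(T& value) {
        static_assert(std::is_trivially_copyable<T>::value, "plain old data only");
        if (m_cursor + sizeof(T) > m_bytes.size())
            throw std::out_of_range("read past end of buffer");
        std::memcpy(&value, m_bytes.data() + m_cursor, sizeof(T));
        m_cursor += sizeof(T);
        return *this;
    }

private:
    std::vector<unsigned char> m_bytes;
    std::size_t m_cursor = 0;
};

// Stores an int followed by a double, then loads them in the same order.
bool RoundTripInOrder() {
    ByteArchive ar;
    ar << std::int32_t{42} << 3.5;
    std::int32_t i = 0;
    double d = 0.0;
    ar >> i >> d;
    return i == 42 && d == 3.5;
}

// Loads the same bytes in the wrong order: the double first, then the int.
// The byte count matches, so nothing throws, but both fields are scrambled.
bool RoundTripOutOfOrder() {
    ByteArchive ar;
    ar << std::int32_t{42} << 3.5;
    double d = 0.0;
    std::int32_t i = 0;
    ar >> d >> i;
    return i == 42 && d == 3.5;
}
```

In this model the out of order load fails silently: no exception is raised because the total number of bytes happens to match, yet neither value survives. That is the kind of subtle corruption the warning above is about.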

Why can’t I call ar.GetObjectSchema() multiple times?

Simply put, you cannot call GetObjectSchema more than once per object load, for the following reason:

//
// GetObjectSchema
//
UINT CArchive::GetObjectSchema()
{
	UINT nResult = m_nObjectSchema;
	m_nObjectSchema = (UINT)-1; // can only be called once per Serialize
	return nResult;
}

As to why this is so, my best guess is legacy issues. The member variable CArchive::m_nObjectSchema is very different from CRuntimeClass::m_wSchema: the CArchive object schema is read from the file, which can potentially contain many objects with many schemas, and it holds the schema of the object that is currently being read from the file. Think about what happens when you deserialize an object such as the one in the following example (hypothetically assuming m_nObjectSchema were left alone):

void CMyClass::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {
                 // omitted storing code …
         }
         else
         {                
                 // Loading
                 UINT nSchema = ar.GetObjectSchema();
                 switch(nSchema)
                 {
                 case 1:
                          ar >> m_pObject1; // Version schema 10. Serialize may call GetObjectSchema
                          ar >> m_pObject2; // Version schema 1.  Serialize may call GetObjectSchema
                          ar >> m_pObject3; // Version schema 2.  Serialize may call GetObjectSchema
                          ar >> m_pObject4; // Version schema 15. Serialize may call GetObjectSchema
                 }
         }
 
         // For whatever reason
         if(ar.IsLoading())
         {
                 UINT nSchema = ar.GetObjectSchema(); // schema of this class?
         }
}

If m_nObjectSchema were simply overwritten by each nested load, the object schema in the above example would have changed 4 times by the time you finished the loading section of the code. My guess is that, to eliminate subtle erroneous behavior, the MFC framework designers decided to cut it short at the very source rather than leave programmers scratching their heads as to why their precious data was hosed.

GetObjectSchema can only be called once per object load because the framework forcefully resets the schema to (UINT)-1 after each call to CArchive::GetObjectSchema.

Even so, the above example is foolproof in today’s MFC library. CArchive::ReadObject contains the following code:

//
// CObject* CArchive::ReadObject(const CRuntimeClass* pClassRefRequested)
//
 
//... omitted code
 
TRY
{
         // allocate a new object based on the class just acquired
         pOb = pClassRef->CreateObject();
//... omitted code
         // Serialize the object with the schema number set in the archive
         UINT nSchemaSave = m_nObjectSchema; // Save current schema
         m_nObjectSchema = nSchema; // put new schema into the CArchive::m_nObjectSchema
         pOb->Serialize(*this); // Call virtual Serialize
         m_nObjectSchema = nSchemaSave; // Pop the saved schema back
}

As you can see, it saves the current m_nObjectSchema into nSchemaSave, assigns the schema of the object being loaded to m_nObjectSchema, calls Serialize, and then pops the saved schema back into m_nObjectSchema. Thus the object schema never goes astray.
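To see both rules together (the once only read and the save/restore around nested objects), here is a small portable model. It is only a sketch: SchemaArchive, ReadObjectModel, and RunNestedLoad are hypothetical names, and the lambdas stand in for the virtual Serialize calls.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the schema bookkeeping inside CArchive.
struct SchemaArchive {
    unsigned m_nObjectSchema = static_cast<unsigned>(-1);

    // Same contract as the GetObjectSchema listing: hand the value out once, then reset it.
    unsigned GetObjectSchema() {
        unsigned result = m_nObjectSchema;
        m_nObjectSchema = static_cast<unsigned>(-1);
        return result;
    }
};

// Mirrors the save / assign / call / restore sequence shown for ReadObject.
template <typename SerializeFn>
void ReadObjectModel(SchemaArchive& ar, unsigned schemaFromFile, SerializeFn serialize) {
    unsigned saved = ar.m_nObjectSchema;   // save the parent's schema
    ar.m_nObjectSchema = schemaFromFile;   // schema of the object being loaded
    serialize(ar);                         // stands in for pOb->Serialize(*this)
    ar.m_nObjectSchema = saved;            // pop the parent's schema back
}

// Loads an outer object (schema 1) that contains a nested object (schema 10)
// and records what each GetObjectSchema call returned.
std::vector<unsigned> RunNestedLoad() {
    std::vector<unsigned> seen;
    SchemaArchive ar;
    ReadObjectModel(ar, 1, [&](SchemaArchive& outer) {
        ReadObjectModel(outer, 10, [&](SchemaArchive& inner) {
            seen.push_back(inner.GetObjectSchema());  // nested object's own schema
        });
        seen.push_back(outer.GetObjectSchema());      // parent's schema, restored after the child
        seen.push_back(outer.GetObjectSchema());      // second call in the same object
    });
    return seen;
}
```

In this model the nested object sees 10, the parent still sees 1 after the child has loaded, and the parent's second call yields (unsigned)-1.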

Serializing Base and Derived Classes

There are four ways to approach the serialization of derived and base classes in MFC.

But first, let’s look at a subtle problem. Back in the day of the 16 bit MFC implementation, disk space was a precious commodity, as was RAM. Thus, no matter how many derived classes you have in the class hierarchy, their object schema will always be equal to the most derived class schema and will be written only once!

//
// 
//
class CBase : public CObject
{
         DECLARE_SERIAL(CBase)
public:
         int m_i;
         float m_f;
         double m_d;
 
         virtual void Serialize(CArchive& ar);
};
 
class CDerived : public CBase
{
         DECLARE_SERIAL(CDerived)
public:
         long m_l;
         unsigned short m_us;
         long long m_ll;
 
         virtual void Serialize(CArchive& ar);
};
 
// Base class version
IMPLEMENT_SERIAL(CBase, CObject, VERSIONABLE_SCHEMA | 1) // Useless schema number. Never written to the file!
 
void CBase::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {        // storing code omitted
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
 
                 // oh no! nSchema = 2              
 
                 switch (nSchema)
                 {
                 case 1:
                          ar >> m_i;
                          ar >> m_f;
                          ar >> m_d;
                          break;
                 }
         }
}
 
// Derived class version
IMPLEMENT_SERIAL(CDerived, CBase, VERSIONABLE_SCHEMA | 2) // actual schema that is written to the file
 
void CDerived::Serialize(CArchive& ar)
{
         CBase::Serialize(ar);
 
         if (ar.IsStoring())
         {        // storing code omitted
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
 
                 // oh no! nSchema = (UINT)-1 because of 2nd call to GetObjectSchema
 
                 switch (nSchema)
                 {
                 case 1:
                 case 2:
                          ar >> m_l;
                          ar >> m_us;
                          ar >> m_ll;
                          break;
                 }
         }
}

Why is that? A quick look at the binary file dump reveals that for the CSerializableDerived class the schema is written only once and is always equal to the schema of the instantiated object. In this case it equals the CSerializableDerived class schema even if the base class schema is something else.

Tracing into CArchive::WriteObject reveals this code:

//
// void CArchive::WriteObject(const CObject* pOb)
//
 
// … omitted code
 
// write class of object first
CRuntimeClass* pClassRef = pOb->GetRuntimeClass(); // Contains m_wSchema of the CSerializableDerived which = 2
WriteClass(pClassRef);
 
// … omitted code

Tracing into CArchive::WriteClass, we see that the framework first writes the wNewClassTag WORD value, which is equal to 0xFFFF. Then it calls the CRuntimeClass::Store function:

//
// void CArchive::WriteClass(const CRuntimeClass* pClassRef)
//
 
// … omitted code
 
// store new class
*this << wNewClassTag; // New class tag = 0xFFFF
pClassRef->Store(*this);
 
// … omitted code

The CRuntimeClass::Store function obtains the length of the class name and writes the object schema, followed by the length of the class name and the class name itself. Herein lies the answer to the question of why the object schema is written only once, for the most derived class.

//
// 
//
void CRuntimeClass::Store(CArchive& ar) const
         // stores a runtime class description
{
         WORD nLen = (WORD)AtlStrLen(m_lpszClassName); // Get the length of the class name
         ar << (WORD)m_wSchema << nLen;                // Write schema followed by length of the class name into the file. Written only once!!!
         ar.Write(m_lpszClassName, nLen*sizeof(char)); // Write class name into the file
}
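Putting the WriteClass and Store listings together, the class header for a newly encountered class can be modeled as a short byte sequence. The sketch below is portable C++ rather than MFC code: PutWord, NewClassHeader, and kVersionableSchema are hypothetical names, little-endian byte order is assumed (as on x86 Windows), and the value 0x80000000 for VERSIONABLE_SCHEMA is my understanding of the MFC headers rather than something shown in the listings above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Assumed value of VERSIONABLE_SCHEMA; it lives above the low 16 bits.
const std::uint32_t kVersionableSchema = 0x80000000u;

// Appends a 16-bit value in little-endian byte order.
void PutWord(std::vector<std::uint8_t>& out, std::uint16_t w) {
    out.push_back(static_cast<std::uint8_t>(w & 0xFF));
    out.push_back(static_cast<std::uint8_t>(w >> 8));
}

// Models the bytes written for a new class: tag, schema, name length, name.
std::vector<std::uint8_t> NewClassHeader(const std::string& className, std::uint32_t schema) {
    std::vector<std::uint8_t> out;
    PutWord(out, 0xFFFF);                                        // wNewClassTag
    PutWord(out, static_cast<std::uint16_t>(schema));            // like (WORD)m_wSchema: high bits dropped
    PutWord(out, static_cast<std::uint16_t>(className.size()));  // length of the class name
    out.insert(out.end(), className.begin(), className.end());   // class name, no terminator
    return out;
}
```

For example, NewClassHeader("CDerived", kVersionableSchema | 2) yields FF FF, 02 00, 08 00, followed by the eight characters of the name. There is exactly one schema slot in that header, which is why only the most derived class schema reaches the file.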

After the CRuntimeClass information has been written to the file, the framework finally calls the virtual Serialize function of our object:

//
// void CArchive::WriteObject(const CObject* pOb)
//
 
// … omitted code
 
// cause the object to serialize itself
((CObject*)pOb)->Serialize(*this);
 
// … omitted code

The exact opposite happens during object load. First the extraction operator is called. This operator is provided by the IMPLEMENT_SERIAL macro.

//
// Global extraction operator call provided by the IMPLEMENT_SERIAL macro
//
CArchive& AFXAPI operator >> (CArchive& ar, CSerializableDerived* &pOb)
{
         pOb = (CSerializableDerived*)ar.ReadObject(RUNTIME_CLASS(CSerializableDerived));
         return ar;
}

Tracing into CArchive::ReadObject reveals the following code:

//
// CObject* CArchive::ReadObject(const CRuntimeClass* pClassRefRequested)
//
 
// ... omitted code
 
 
// attempt to load next stream as CRuntimeClass
UINT nSchema;
DWORD obTag;
CRuntimeClass* pClassRef = ReadClass(pClassRefRequested, &nSchema, &obTag);
 
// ... omitted code

The CArchive::ReadClass function first reads the object tag:

//
// CRuntimeClass* CArchive::ReadClass(const CRuntimeClass* pClassRefRequested,
//       UINT* pSchema, DWORD* pObTag)
//
 
// ... omitted code
 
// read object tag - if prefixed by wBigObjectTag then DWORD tag follows
DWORD obTag;
WORD wTag;
*this >> wTag; // Read the object tag
if (wTag == wBigObjectTag)
         *this >> obTag;
else
         obTag = ((wTag & wClassTag) << 16) | (wTag & ~wClassTag);
 
// ... omitted code
 
CRuntimeClass* pClassRef;
UINT nSchema;
if (wTag == wNewClassTag)
{
         // defined as follows
         // #define wNewClassTag    ((WORD)0xFFFF)      // special tag indicating new CRuntimeClass
 
         // new object follows a new class id
         if ((pClassRef = CRuntimeClass::Load(*this, &nSchema)) == NULL) // Read CRuntimeClass information from the file
                 AfxThrowArchiveException(CArchiveException::badClass, m_strFileName);
 
         // ... omitted code
}
// ... omitted code

The following is the listing of the CRuntimeClass::Load function. Please note that the class name must be shorter than 64 characters. If the length of the class name is greater than or equal to 64 characters, or CArchive::Read fails to read the class name from the file, the function returns NULL. If the class name is successfully read from the file, szClassName is NULL terminated at index nLen and looked up with CRuntimeClass::FromName:

//
// 
//
CRuntimeClass* PASCAL CRuntimeClass::Load(CArchive& ar, UINT* pwSchemaNum)
         // loads a runtime class description
{
         if(pwSchemaNum == NULL)
         {
                 return NULL;
         }
         WORD nLen;
         char szClassName[64];
 
         WORD wTemp;
         ar >> wTemp; *pwSchemaNum = wTemp; // Read the schema
         ar >> nLen; // Read the length of the class name
 
         // load the class name
         if (nLen >= _countof(szClassName) ||
                 ar.Read(szClassName, nLen*sizeof(char)) != nLen*sizeof(char))
         {
                 return NULL;
         }
         szClassName[nLen] = '\0';
 
         // match the string against an actual CRuntimeClass
         CRuntimeClass* pClass = FromName(szClassName);
         if (pClass == NULL)
         {
                 // not found, trace a warning for diagnostic purposes
                 TRACE(traceAppMsg, 0, "Warning: Cannot load %hs from archive.  Class not defined.\n",
                          szClassName);
         }
 
         return pClass;
}
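The parsing side of the class header can be modeled the same way. This is a hedged, portable sketch rather than the MFC implementation: ClassHeader and LoadClassHeader are hypothetical names, little-endian byte order is assumed, and a short buffer is reported as a plain failure here, whereas the real archive would raise an exception when it runs out of data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical result type: what Load recovers before the name lookup.
struct ClassHeader {
    std::uint16_t schema = 0;
    std::string name;
};

// Reads schema, name length, and name from bytes starting at pos
// (pos points just past the 0xFFFF new-class tag). Returns false on failure.
bool LoadClassHeader(const std::vector<std::uint8_t>& bytes, std::size_t pos, ClassHeader& out) {
    auto readWord = [&](std::uint16_t& w) -> bool {
        if (pos + 2 > bytes.size()) return false;
        w = static_cast<std::uint16_t>(bytes[pos] | (bytes[pos + 1] << 8));
        pos += 2;
        return true;
    };

    std::uint16_t schema = 0;
    std::uint16_t len = 0;
    if (!readWord(schema) || !readWord(len)) return false;

    // Same limit as the char szClassName[64] buffer: room for 63 characters plus the terminator.
    if (len >= 64 || pos + len > bytes.size()) return false;

    out.schema = schema;
    out.name.assign(reinterpret_cast<const char*>(bytes.data()) + pos, len);
    return true;
}
```

A header carrying schema 2 and the name "CDerived" parses successfully, while a name of exactly 64 characters is rejected, matching the nLen >= _countof(szClassName) check in the listing.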

CRuntimeClass::FromName simply iterates through AFX_MODULE_STATE::m_classList and compares by name. If the class is found, its CRuntimeClass pointer is returned. AFX_MODULE_STATE CRuntimeClass discovery is a whole other topic that deserves its own article. Suffice it to say that this feature was implemented prior to RTTI (Run Time Type Information) compiler support, and it allows runtime type discovery of MFC classes with the RTTI compiler switch turned off. As a matter of fact, the default setting for the Visual C++ 6.0 RTTI switch was off.

//
// 
//
CRuntimeClass* PASCAL CRuntimeClass::FromName(LPCSTR lpszClassName)
{
         CRuntimeClass* pClass=NULL;
 
         ENSURE(lpszClassName);
 
         // search app specific classes
         AFX_MODULE_STATE* pModuleState = AfxGetModuleState();
         AfxLockGlobals(CRIT_RUNTIMECLASSLIST);
         for (pClass = pModuleState->m_classList; pClass != NULL;
                 pClass = pClass->m_pNextClass)
         {
                 if (lstrcmpA(lpszClassName, pClass->m_lpszClassName) == 0)
                 {
                          AfxUnlockGlobals(CRIT_RUNTIMECLASSLIST);
                          return pClass;
                 }
         }
         AfxUnlockGlobals(CRIT_RUNTIMECLASSLIST);
#ifdef _AFXDLL
         // search classes in shared DLLs
         AfxLockGlobals(CRIT_DYNLINKLIST);
         for (CDynLinkLibrary* pDLL = pModuleState->m_libraryList; pDLL != NULL;
                 pDLL = pDLL->m_pNextDLL)
         {
                 for (pClass = pDLL->m_classList; pClass != NULL;
                          pClass = pClass->m_pNextClass)
                 {
                          if (lstrcmpA(lpszClassName, pClass->m_lpszClassName) == 0)
                          {
                                   AfxUnlockGlobals(CRIT_DYNLINKLIST);
                                   return pClass;
                          }
                 }
         }
         AfxUnlockGlobals(CRIT_DYNLINKLIST);
#endif
 
         return NULL; // not found
}

Back in CArchive::ReadClass, the function returns the CRuntimeClass pointer and passes the schema and the object tag back through the pSchema and pObTag pointers:

//
// 
//CRuntimeClass* CArchive::ReadClass(const CRuntimeClass* pClassRefRequested,
//       UINT* pSchema, DWORD* pObTag)
 
//... omitted code
 
 
// store nSchema for later examination
if (pSchema != NULL)
         *pSchema = nSchema;
else
         m_nObjectSchema = nSchema; 
 
// store obTag for later examination
if (pObTag != NULL)
         *pObTag = obTag;
 
// return the resulting CRuntimeClass*
return pClassRef;

After the CRuntimeClass pointer has been successfully obtained, the framework calls CreateObject, which is provided by the DECLARE_SERIAL and IMPLEMENT_SERIAL macros. It then:

Stores the current CArchive::m_nObjectSchema into nSchemaSave
Assigns the schema read from the file to CArchive::m_nObjectSchema
Calls the virtual Serialize function
Pops nSchemaSave back into CArchive::m_nObjectSchema

//
// CObject* CArchive::ReadObject(const CRuntimeClass* pClassRefRequested)
//
 
//... omitted code
 
TRY
{
         // allocate a new object based on the class just acquired
         pOb = pClassRef->CreateObject();
         
//... omitted code
         // Serialize the object with the schema number set in the archive
         UINT nSchemaSave = m_nObjectSchema; // Save current schema
         m_nObjectSchema = nSchema; // put new schema into the CArchive::m_nObjectSchema
         pOb->Serialize(*this); // Call virtual Serialize
         m_nObjectSchema = nSchemaSave; // Pop the saved schema back
         ASSERT_VALID(pOb);
}

So now you know why your class will only have one schema regardless of how many classes you have in your class hierarchy.

How do we address this issue? There are four ways to go about it, some more elegant than others. Let us look at all of them. Of course, this applies only to cases where you must maintain versions throughout all of your classes. The easiest way is not to version anything; however, in real life, if your application’s life expectancy is measured in decades, it is absolutely imperative to maintain versioning right from the start.

1st Solution: Do all serialization in the derived class

This is a less elegant solution, but it works and eliminates all surprises. For our example above, the code will look like this:

 
// Derived class version
IMPLEMENT_SERIAL(CDerived, CBase, VERSIONABLE_SCHEMA | 2)
 
void CDerived::Serialize(CArchive& ar)
{
         // Do not call base class
         // CBase::Serialize(ar);
 
         if (ar.IsStoring())
         {        // storing code
                 // serialize base members
                 ar << m_i;
                 ar << m_f;
                 ar << m_d;
 
                 // serialize this class members
                 ar << m_l;
                 ar << m_us;
                 ar << m_ll;
 
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
 
                 
                 switch (nSchema)
                 {
                 case 1:
                 case 2:
                          // deserialize base members
                          ar >> m_i;
                          ar >> m_f;
                          ar >> m_d;
 
                          // deserialize this class members
                          ar >> m_l;
                          ar >> m_us;
                          ar >> m_ll;
                          break;
                 }
         }
}

This solution is not very pretty, and if your base class has many members your Serialize function can potentially become enormous.

2nd Solution: Pop the schema back into the CArchive

This solution is a bit more elegant; however, you would still need to update the schema handling in all of the base classes when the schema changes.

// Base class version
IMPLEMENT_SERIAL(CBase, CObject, VERSIONABLE_SCHEMA | 1) // Useless schema number
 
void CBase::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {        // storing code omitted
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
 
                 // oh no, nSchema = 2              
 
                 switch (nSchema)
                 {
                 case 1:
                 case 2: // THIS IS REQUIRED!!!
                          ar >> m_i;
                          ar >> m_f;
                          ar >> m_d;
                          break;
                 }
 
                 // Pop the schema back into the CArchive for derived class to use
                 ar.SetObjectSchema(nSchema);
         }
}
 
// Derived class version
IMPLEMENT_SERIAL(CDerived, CBase, VERSIONABLE_SCHEMA | 2)
 
void CDerived::Serialize(CArchive& ar)
{
         // Call base class
         CBase::Serialize(ar);
 
         if (ar.IsStoring())
         {        // storing code omitted
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
 
                 switch (nSchema)
                 {
                 case 1:
                 case 2:
                          ar >> m_l;
                          ar >> m_us;
                          ar >> m_ll;
                          break;
                 }
         }
}

3rd Solution: Consider Overhaul of Serialize function with Don’t Call Us, We Will Call You design pattern

Adding a virtual function SerializeImpl(CArchive& ar, UINT nSchema) eliminates the need to call CArchive::GetObjectSchema more than once.

//
// 
//
class CBase : public CObject
{
	DECLARE_SERIAL(CBase)
public:
	int m_i;
	float m_f;
	double m_d;

	virtual void Serialize(CArchive& ar);

protected: // protected rather than private so that CDerived can call CBase::SerializeImpl
	virtual void SerializeImpl(CArchive& ar, UINT nSchema);
};

class CDerived : public CBase
{
	DECLARE_SERIAL(CDerived)
public:
	long m_l;
	unsigned short m_us;
	long long m_ll;

private:
	virtual void SerializeImpl(CArchive& ar, UINT nSchema);
};

// Base class version
IMPLEMENT_SERIAL(CBase, CObject, VERSIONABLE_SCHEMA | 1) // Useless schema number

void CBase::Serialize(CArchive& ar)
{
	if (ar.IsStoring())
	{	// storing code omitted
		// CDerived::SerializeImpl version will be called
		SerializeImpl(ar, (UINT)-1);
	}
	else
	{	// loading code
		UINT nSchema = ar.GetObjectSchema();
		switch (nSchema)
		{
		case 1:
		case 2: // THIS IS STILL REQUIRED!!!
			ar >> m_i;
			ar >> m_f;
			ar >> m_d;
			break;
		}

 		// CDerived::SerializeImpl version will be called
		SerializeImpl(ar, nSchema);
	}
}

void CBase::SerializeImpl(CArchive& ar, UINT nSchema)
{
	// Not implemented	
}




// Derived class version
IMPLEMENT_SERIAL(CDerived, CBase, VERSIONABLE_SCHEMA | 2)

// Eliminates calling ar.GetObjectSchema() altogether
void CDerived::SerializeImpl(CArchive& ar, UINT nSchema)
{
	// call the base if you have more than 2 levels of parent classes
	// so that the parent class serialization routine is utilized
	CBase::SerializeImpl(ar, nSchema);

	if (ar.IsStoring())
	{	// storing code omitted
	}
	else
	{	// loading code
		switch (nSchema)
		{
		case 1:
		case 2:
			ar >> m_l;
			ar >> m_us;
			ar >> m_ll;
			break;
		}
	}
}

This is somewhat more elegant, but it still requires us to update the version handling in all of the base classes when the schema changes.
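Stripped of the MFC types, the "Don’t Call Us, We Will Call You" idea is the classic template method arrangement: one non-virtual entry point fetches the schema exactly once and passes it down to a virtual hook. The following portable sketch illustrates only that control flow; OneShotSchema, Base, Derived, and LoadDerived are hypothetical names, not the article's classes.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the once-only behaviour of GetObjectSchema.
struct OneShotSchema {
    unsigned value;
    unsigned Get() {
        unsigned r = value;
        value = static_cast<unsigned>(-1);  // a second call would see (unsigned)-1
        return r;
    }
};

class Base {
public:
    virtual ~Base() = default;

    // Entry point: reads the schema once, handles the base part,
    // then calls the hook so derived classes never query the archive.
    std::string Load(OneShotSchema& ar) {
        unsigned schema = ar.Get();
        std::string log = "Base(" + std::to_string(schema) + ")";
        log += LoadImpl(schema);
        return log;
    }

protected:
    // Hook overridden by derived classes; the base version does nothing.
    virtual std::string LoadImpl(unsigned /*schema*/) { return ""; }
};

class Derived : public Base {
protected:
    std::string LoadImpl(unsigned schema) override {
        return " Derived(" + std::to_string(schema) + ")";
    }
};

// Loads a Derived through the Base entry point and returns the call trace.
std::string LoadDerived(unsigned schemaFromFile) {
    OneShotSchema ar{schemaFromFile};
    Derived d;
    return d.Load(ar);
}
```

Both levels of the hierarchy receive the same schema value even though the archive is queried only once; in this model, LoadDerived(2) produces the trace "Base(2) Derived(2)".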

And here comes the most elegant solution.

4th Solution: Store your base class schema as the 1st member of your class

This solution addresses the shortcomings of the MFC serialization mechanism. You have access to your base class schema via the static member variable classCBase (classCBase.m_wSchema in our example).

// Base class version
IMPLEMENT_SERIAL(CBase, CObject, VERSIONABLE_SCHEMA | 1) // Not so useless schema number

void CBase::Serialize(CArchive& ar)
{
	if (ar.IsStoring())
	{	// storing code

		// store the classCBase.m_wSchema; Added by the DECLARE_SERIAL macro and populated by the IMPLEMENT_SERIAL macro
		// as the very 1st member
		WORD wSchema = (WORD)classCBase.m_wSchema; // Strips VERSIONABLE_SCHEMA and Equals 1 as declared above
		ar << wSchema;
		ar << m_i;
		ar << m_f;
		ar << m_d;
	}
	else
	{	// loading code
		
		// Do not call CArchive::GetObjectSchema!
		//UINT nSchema = ar.GetObjectSchema();
		// Read base object schema
		WORD wSchema = 0;
		ar >> wSchema;
		switch (wSchema) // Equals 1
		{
		case 1:
			ar >> m_i;
			ar >> m_f;
			ar >> m_d;
			break;
		}
	}
}


// Derived class version
IMPLEMENT_SERIAL(CDerived, CBase, VERSIONABLE_SCHEMA | 2)

void CDerived::Serialize(CArchive& ar)
{
	CBase::Serialize(ar);

	if (ar.IsStoring())
	{	// storing code omitted
	}
	else
	{	// loading code
		UINT nSchema = ar.GetObjectSchema(); // equals 2
		switch (nSchema)
		{
		case 1:
		case 2:
			ar >> m_l;
			ar >> m_us;
			ar >> m_ll;
			break;
		}
	}
}

This is the most elegant solution because it frees you from the maintenance of the base classes, at the cost of adding sizeof(WORD) bytes to your file for every parent class.
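The essence of this approach is that every level of the hierarchy writes its own version word in front of its own members, so each level can evolve independently. Here is a minimal portable model of that layout. It is only a sketch: WordArchive, Base, Derived, and RoundTrip are hypothetical names, all fields are reduced to 16-bit words for brevity, and the word that Derived writes first stands in for the object schema that the MFC framework itself stores once per object.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an archive: a sequence of 16-bit words.
struct WordArchive {
    std::vector<std::uint16_t> words;
    std::size_t cursor = 0;
    void Put(std::uint16_t w) { words.push_back(w); }
    std::uint16_t Get() { return words.at(cursor++); }
};

struct Base {
    static constexpr std::uint16_t kSchema = 1;  // plays the role of classCBase.m_wSchema
    std::uint16_t m_i = 0;
    std::uint16_t baseSchemaSeen = 0;

    // The base level stores its own schema as its very first member.
    void Store(WordArchive& ar) const { ar.Put(kSchema); ar.Put(m_i); }

    void Load(WordArchive& ar) {
        baseSchemaSeen = ar.Get();               // the base reads back its own schema
        if (baseSchemaSeen == 1) m_i = ar.Get();
    }
};

struct Derived : Base {
    static constexpr std::uint16_t kSchema = 2;
    std::uint16_t m_l = 0;
    std::uint16_t derivedSchemaSeen = 0;

    void Store(WordArchive& ar) const {
        ar.Put(kSchema);     // stands in for the schema the framework writes once
        Base::Store(ar);     // base schema + base members
        ar.Put(m_l);         // derived members
    }

    void Load(WordArchive& ar) {
        derivedSchemaSeen = ar.Get();
        Base::Load(ar);
        if (derivedSchemaSeen == 1 || derivedSchemaSeen == 2) m_l = ar.Get();
    }
};

// Stores a Derived and loads it back into a fresh object.
Derived RoundTrip(std::uint16_t i, std::uint16_t l) {
    Derived out;
    out.m_i = i;
    out.m_l = l;
    WordArchive ar;
    out.Store(ar);
    Derived in;
    in.Load(ar);
    return in;
}
```

After a round trip the base level has seen schema 1 and the derived level schema 2, so bumping the derived schema no longer forces any change in the base class switch statement.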

Serializing Pure Base Class

Suppose you have a CObject derived class with pure virtual functions.

//
// CObject derived class with pure virtual functions
//
class CPureBase : public CObject
{
         DECLARE_SERIAL(CPureBase)
public:
         CPureBase();
         virtual ~CPureBase();
         virtual void Serialize(CArchive& ar);
 
         virtual CString CanSerialize() const = 0;
         virtual CString GetObjectSchema() const = 0;
         virtual CString GetObjectRunTimeName() const = 0;
};

Under normal circumstances this will not work because the IMPLEMENT_SERIAL macro adds the following function to your code:

//
// Function added by IMPLEMENT_SERIAL macro
//
CObject* PASCAL CPureBase::CreateObject()
{ 
         return new CPureBase; // Compiler error! Cannot instantiate class due to pure virtual functions
}

To work around this issue we would need to create our own version of the IMPLEMENT_SERIAL macro that will return nullptr from the CreateObject function.

//
// Helper macro for Pure base serializable classes
// Removes instancing of the new class in CreateObject()
#define IMPLEMENT_SERIAL_PURE_BASE(class_name, base_class_name, wSchema)\
         CObject* PASCAL class_name::CreateObject() \
                 { return nullptr; } \
         extern AFX_CLASSINIT _init_##class_name; \
         _IMPLEMENT_RUNTIMECLASS(class_name, base_class_name, wSchema, \
                 class_name::CreateObject, &_init_##class_name) \
         AFX_CLASSINIT _init_##class_name(RUNTIME_CLASS(class_name)); \
         CArchive& AFXAPI operator>>(CArchive& ar, class_name* &pOb) \
                 { pOb = (class_name*) ar.ReadObject(RUNTIME_CLASS(class_name)); \
                          return ar; }

Now you can declare your pure base class serializable.

//
// This code compiles
//
IMPLEMENT_SERIAL_PURE_BASE(CPureBase, CObject, VERSIONABLE_SCHEMA | 1)
 
// CPureBase
CPureBase::CPureBase()
{
}
 
CPureBase::~CPureBase()
{
}
 
// CPureBase member functions
void CPureBase::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {        // storing code
         }
         else
         {        // loading code
         }
}

Serializing with Document/View

This type of serialization is the most covered in the MFC literature. If you have an application with the Document/View architecture, serialization is already part of the CDocument derived class, and its Serialize override provides the necessary code. The typical structure of the code looks like this:

 
void CSerializeDemoDoc::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {
                 // Storing
                 ar << m_pRoot;
         }
         else
         {
                 // Loading
                 ar >> m_pRoot;
         }
}

Serializing without Document/View

To serialize without Document/View, say in a console application, add the following code to write to a file:

//
// Writing to the file
//
CFile file; // Create CFile object
 
// Open CFile object
if (!file.Open(_T("Test.my_ext"), CFile::modeCreate | CFile::modeReadWrite | CFile::shareExclusive))
         return false;
 
// Create the CArchive object: pass a pointer to the CFile and the CArchive::store flag
CArchive ar(&file, CArchive::store | CArchive::bNoFlushOnDelete);
 
// write your value
ar << val;
 
// Close CArchive object
ar.Close();
 
// Close CFile object
file.Close();

To deserialize (read) without Document/View, use the following code:

//
// Reading from a file
//
CFile file;
 
if (!file.Open(_T("Test.my_ext"), CFile::modeRead | CFile::shareExclusive))
         return false;
 
CArchive ar(&file, CArchive::load);
 
ar >> val;
 
ar.Close();
file.Close();

In just a few lines of code you have harnessed the power of the CArchive object.

Serializing plain old data types

CArchive provides the following insertion and extraction operators to store and retrieve plain old data.

//
// CArchive operators
//
// insertion operations
         CArchive& operator<<(BYTE by);
         CArchive& operator<<(WORD w);
         CArchive& operator<<(LONG l);
         CArchive& operator<<(DWORD dw);
         CArchive& operator<<(float f);
         CArchive& operator<<(double d);
         CArchive& operator<<(LONGLONG dwdw);
         CArchive& operator<<(ULONGLONG dwdw);
 
         CArchive& operator<<(int i);
         CArchive& operator<<(short w);
         CArchive& operator<<(char ch);
#ifdef _NATIVE_WCHAR_T_DEFINED
         CArchive& operator<<(wchar_t ch);
#endif
         CArchive& operator<<(unsigned u);
 
         template < typename BaseType , bool t_bMFCDLL>
         CArchive& operator<<(const ATL::CSimpleStringT<BaseType, t_bMFCDLL>& str);
 
         template< typename BaseType, class StringTraits >
         CArchive& operator<<(const ATL::CStringT<BaseType, StringTraits>& str);
 
         template < typename BaseType , bool t_bMFCDLL>
         CArchive& operator>>(ATL::CSimpleStringT<BaseType, t_bMFCDLL>& str);
 
         template< typename BaseType, class StringTraits >
         CArchive& operator>>(ATL::CStringT<BaseType, StringTraits>& str);
 
         CArchive& operator<<(bool b);
 
         // extraction operations
         CArchive& operator>>(BYTE& by);
         CArchive& operator>>(WORD& w);
         CArchive& operator>>(DWORD& dw);
         CArchive& operator>>(LONG& l);
         CArchive& operator>>(float& f);
         CArchive& operator>>(double& d);
         CArchive& operator>>(LONGLONG& dwdw);
         CArchive& operator>>(ULONGLONG& dwdw);
 
         CArchive& operator>>(int& i);
         CArchive& operator>>(short& w);
         CArchive& operator>>(char& ch);
#ifdef _NATIVE_WCHAR_T_DEFINED
         CArchive& operator>>(wchar_t& ch);
#endif
         CArchive& operator>>(unsigned& u);
         CArchive& operator>>(bool& b);
...

If you need to serialize data types for which CArchive declares no operators, you need to write your own implementation. We will look at this a bit later, when I cover serializing Windows SDK structures.

Serializing CArray template collection

MFC provides serialization support for nearly all of its collections; to serialize an MFC collection, all you need to do is call the collection’s own version of Serialize(CArchive& ar). CArray is different because it is a template: the element type is not known ahead of time, and it may or may not be derived from CObject. The default implementation of the CArray::Serialize function is listed below. All it does is write the size of the CArray when storing, or read the size from the archive and resize the CArray when loading. It then forwards the call to the SerializeElements<TYPE>() function.

//
// Serialize function forwards call to the SerializeElements<TYPE>()
//
template<class TYPE, class ARG_TYPE>
void CArray<TYPE, ARG_TYPE>::Serialize(CArchive& ar)
{
         ASSERT_VALID(this);
 
         CObject::Serialize(ar);
         if (ar.IsStoring())
         {
                 // Just writes the collection size
                 ar.WriteCount(m_nSize);
         }
         else
         {
                 // Just reads the collection size and resizes the CArray
                 DWORD_PTR nOldSize = ar.ReadCount();
                 SetSize(nOldSize, -1);
         }
         SerializeElements<TYPE>(ar, m_pData, m_nSize);
}

The user must provide an appropriate implementation of SerializeElements<TYPE>() for the type being stored in or retrieved from the archive. The following listing shows a SerializeElements<TYPE> implementation for the CAge class. Please refer to the SerializeDemo project for the implementation details.

//
// CAge class and its SerializeElements specialization
//
class CAge : public CObject
{
         DECLARE_SERIAL(CAge)
public:
         CAge();
         CAge(int nAge);
         virtual ~CAge();
         virtual void Serialize(CArchive& ar);
 
         UINT m_nAge;
};
 
 
// CArray serialization helper specialized for the CAge class
template<> inline void AFXAPI SerializeElements(CArchive& ar, CAge** pAge, INT_PTR nCount)
{
         for (INT_PTR i = 0; i < nCount; i++, pAge++) 
         {
                 if (ar.IsStoring())
                 {
                          ar << *pAge; // Calls CArchive::WriteObject
                 }
                 else
                 {
                          CAge* p = nullptr;
                          ar >> p; // Calls CArchive::ReadObject
                          *pAge = p;
                 }
         }
}

Serializing to and from the process memory

Serialization to and from memory is supported via CMemFile. CMemFile does not require a file name.

//
// Write to the memory
//
CMemFile file;
CArchive ar(&file, CArchive::store);
 
ar << val;
 
ar.Close();

Serialization from memory is done in the following manner:

//
// Read from the memory
//
CMemFile file;
 
// CByteArray m_aBytes is declared and populated elsewhere
file.Attach(m_aBytes.GetData(), m_aBytes.GetSize());
CArchive ar(&file, CArchive::load);
 
ar >> val;
 
ar.Close();

Serializing to and from the shared process memory

Serialization to and from shared memory is supported via CSharedFile. This is very useful if you want to transfer your serialized object to the clipboard, either to paste it into another instance of your application or to pass it to another application.

//
// Write to the shared memory to do a clipboard copy operation
//
UINT m_nClipboardFormat = RegisterClipboardFormat(_T("MY_APP_DATA"));
 
CSharedFile file(GMEM_MOVEABLE | GMEM_SHARE | GMEM_ZEROINIT);
CArchive ar(&file, CArchive::store | CArchive::bNoFlushOnDelete);
 
// CView derived class
GetDocument()->Serialize(ar);
 
// Flush the archive into the shared file before detaching its memory handle
ar.Close();
 
// The clipboard must be open before it can be emptied and filled
if (!OpenClipboard())
         return;
 
EmptyClipboard();
SetClipboardData(m_nClipboardFormat, file.Detach()); // the clipboard now owns the memory
CloseClipboard();

Deserialization from shared memory for a clipboard paste operation:

//
// Read from the shared memory clipboard paste
//
 
UINT m_nClipboardFormat = RegisterClipboardFormat(_T("MY_APP_DATA"));
 
if (!OpenClipboard())
         return;
 
CSharedFile file(GMEM_MOVEABLE | GMEM_SHARE | GMEM_ZEROINIT);
         
HGLOBAL hMem = GetClipboardData(m_nClipboardFormat);
 
if (hMem == nullptr)
{
         CloseClipboard();
         return;
}
 
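file.SetHandle(hMem);
 
CArchive ar(&file, CArchive::load);
 
// CView derived class
GetDocument()->DeleteContents();
GetDocument()->Serialize(ar);
 
ar.Close();
 
// The clipboard still owns hMem, so detach it instead of closing the file
// (closing the file would free the clipboard's memory)
file.Detach();
 
CloseClipboard();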

Serializing to and from the sockets

Serialization to and from sockets is done via the CSocketFile class. You can serialize a CArchive into a CSocket only if the CSocket is of the type SOCK_STREAM. This topic is a bit more complex than the MSDN documentation suggests. The official documentation states that you can both write to and read from a CSocket with CSocketFile. This is true for the write operation, but not necessarily for the read operation. If the transmitted data is only a few bytes long, then yes, you can use CSocketFile to receive it. However, if your data size is in megabytes (or any size greater than the reading buffer), you will likely receive the data in several reads. In that case you have to accumulate all of it in a CByteArray first, and only after all the data has been received can you attach it to a CMemFile (rather than a CSocketFile) and deserialize. Trying to read partial data from the CSocketFile usually results in a CArchiveException.

//
// Write to the socket. In this case using CSocketFile is fine
//
CSocket sock;
 
if (!sock.Create()) // Defaults to the SOCK_STREAM
         return;
 
// Assuming there is a server running on the local host port 1011
if (!sock.Connect(_T("127.0.0.1"), 1011))
         return;
 
CSocketFile file(&sock);
CArchive ar(&file, CArchive::store | CArchive::bNoFlushOnDelete);
 
// Write value to the socket
ar << m_pRoot;
 
ar.Close();
file.Close();
sock.Close();

Serialization from the socket is a bit more complicated. I am giving the full listing of the class to demonstrate how to properly read a large binary data set from the socket. For the full source code listing please refer to the example project SerializeTcpServer.

//
// Read from the socket. In this case using CSocketFile will not work 
// if the transmitted data is greater than the receiving buffer size
//
 
// CSocket derived class declaration
class CSockThread;
 
// CRecvSocket command target
class CRecvSocket : public CSocket
{
public:
         CRecvSocket();
         virtual ~CRecvSocket();
         virtual void OnReceive(int nErrorCode);
 
         CSockThread* m_pThread;   // Parent thread to ensure our server handles multiple connections simultaneously
         CByteArray m_aBytes;      // Array of received bytes
 
private:
         DWORD m_dwReads; // Number of reads
 
         void Display(CRoot* pRoot);
};
 
 
// Implementation
#define INCOMING_BUFFER_SIZE 65536
 
// CRecvSocket
CRecvSocket::CRecvSocket(): m_pThread(nullptr)
, m_dwReads(0)
{
}
 
CRecvSocket::~CRecvSocket()
{
}
 
// CRecvSocket member functions
void CRecvSocket::OnReceive(int nErrorCode)
{
         // yield 10 msec
         Sleep(10);
 
         // Our reading buffer
         BYTE btBuffer[INCOMING_BUFFER_SIZE] = { 0 };
 
         // Read from the socket size of our buffer size or less
         int nRead = Receive(btBuffer, INCOMING_BUFFER_SIZE);
 
         switch (nRead)
         {
         case 0:
                 // No data - Quit
                 m_pThread->PostThreadMessage(WM_QUIT, 0, 0);
                 break;
         case SOCKET_ERROR:
                 if (GetLastError() != WSAEWOULDBLOCK)
                 {
                          // Socket error - Quit
                          m_pThread->PostThreadMessage(WM_QUIT, 0, 0);
                 }
                 break;
         default:
                 // Increment read counter
                 m_dwReads++;
                 
                 // Read into the byte array
                 CByteArray aBytes;
 
                 // Resize our byte array to the size of the received data
                 aBytes.SetSize(nRead);
 
                 // Copy received data into the CByteArray
                 CopyMemory(aBytes.GetData(), btBuffer, nRead);
 
                 // Append received data to m_aBytes member
                 m_aBytes.Append(aBytes);
 
                 DWORD dwReceived = 0;
                 
                 // Look ahead for more incoming data
                 if (IOCtl(FIONREAD, &dwReceived))
                 {
                          // No more incoming data
                          if (dwReceived == 0)
                          {
                                   // We have received all of the incoming data
                          
                                   // Instead of CSocketFile use CMemFile
                                   CMemFile file;
                                   file.Attach(m_aBytes.GetData(), m_aBytes.GetSize());
                                   CArchive ar(&file, CArchive::load);
                                   CRoot* pRoot = nullptr;
 
                                   TRY
                                   {
                                            ar >> pRoot;
                                   }
                                   CATCH(CArchiveException, e)
                                   {
                                            std::cout << "Error reading data " << std::endl;
                                   }
                                   END_CATCH
                 
                                   if (pRoot)
                                   {
                                            // Use our deserialized CRoot class
                                            Display(pRoot);
                                            delete pRoot;
                                   }
                                   ar.Close();
                                   file.Close();
 
                                   // finally quit
                                   m_pThread->PostThreadMessage(WM_QUIT, 0, 0);
                          }
                 }                
         }
         CSocket::OnReceive(nErrorCode);
}

In today’s applications you will rarely receive a whole transmission of binary or text data in a single OnReceive call. You therefore need to accumulate all of the data in an array of bytes, and only then can you deserialize it by attaching the accumulated CByteArray to a CMemFile. The above example calls IOCtl(FIONREAD, &dwReceived) to determine whether more data is waiting to be read. Keep in mind that a result of zero only means that no further bytes are queued at that moment; on a slow connection the rest of the transmission may still be on its way. The rule of thumb is this: because our reading buffer is 65536 bytes, any transmission larger than the reading buffer results in more than one read.
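
One way to make the "has everything arrived?" test explicit is to let the sender put the payload length in front of the payload. The following portable sketch shows that idea in isolation. It is a different technique from the FIONREAD check used in the listing above, it is not part of the SerializeTcpServer project, and FrameAccumulator, Feed and the 4 byte little-endian prefix are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed wire format (not from the article): a 4 byte little-endian payload
// length, followed by the payload itself.
constexpr std::size_t kHeaderSize = 4;

class FrameAccumulator
{
public:
    // Append one received chunk. Returns true and fills 'payload' once a whole
    // message has arrived; returns false while more reads are needed.
    bool Feed(const std::uint8_t* data, std::size_t size, std::vector<std::uint8_t>& payload)
    {
        m_bytes.insert(m_bytes.end(), data, data + size);

        if (m_bytes.size() < kHeaderSize)
            return false; // the length prefix itself is still incomplete

        const std::size_t length =
            static_cast<std::size_t>(m_bytes[0]) |
            (static_cast<std::size_t>(m_bytes[1]) << 8) |
            (static_cast<std::size_t>(m_bytes[2]) << 16) |
            (static_cast<std::size_t>(m_bytes[3]) << 24);

        if (m_bytes.size() - kHeaderSize < length)
            return false; // the payload is still incomplete

        const auto first = m_bytes.begin() + static_cast<std::ptrdiff_t>(kHeaderSize);
        const auto last = first + static_cast<std::ptrdiff_t>(length);
        payload.assign(first, last);

        // Keep any bytes that already belong to the next message.
        m_bytes.erase(m_bytes.begin(), last);
        return true;
    }

private:
    std::vector<std::uint8_t> m_bytes; // everything received so far
};
```

In an OnReceive handler, each chunk returned by Receive would be passed to Feed, and only a completed payload would be attached to a CMemFile and deserialized. The sender would then have to write the same prefix, which the article's writing code does not do.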

The CSockThread* m_pThread; implementation is provided in the example project SerializeTcpServer.

Serializing arbitrary byte stream

An arbitrary byte stream is basically any binary file whose internal structure you do not know or do not care about. For example, you may want to store JPEG images or MPEG-4 movie files inside your class data without any knowledge of the underlying data structure. You can deserialize the data later and use it with the appropriate application. MFC serialization allows you to easily store such data.

In the following code we will store the byte streams of four JPEG pictures:

//
// Declare class to hold an array of JPEG images
//
class CMyPicture : public CObject
{
         DECLARE_SERIAL(CMyPicture)
public:
         CMyPicture();
         virtual ~CMyPicture();
         virtual void Serialize(CArchive& ar);
 
         CString GetHeader() const;
 
         CString m_strName;        // Original image file name
         CString m_strNewName;     // New image file
         CByteArray m_bytes;       // Image binary data array
};

typedef CTypedPtrArray<CObArray, CMyPicture*> CMyPictureArray;

The following listing is the body of the class:

//
// Class to store JPEG images
//
IMPLEMENT_SERIAL(CMyPicture, CObject, VERSIONABLE_SCHEMA | 1)
 
// CMyPicture
CMyPicture::CMyPicture()
{
}
 
CMyPicture::~CMyPicture()
{
}
 
// CMyPicture member functions
void CMyPicture::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {        // storing code
                 ar << m_strName;
                 ar << m_strNewName;
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
                 switch (nSchema)
                 {
                 case 1:
                          ar >> m_strName;
                          ar >> m_strNewName;
                          break;
                 }
         }
 
         // Serialize arbitrary byte stream into or from the file
         m_bytes.Serialize(ar);
}

To populate such a class with the JPEG image data, all you need to do is the following:

//
// CMyPictureArray m_aPictures declared in the class header as a member
//
         m_aPictures.Add(InitPicture("Water lilies.jpg", "Water lilies Output.jpg"));
         m_aPictures.Add(InitPicture("Blue hills.jpg", "Blue hills Output.jpg"));
         m_aPictures.Add(InitPicture("Sunset.jpg", "Sunset Output.jpg"));
         m_aPictures.Add(InitPicture("Winter.jpg", "Winter Output.jpg"));
 
         UpdateAllViews(nullptr, HINT_GENERATED_DATA);
         SetModifiedFlag();
}
 
// Read binary stream from an unknown file
std::vector<BYTE> CSerializeDemoDoc::ReadBinaryFile(const char* filename)
{
         // Read through a plain char stream; a stream over BYTE relies on
         // character traits that the C++ standard does not guarantee
         std::ifstream file(filename, std::ios::binary);
         return std::vector<BYTE>((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());
}
 
CMyPicture* CSerializeDemoDoc::InitPicture(const char* sFileName, const char* sOutFileName)
{
         std::vector<BYTE> vJPG = ReadBinaryFile(sFileName);
         CMyPicture* pPicture = new CMyPicture;
         pPicture->m_strName = sFileName;
         pPicture->m_strNewName = sOutFileName;
         pPicture->m_bytes.SetSize(vJPG.size());
         if (!vJPG.empty()) // nothing to copy if the file was missing or empty
                 CopyMemory(pPicture->m_bytes.GetData(), vJPG.data(), vJPG.size() * sizeof(BYTE));
         return pPicture;
}
 
// Writes JPEG images back to the hard drive
void CSerializeDemoDoc::OnTestdataWriteimagedatatodisk()
{
         for (INT_PTR i = 0; i < m_pRoot->m_aPictures.GetSize(); i++)
         {
                 CMyPicture* pPic = m_pRoot->m_aPictures.GetAt(i);
 
                 std::ofstream fout(pPic->m_strNewName, std::ios::out | std::ios::binary);
                 fout.write((char*)pPic->m_bytes.GetData(), pPic->m_bytes.GetSize());
                 fout.close();
         }
 
         AfxMessageBox(_T("Finished writing images back to disk"), MB_ICONINFORMATION);
}

Serializing Windows SDK data structures

Serialization of Windows SDK structures is not provided by the CArchive class. However, it is nearly effortless to add support for it. The following code demonstrates how to serialize the LOGFONT SDK structure.

//
// LOGFONT SDK structure serialization code
//
 
// LOGFONT write
inline CArchive& AFXAPI operator <<(CArchive& ar, const LOGFONT& lf)
{
         CString strFace(lf.lfFaceName);
 
         ar << lf.lfHeight;
         ar << lf.lfWidth;
         ar << lf.lfEscapement;
         ar << lf.lfOrientation;
         ar << lf.lfWeight;
         ar << lf.lfItalic;
         ar << lf.lfUnderline;
         ar << lf.lfStrikeOut;
         ar << lf.lfCharSet;
         ar << lf.lfOutPrecision;
         ar << lf.lfClipPrecision;
         ar << lf.lfQuality;
         ar << lf.lfPitchAndFamily;
         ar << strFace;
 
         return ar;
}
 
// LOGFONT read
inline CArchive& AFXAPI operator >> (CArchive& ar, LOGFONT& lf)
{
         CString strFace;
 
         ar >> lf.lfHeight;
         ar >> lf.lfWidth;
         ar >> lf.lfEscapement;
         ar >> lf.lfOrientation;
         ar >> lf.lfWeight;
         ar >> lf.lfItalic;
         ar >> lf.lfUnderline;
         ar >> lf.lfStrikeOut;
         ar >> lf.lfCharSet;
         ar >> lf.lfOutPrecision;
         ar >> lf.lfClipPrecision;
         ar >> lf.lfQuality;
         ar >> lf.lfPitchAndFamily;
         ar >> strFace;
         _tcscpy_s(lf.lfFaceName, strFace);
 
         return ar;
}

After you have defined the LOGFONT insertion and extraction operators, all you need is the following code snippet.

//
// Serialize LOGFONT structure m_lf
//
 
void CRoot::Serialize(CArchive& ar)
{
         CBase::Serialize(ar);
 
         if (ar.IsStoring())
         {        // storing code
                 ar << m_lf; // Write LOGFONT
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
                 switch (nSchema)
                 {
                 case 1:
                          ar >> m_lf; // Read LOGFONT
                          break;
                 }
         }
}

The next code snippet serializes the WINDOWPLACEMENT SDK structure:

//
// Serializing WINDOWPLACEMENT
//
 
// WINDOWPLACEMENT write
inline CArchive& AFXAPI operator <<(CArchive& ar, const WINDOWPLACEMENT& val)
{
         ar << val.flags;
         ar << val.length;
         ar << val.ptMaxPosition.x;
         ar << val.ptMaxPosition.y;
         ar << val.ptMinPosition.x;
         ar << val.ptMinPosition.y;
         ar << val.rcNormalPosition.bottom;
         ar << val.rcNormalPosition.left;
         ar << val.rcNormalPosition.right;
         ar << val.rcNormalPosition.top;
         ar << val.showCmd;
 
         return ar;
}
 
// WINDOWPLACEMENT read
inline CArchive& AFXAPI operator >> (CArchive& ar, WINDOWPLACEMENT& val)
{
         ar >> val.flags;
         ar >> val.length;
         ar >> val.ptMaxPosition.x;
         ar >> val.ptMaxPosition.y;
         ar >> val.ptMinPosition.x;
         ar >> val.ptMinPosition.y;
         ar >> val.rcNormalPosition.bottom;
         ar >> val.rcNormalPosition.left;
         ar >> val.rcNormalPosition.right;
         ar >> val.rcNormalPosition.top;
         ar >> val.showCmd;
 
         return ar;
}

Reading and writing the WINDOWPLACEMENT structure then becomes as trivial as this:

//
// Reading and writing WINDOWPLACEMENT structure
//
void CRoot::Serialize(CArchive& ar)
{
         CBase::Serialize(ar);
 
         if (ar.IsStoring())
         {        // storing code
                 ar << m_wp; // Write WINDOWPLACEMENT struct
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
                 switch (nSchema)
                 {
                 case 1:
                          ar >> m_wp; // Read WINDOWPLACEMENT struct
                          break;
                 }
         }
}

Serializing STL collections

Serialization of STL collections is just as trivial as the serialization of SDK data structures. Let’s define insertion and extraction operators for the popular STL collections. To serialize std::vector<int> we need the following definitions:

//
// STL vector<int> write
//
inline CArchive& AFXAPI operator <<(CArchive& ar, const std::vector<int>& val)
{
         // first store the size of the vector
         ar << (int)val.size();
         for (int k : val) // standard range-based for instead of the MSVC-only "for each"
         {
                 ar << k; // store each int into the file
         }
         return ar;
}

To read the data back into a std::vector<int> we do the following:

//
// STL vector<int> read
//
inline CArchive& AFXAPI operator >> (CArchive& ar, std::vector<int>& val)
{
         int nSize;
         ar >> nSize; // read the vector size
         val.resize(nSize); // resize vector to the read size
         for (size_t i = 0; i < (size_t)nSize; i++)
         {
                 ar >> val[i]; // retrieve values
         }
         return ar;
}

Next comes serialization of the std::map<char, int> collection. First we store the size of the map. Because the underlying element of std::map<char, int> is a std::pair<const char, int>, we then store the first and the second member of each pair.

//
// std::map<char, int> write
//
 
inline CArchive& AFXAPI operator <<(CArchive& ar, const std::map<char, int>& val)
{
         ar << (int)val.size();
         for (const auto& k : val) // standard range-based for instead of the MSVC-only "for each"
         {
                 ar << k.first;
                 ar << k.second;
         }
         return ar;
}

The reading code for std::map<char, int> is as follows.

//
// std::map<char, int> read 
// 
inline CArchive& AFXAPI operator >> (CArchive& ar, std::map<char, int>& val)
{
         int nSize;
         ar >> nSize;
         val.clear(); // start from an empty map so stale entries do not survive the load
         for (size_t i = 0; i < (size_t)nSize; i++)
         {
                 std::pair<char, int> k;
                 ar >> k.first;
                 ar >> k.second;
                 val.insert(k);
         }
         return ar;
}

Serialization of the STL fixed size std::array<int, 3>.

//
// STL std::array<int, 3> write
//
inline CArchive& AFXAPI operator <<(CArchive& ar, const std::array<int, 3>& val)
{
         for (int k : val) // standard range-based for instead of the MSVC-only "for each"
         {
                 ar << k;
         }
         return ar;
}

std::array<int, 3> reading operator.

//
// STL std::array<int, 3> read
//
inline CArchive& AFXAPI operator >> (CArchive& ar, std::array<int, 3>& val)
{
         for (size_t i = 0; i < (size_t)val.size(); i++)
         {
                 ar >> val[i];
         }
         return ar;
}

Serialization of the std::set<std::string> collection.

//
// STL std::set<std::string> write 
// 
inline CArchive& AFXAPI operator <<(CArchive& ar, const std::set<std::string>& val)
{
         ar << (int)val.size(); // write the size first
         for (const std::string& k : val) // standard range-based for instead of the MSVC-only "for each"
         {
                 ar << CStringA(k.c_str());
         }
         return ar;
}

Reading code of the std::set<std::string> collection.

//
// STL std::set<std::string> read
//
inline CArchive& AFXAPI operator >> (CArchive& ar, std::set<std::string>& val)
{
         int nSize;
         ar >> nSize;
         for (size_t i = 0; i < (size_t)nSize; i++)
         {
                 CStringA str;
                 ar >> str;
                 val.insert(std::string(str));
         }
         return ar;
}

Serializing STL data types

Serialization of STL types is just as trivial as the serialization of SDK data structures. First we need insertion and extraction operator definitions. To serialize or deserialize std::string we need to add the following operators:

//
// STL std::string write
//
 
inline CArchive& AFXAPI operator <<(CArchive& ar, const std::string& val)
{
         ar << CStringA(val.c_str()); // because std::string is ANSI we can pass it to the CStringA constructor
         return ar;
}

Deserialize std::string:

//
// STL std::string read
//
inline CArchive& AFXAPI operator >> (CArchive& ar, std::string& val)
{
         CStringA str;
         ar >> str;
         val = str;
         return ar;
}

I will stop here with the serialization of STL data types and containers. Once you have seen one STL collection and one STL type serialized, you have seen them all. I will leave it to the reader as an exercise to serialize std::pair, std::tuple, std::unordered_map, etc.
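
As a starting point for that exercise, here is a portable sketch of the same pattern (count first, then the elements) written as templates for std::pair and std::unordered_map. It deliberately avoids MFC: Stream is a hypothetical stand-in for CArchive that supports only 32 bit integers and performs no bounds checks, so treat it as an illustration of the shape of the operators rather than production code.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for an archive: a byte buffer plus a read position.
struct Stream
{
    std::vector<unsigned char> bytes;
    std::size_t pos = 0;
};

// Primitive operators: raw bytes of a 32 bit integer, host byte order, no bounds checks.
inline Stream& operator<<(Stream& s, std::int32_t v)
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&v);
    s.bytes.insert(s.bytes.end(), p, p + sizeof v);
    return s;
}

inline Stream& operator>>(Stream& s, std::int32_t& v)
{
    std::memcpy(&v, s.bytes.data() + s.pos, sizeof v);
    s.pos += sizeof v;
    return s;
}

// std::pair: first member, then second member.
template <typename A, typename B>
Stream& operator<<(Stream& s, const std::pair<A, B>& p)
{
    return s << p.first << p.second;
}

template <typename A, typename B>
Stream& operator>>(Stream& s, std::pair<A, B>& p)
{
    return s >> p.first >> p.second;
}

// std::unordered_map: element count, then every key/value pair.
template <typename K, typename V>
Stream& operator<<(Stream& s, const std::unordered_map<K, V>& m)
{
    s << static_cast<std::int32_t>(m.size());
    for (const auto& kv : m)
        s << kv.first << kv.second;
    return s;
}

template <typename K, typename V>
Stream& operator>>(Stream& s, std::unordered_map<K, V>& m)
{
    std::int32_t count = 0;
    s >> count;
    m.clear(); // replace, rather than merge with, the previous contents
    for (std::int32_t i = 0; i < count; ++i)
    {
        std::pair<K, V> kv;
        s >> kv;
        m.insert(kv);
    }
    return s;
}
```

Because the container operators are built only from the element operators, adding a primitive operator for another element type makes the same templates work for maps and pairs of that type.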

Serializing flat C style arrays

To serialize flat C arrays, follow the same procedure as for serializing a collection. But because a flat C-style array has a known size, there is no need to store its size in the file.

//
// Write float val[3]
//
inline CArchive& AFXAPI operator <<(CArchive& ar, float val[3])
{
         for(int i = 0; i < 3; i++)
         {
                 ar << val[i];
         }
         return ar;
}

Reading flat C style array.

//
// read float val[3]
//
inline CArchive& AFXAPI operator >> (CArchive& ar, float val[3])
{
         for (size_t i = 0; i < 3; i++)
         {
                 ar >> val[i];
         }
         return ar;
}
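
One detail worth knowing: a parameter declared as float val[3] is adjusted by the compiler to float*, so the operators above accept a pointer to any number of floats and the 3 is not enforced. The following portable sketch shows a variant that takes the array by reference, so the element count is deduced and checked at compile time. Buffer, write_array and read_array are hypothetical names used only for this illustration, and a plain byte vector stands in for the archive.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

using Buffer = std::vector<unsigned char>;

// Write every element of a C array. N is deduced from the argument,
// so a bare float* (whose length is unknown) does not compile.
template <std::size_t N>
void write_array(Buffer& buf, const float (&values)[N])
{
    const unsigned char* p = reinterpret_cast<const unsigned char*>(values);
    buf.insert(buf.end(), p, p + sizeof(values)); // sizeof covers all N elements
}

// Read N floats starting at 'offset'; returns false if the buffer is too short.
template <std::size_t N>
bool read_array(const Buffer& buf, std::size_t offset, float (&values)[N])
{
    if (buf.size() < offset || buf.size() - offset < sizeof(values))
        return false;
    std::memcpy(values, buf.data() + offset, sizeof(values));
    return true;
}
```

The same parameter form, const float (&values)[N], could be used for a CArchive operator in place of float val[3]; that variation is not part of the sample project.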

Serializing enumerated types

Strictly speaking, an enumeration only needs an extraction operator, because on insertion an unscoped enumeration is implicitly converted to an int. Providing both the insertion and the extraction operator, however, results in a much cleaner solution and potentially eliminates nasty surprises in the future.

//
// Enumeration that we want to serialize
//
enum EMyTestEnum
{
         ENUM_0,
         ENUM_1,
};

Write enumeration code.

//
// Write enumeration EMyTestEnum
//
inline CArchive& AFXAPI operator <<(CArchive& ar, const EMyTestEnum& val)
{
         int iTemp = val;
         ar << iTemp;
         return ar;
}

Read enumeration code.

//
// Read enumeration EMyTestEnum
//
inline CArchive& AFXAPI operator >> (CArchive& ar, EMyTestEnum& val)
{
         int iTmp = 0;
         ar >> iTmp;
         val = (EMyTestEnum)iTmp;
         return ar;
}
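
The cast in the read operator accepts whatever integer is in the file, including values that name no enumerator (for example from a damaged file or a newer version of the application). A possible refinement, sketched below in portable C++, is to validate the stored value while converting it. The helper names to_stored and from_stored and the choice of std::out_of_range are assumptions for this illustration only.

```cpp
#include <cstdint>
#include <stdexcept>

// Same enumeration as in the article.
enum EMyTestEnum
{
    ENUM_0,
    ENUM_1,
};

// Value as it is stored: a fixed 32 bit integer.
std::int32_t to_stored(EMyTestEnum value)
{
    return static_cast<std::int32_t>(value);
}

// Convert a stored integer back, rejecting values that name no enumerator.
EMyTestEnum from_stored(std::int32_t stored)
{
    switch (stored)
    {
    case ENUM_0: return ENUM_0;
    case ENUM_1: return ENUM_1;
    default:     throw std::out_of_range("unknown EMyTestEnum value in file");
    }
}
```

Inside a CArchive extraction operator, the equivalent check would sit between reading the int and assigning the enumeration.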

Serialization versioning for CObject derived classes

This is a rather interesting topic, and versioning of CObject-derived classes can be done in two ways. Let’s assume we have a class whose version is constantly evolving as new features are implemented in the core application.

//
// Class whose data grows with every version
//
class CMyObject : public CObject
{
         DECLARE_SERIAL(CMyObject)
public:
         CMyObject();
         virtual ~CMyObject();
         virtual void Serialize(CArchive& ar);
         
         // Version 1 data
         float m_f;
         double m_d;
 
         // Version 2 data
         COLORREF m_backColor;
         COLORREF m_foreColor;
 
         // Version 3 data
         CString m_strDescription;
         
         // Version 4 data
         CString m_strNotes;
};

To serialize such an object and still be able to read older Version 1, 2, and 3 files, we can implement Serialize in the following way.

//
// Version 4 object
//
IMPLEMENT_SERIAL(CMyObject, CObject, VERSIONABLE_SCHEMA | 4)
 
void CMyObject::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {        // storing code
                 ar << m_f;
                 ar << m_d;
                 ar << m_backColor;
                 ar << m_foreColor;
                 ar << m_strDescription;
                 ar << m_strNotes;
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
                 switch (nSchema)
                 {
                 case 1:
                          ar >> m_f;
                          ar >> m_d;
                          break;
                 case 2:
                          ar >> m_f;
                          ar >> m_d;
                          ar >> m_backColor;
                          ar >> m_foreColor;
                          break;
                 case 3:
                          ar >> m_f;
                          ar >> m_d;
                          ar >> m_backColor;
                          ar >> m_foreColor;
                          ar >> m_strDescription;
                          break;
                 case 4:
                          ar >> m_f;
                          ar >> m_d;
                          ar >> m_backColor;
                          ar >> m_foreColor;
                          ar >> m_strDescription;
                          ar >> m_strNotes;
                          break;
                 }
         }
}

This approach, although crystal clear, is tedious at best: there is a lot of repetitive code. Another approach is to store the data in reverse order, newest members first, and let the switch statement fall through from the version found in the file down to version 1.

//
// Version 4 object
//
IMPLEMENT_SERIAL(CMyObject, CObject, VERSIONABLE_SCHEMA | 4)
 
void CMyObject::Serialize(CArchive& ar)
{
         if (ar.IsStoring())
         {        // storing code
 
                 // Add new features to the top rather than bottom
                 ar << m_strNotes;         // Version 4
                 ar << m_strDescription;   // Version 3
                 ar << m_backColor;        // Version 2
                 ar << m_foreColor;        // Version 2
                 ar << m_f;                // Version 1
                 ar << m_d;                // Version 1              
         }
         else
         {        // loading code
                 UINT nSchema = ar.GetObjectSchema();
                 switch (nSchema)
                 {
                 // Reverse case statements. New version goes on top without break statement 
                 // to let it simply fall through all the versions
                 case 4:
                          ar >> m_strNotes; // fall through to version 3
                 case 3:          
                          ar >> m_strDescription; // fall through to version 2
                 case 2:
                          ar >> m_backColor;
                          ar >> m_foreColor; // fall through to version 1
                 case 1:
                          ar >> m_f;
                          ar >> m_d;
                          break;    // finally break from version 1   
                 }
         }
}

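This is a much cleaner versioning solution that eliminates all of the repetitive code. Note that it only works if every version adds its new members at the top of the stream, so a file written with the append-at-the-bottom layout of the previous example cannot be read by this loader.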
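The fall-through idea does not depend on MFC, so it can be tried out in portable C++. The following is only a minimal sketch under stated assumptions: std::stringstream stands in for CArchive, the version number is written by hand (MFC stores the schema for you), and Sample, storeAs, loadSample and the priority member are hypothetical names that do not exist in the demo project.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for CMyObject with one member per version.
struct Sample
{
    float         f = 0.0f;       // version 1
    std::uint32_t backColor = 0;  // version 2
    std::uint16_t priority = 0;   // version 3 (hypothetical member)
};

// Raw binary helpers playing the role of CArchive's << and >>.
template <class T>
void writeRaw(std::ostream& os, T v)
{
    os.write(reinterpret_cast<const char*>(&v), sizeof v);
}

template <class T>
void readRaw(std::istream& is, T& v)
{
    is.read(reinterpret_cast<char*>(&v), sizeof v);
}

// Writes the layout a build at the given version would have produced:
// the version number first, then the members newest first.
void storeAs(std::ostream& os, const Sample& s, std::uint16_t version)
{
    writeRaw(os, version);
    if (version >= 3) writeRaw(os, s.priority);
    if (version >= 2) writeRaw(os, s.backColor);
    writeRaw(os, s.f);
}

// One loader for every version: enter the switch at the file's version
// and fall through to version 1.
bool loadSample(std::istream& is, Sample& s)
{
    std::uint16_t version = 0;
    readRaw(is, version);
    switch (version)
    {
    case 3:
        readRaw(is, s.priority);
        [[fallthrough]];          // C++17 attribute; a comment works on older compilers
    case 2:
        readRaw(is, s.backColor);
        [[fallthrough]];
    case 1:
        readRaw(is, s.f);
        break;
    default:
        return false;             // unknown version: refuse to guess the layout
    }
    return static_cast<bool>(is); // false if the stream ended early
}
```

Members that an older file does not contain simply keep their default values, which is usually what you want when an old document is opened by a newer build.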

Serialization versioning for non CObject classes

To serialize a class that is not derived from CObject, we simply follow the same rule as with the Windows SDK structures.

//
// Non CObject class
//
class CMyObject
{
public:
         CMyObject();
         virtual ~CMyObject();
 
         static const short VERSION = 1;
         
         float m_f;
         double m_d;
};

Write the version number as the very first item. Then, when reading, check which version is stored in the file and take the data through the read procedure that corresponds to that version.

//
// Serializing CMyObject
//
 
// CMyObject write
inline CArchive& AFXAPI operator <<(CArchive& ar, const CMyObject & val)
{
         ar << val.VERSION; // Write the current version of the class as very first item
         ar << val.m_f;
         ar << val.m_d;
         return ar;
}
 
// CMyObject read
inline CArchive& AFXAPI operator >> (CArchive& ar, CMyObject & val)
{
         short nVersion = 0;
 
         ar >> nVersion;
         
         switch(nVersion)
         {
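         case 1:
                  ar >> val.m_f;
                  ar >> val.m_d;
                  break;
         default:
                  // Unknown version: fail loudly instead of leaving val half-read
                  AfxThrowArchiveException(CArchiveException::badSchema);
         }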
         return ar;
}
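To see how this scheme pays off once the class changes, here is a rough portable sketch of a second revision of the class that still reads version 1 files. It rests on assumptions: std::stringstream replaces CArchive, std::runtime_error replaces the MFC exception, and MyObjectRev2, m_count, saveObject, saveAsVersion1 and loadObject are hypothetical names invented for the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical second revision of the non CObject class.
struct MyObjectRev2
{
    static const std::int16_t VERSION = 2;

    float        m_f = 0.0f;
    double       m_d = 0.0;
    std::int32_t m_count = 0;   // hypothetical member added in version 2
};

template <class T>
void writeRaw(std::ostream& os, T v)
{
    os.write(reinterpret_cast<const char*>(&v), sizeof v);
}

template <class T>
void readRaw(std::istream& is, T& v)
{
    is.read(reinterpret_cast<char*>(&v), sizeof v);
}

// Current writer: the version is always the first item in the stream.
void saveObject(std::ostream& os, const MyObjectRev2& val)
{
    writeRaw(os, MyObjectRev2::VERSION);
    writeRaw(os, val.m_f);
    writeRaw(os, val.m_d);
    writeRaw(os, val.m_count);
}

// Reproduces what the version 1 build wrote, to simulate an old file.
void saveAsVersion1(std::ostream& os, float f, double d)
{
    writeRaw(os, static_cast<std::int16_t>(1));
    writeRaw(os, f);
    writeRaw(os, d);
}

// Reader: dispatch on the stored version, default what old files lack.
void loadObject(std::istream& is, MyObjectRev2& val)
{
    std::int16_t version = 0;
    readRaw(is, version);
    switch (version)
    {
    case 1:
        readRaw(is, val.m_f);
        readRaw(is, val.m_d);
        val.m_count = 0;        // not present in version 1 files
        break;
    case 2:
        readRaw(is, val.m_f);
        readRaw(is, val.m_d);
        readRaw(is, val.m_count);
        break;
    default:
        throw std::runtime_error("unsupported object version");
    }
    if (!is)
        throw std::runtime_error("stream ended before the object was complete");
}
```

Rejecting an unknown version matters as much as accepting the known ones: once the reader guesses a layout wrong, every subsequent read from the same stream is misaligned.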

Caveats

Do not serialize pointer-sized WIN32/WIN64 typedefs, ever! If you upgrade your application to 64 bit and try to read a file that was created by the 32-bit version and that happened to contain such typedefs (DWORD_PTR, for example), the read will fail miserably. DWORD_PTR is 4 bytes long on the 32-bit architecture and 8 bytes long on WIN64, so reading 4 bytes into 8 (or vice versa) throws off every read that follows, typically ends in a CArchiveException, and makes the file useless to the build of your application with the other bitness. Serialize only types whose size is fixed and known. If you must use a 64-bit integer, serialize it explicitly as __int64 in both the 32-bit and 64-bit versions of your application. This is especially important when you are serializing SDK structures. Examine the structure declaration carefully, and if it contains WIN32/64 typedefs, explicitly cast them to the largest size if you are building a 32-bit application and plan to upgrade it to 64 bit in the future.
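The "cast to the largest size" advice can be sketched in portable C++ as follows. This is an illustration under assumptions, not code from the demo project: std::uintptr_t stands in for DWORD_PTR, std::stringstream stands in for CArchive, and storePointerSized and loadPointerSized are hypothetical names. In MFC the same idea would amount to casting the value to a 64-bit integer type before handing it to the archive.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <limits>
#include <ostream>
#include <sstream>

// Always writes 8 bytes, whether the build is 32-bit or 64-bit.
void storePointerSized(std::ostream& os, std::uintptr_t value)
{
    const std::uint64_t wide = static_cast<std::uint64_t>(value);
    os.write(reinterpret_cast<const char*>(&wide), sizeof wide);
}

// Always reads 8 bytes, then narrows only if the value fits this build.
bool loadPointerSized(std::istream& is, std::uintptr_t& value)
{
    std::uint64_t wide = 0;
    is.read(reinterpret_cast<char*>(&wide), sizeof wide);
    if (!is)
        return false;   // stream ended early
    if (wide > std::numeric_limits<std::uintptr_t>::max())
        return false;   // written by a 64-bit build, too large for a 32-bit one
    value = static_cast<std::uintptr_t>(wide);
    return true;
}
```

Because the on-disk width no longer depends on the build, the only remaining failure is a value that genuinely does not fit, and that case is reported instead of silently corrupting the reads that follow.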

Stick to either UNICODE or ANSI, period. If for whatever reason you must maintain both ANSI and UNICODE versions of your application, then serialize exclusively either CStringA or CStringW so that the other version can read the file. Suffice it to say that the characters of a string such as "hello" take 5 bytes in the ANSI version but 10 bytes in the UNICODE version.
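The size difference is easy to check with a tiny portable sketch. It assumes char16_t as a stand-in for the 2-byte wide character used by UNICODE builds on Windows, and payloadBytes is a hypothetical helper that counts only the character data, not any length prefix a real archive would add.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes occupied by the characters of a string, without any
// length prefix or terminator.
template <class Ch>
std::size_t payloadBytes(const std::basic_string<Ch>& s)
{
    return s.size() * sizeof(Ch);
}
```

With this helper, payloadBytes(std::string("hello")) is 5 and payloadBytes(std::u16string(u"hello")) is 10 on platforms where char16_t is 2 bytes wide, which is why the two builds must agree on one string type for the file format.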

Link to MFC statically to eliminate the runtime dependency on MFCXX.DLL, and do the same for any 3rd party library. Hypothetically, if sizeof(WhateverClass) changes in a newer version of a 3rd party DLL, and your application links to that DLL dynamically and serializes the class, your application will fail to read files written against the older version. Better safe than sorry: if you are not in control of the 3rd party library code, link to it statically. A little planning ahead goes a long way.

Using the code

I have supplied the SerializeDemo solution, which demonstrates all aspects described in this article. The solution contains 4 subprojects:

SerializeData – houses data structures and operators that are used by all projects
SerializeDemo – MFC Document / View application
SerializeTcpServer – a console server application running on the local host "127.0.0.1", port 1011. You may need to change the port number if 1011 is already occupied on your machine. The SerializeDemo application can connect to this server to transmit serialized data
SerializationWithoutDocView – console application that demonstrates usage of CArchive without Document / View architecture