XML Serialization for C++ Objects

24 Aug 2005, CPOL
A framework for serializing C++ objects as XML

Overview

Introduction

I have done a fair amount of C# programming, and serializing .NET objects is pretty straightforward. More precisely, it looks straightforward because the .NET Framework has already done the grunt work; as developers, all we need to do is use its services. But I love the speed of C++/MFC more, so I wanted to create a framework for serializing C++ objects as XML without resorting to .NET.

Background

Well, I have been flirting with the idea of XML serialization for my MFC projects for quite some time, but I never gave it any serious thought. It all began when an Address Book application I had been maintaining for over four years finally failed me. It happened when I was in London and had to look up a friend's email address. Of course, I had my address book application installed on my laptop, and I also had a copy of my latest data file. Version mismatch! The data file came from the latest version of the address book, and the application installed on the laptop was an older version. I realized that if I had used XML for storing the contacts, I would at least have been able to read the file manually to get my work done.

Using the Code

  • Serializable.h - defines ISerializable, IObjectFactory, CProperty classes
  • Contact.h & Contact.cpp - defines CAddress, CContact classes. Both are serializable.
  • Serializer.h - defines an ISerializer interface so that we might some day have other implementations for serialization
  • XMLSerializer.h & XMLSerializer.cpp - defines the CXMLSerializer class which implements the ISerializer interface
  • XMLSerialization.h & XMLSerialization.cpp - contains main() for the demo application

This example was written using VS.NET 2003. You also need MSXML 4 installed on your system.

For serializing an object, basically one needs to know the following things:

  • Which object to serialize
  • Where to serialize
  • What properties of the object to serialize

Of course, you can't have serialization without de-serialization. So we also need to know:

  • Which objects to de-serialize
  • How to create the objects
  • Which properties to set

Rule 1

One of the most important things lacking in C++ is reflection. So I needed a way to "enquire" about an object and get back its properties and their associated values. For this, I decided that any class which wants to use the framework for serialization should inherit from the ISerializable interface.

Rule 2

The next challenge was properties. A class can have any number of properties, each of any type. For example, a CStudent class may have a property FirstName. Cool, a CString. What if the class has a property of type CAddress? I can persist a CString, but how do I persist a CAddress, or any other user-defined type for that matter? See golden Rule 1: any class that needs to be serialized should inherit from ISerializable, so CAddress should also implement the ISerializable interface.

Rule 3

Most objects have properties which may be user-defined types, but ultimately everything comes down to basic data types like strings, longs, floats, etc. For XML, however, it is easiest to allow only strings. For example, the CStudent class may have an int m_nAge property; for the purpose of XML serialization, it exposes this property to the framework as a string. During de-serialization, the class gets the chance to convert the CString age back to an int age. This conversion is specific to the CStudent class, and the framework knows nothing of it. Any class can directly serialize a CString property. The same is true for CStringList.
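The string-only rule can be illustrated with a minimal sketch. It uses std::string in place of CString so that it compiles without MFC; the Student class and its AgeToString/AgeFromString helpers are hypothetical names chosen for illustration and are not part of the framework.

```cpp
#include <cassert>
#include <exception>
#include <string>

// Hypothetical entity with one numeric member; not part of the framework.
class Student
{
public:
    Student() : m_nAge(0) {}

    void SetAge(int nAge) { m_nAge = nAge; }
    int  GetAge() const   { return m_nAge; }

    // Serialization side: expose the int to the framework as text.
    std::string AgeToString() const
    {
        return std::to_string(m_nAge);
    }

    // De-serialization side: parse the text back into an int.
    // Returns false and leaves the age untouched if the text is not a number.
    bool AgeFromString(const std::string& sAge)
    {
        try
        {
            std::size_t nUsed = 0;
            int nAge = std::stoi(sAge, &nUsed);
            if (nUsed != sAge.size())
                return false;            // trailing garbage such as "12abc"
            m_nAge = nAge;
            return true;
        }
        catch (const std::exception&)    // std::invalid_argument or std::out_of_range
        {
            return false;
        }
    }

private:
    int m_nAge;
};

// Usage: the value survives a trip through its string form.
void RoundTripExample()
{
    Student a;
    a.SetAge(21);

    Student b;
    bool ok = b.AgeFromString(a.AgeToString());
    assert(ok && b.GetAge() == 21);
    (void)ok;
}
```

In an MFC class, the same two helpers would sit behind the "Age" branch of GetPropertyValue and SetPropertyValue, so the framework only ever sees the string.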

Rule 4

Often a class holds not one object but a list of objects as a single property. For example, a CAuthor may have a property m_books which represents not one book but a list of books. I prefer to use a CPtrList in cases where multiple objects are held in a single property. In such cases, the framework allows a CPtrList containing ISerializable-derived objects to be serialized as a single property.
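A list-valued property raises the question of who owns the pointed-to objects. The following is a minimal sketch of one way to handle that, assuming the entity class owns its list. std::list<ISerializable*> stands in for CPtrList, and the Author, Book, and ReplaceBooks names are hypothetical; the ISerializable shown here is cut down to a single method.

```cpp
#include <cassert>
#include <list>
#include <string>

// Cut-down stand-in for the framework interface.
class ISerializable
{
public:
    virtual ~ISerializable() {}
    virtual std::string GetClassName() = 0;
};

class Book : public ISerializable
{
public:
    explicit Book(const std::string& sTitle) : m_sTitle(sTitle) {}
    virtual std::string GetClassName() { return "Book"; }
    const std::string& GetTitle() const { return m_sTitle; }
private:
    std::string m_sTitle;
};

typedef std::list<ISerializable*> ObjectList;   // plays the role of CPtrList

// Hypothetical owner of a list-valued property.
class Author
{
public:
    Author() {}
    ~Author() { Clear(); }

    void AddBook(Book* pBook) { m_books.push_back(pBook); }   // takes ownership
    const ObjectList& GetBooks() const { return m_books; }

    // Replace the whole property, as happens when a list is de-serialized.
    // 'books' must not be m_books itself.
    void ReplaceBooks(const ObjectList& books)
    {
        Clear();
        m_books = books;
    }

private:
    // Delete the owned objects and drop the pointers so none are left dangling.
    void Clear()
    {
        for (ObjectList::iterator it = m_books.begin(); it != m_books.end(); ++it)
            delete *it;
        m_books.clear();
    }

    ObjectList m_books;

    Author(const Author&);              // not copyable: owns raw pointers
    Author& operator=(const Author&);
};

// Usage: the old book is freed and the new list takes its place.
void AuthorExample()
{
    Author author;
    author.AddBook(new Book("First"));

    ObjectList fresh;
    fresh.push_back(new Book("Second"));
    author.ReplaceBooks(fresh);

    assert(author.GetBooks().size() == 1);
    assert(static_cast<Book*>(author.GetBooks().front())->GetTitle() == "Second");
}
```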

Rule 5

This rule deals with de-serialization more than serialization. De-serialization consists of creating real objects, and often, creating objects is not as simple as doing a "new". To keep the serialization framework from knowing too much about how to create an object, I decided to use a factory approach. Any class that wants to (de-)serialize itself should provide a factory class which implements the IObjectFactory interface. This interface has only two methods, Create and Destroy. It is possible to implement this interface in the class requiring serialization. For example, I have implemented IObjectFactory within CStudent. If you want, you can separate the entity class CStudent from IObjectFactory and have an additional class, maybe CStudentFactory.
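The separate-factory variant mentioned above might look like the following sketch. The Create and Destroy signatures follow the ones CContact implements in the demo; everything else is an assumption made for illustration: ISerializable is cut down to one method, std::string replaces CString, and CStudentFactory is the hypothetical class named in the text.

```cpp
#include <cassert>
#include <string>

// Cut-down stand-in; the real ISerializable has more methods.
class ISerializable
{
public:
    virtual ~ISerializable() {}
    virtual std::string GetClassName() = 0;
};

// The two-method factory interface described in Rule 5.
class IObjectFactory
{
public:
    virtual ~IObjectFactory() {}
    virtual ISerializable* Create() = 0;
    virtual void Destroy(ISerializable* obj) = 0;
};

// The entity class knows nothing about how it gets created.
class CStudent : public ISerializable
{
public:
    virtual std::string GetClassName() { return "Student"; }
};

// Hypothetical separate factory for CStudent.
class CStudentFactory : public IObjectFactory
{
public:
    virtual ISerializable* Create()              { return new CStudent(); }
    virtual void Destroy(ISerializable* obj)     { delete obj; }
};

// Usage: whoever creates through the factory also destroys through it.
void FactoryExample()
{
    CStudentFactory factory;
    ISerializable* obj = factory.Create();
    assert(obj->GetClassName() == "Student");
    factory.Destroy(obj);
}
```

Keeping creation and destruction behind the same interface means the object is always freed by the code that knows how it was allocated.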

Well, the rules may seem a bit daunting, but trust me, implementing the ISerializable interface is really easy, and while doing so, I have often designed my classes better than I would have otherwise.

A Real Example

I'll explain the CContact class which is present in the demo. First, let's take a look at the ISerializable interface.

C++
#include <afx.h>
#include <afxcoll.h>
//--------------------------------------------------

enum PropertyType
{
    Blank,
    Simple,
    SimpleList,
    Complex,
    ComplexList
};
//--------------------------------------------------
class CProperty;
class IObjectFactory;
//--------------------------------------------------

class ISerializable
{
public:
    virtual            ~ISerializable(){};
    
    virtual int        GetProperties(CStringList& properties) = 0;

    virtual bool    GetPropertyValue(const CString& sProperty, 
                                     CProperty& sValue) = 0;

    virtual bool    SetPropertyValue(const CString& sProperty, 
                                     CProperty& sValue) = 0;    

    virtual bool    HasMultipleInstances() = 0;                                     
    virtual CString GetClassName() = 0;
    virtual CString GetID() = 0;
};
//--------------------------------------------------

Let's take a look at the Contact.h file:

C++
class CContact : public ISerializable, public IObjectFactory
{
private:
    CString        m_sFirstName;
    CString        m_sId;
    CAddress    m_address;
    CStringList m_emails;
    CPtrList    m_addresses;

//.... many more members below, removed for illustration

We see that the CContact class wants to be serializable and also implements the IObjectFactory interface. Now let's see how CContact implements these functions:

C++
// ISerializable interface
int    CContact::GetProperties(CStringList& properties)
{    
    properties.AddHead(_T("FirstName"));
    properties.AddHead(m_address.GetClassName());
    properties.AddHead(_T("EmailId"));
    properties.AddHead(_T("XAddress"));
    return properties.GetCount();
}
//-----------------------------------------------------------

// Used during serialization
bool CContact::GetPropertyValue(const CString& 
               sProperty, CProperty& property)
{
    if(sProperty == _T("FirstName"))
    {
        property = m_sFirstName;
        return true;
    }
    else if(sProperty == m_address.GetClassName())
    {
        property = (ISerializable*)&m_address;
        property.SetFactory(&m_address); // IMP
        return true;
    }
    else if(sProperty == _T("EmailId"))
    {
        property = m_emails;
        return true;
    }
    else if(sProperty == _T("XAddress"))
    {
        property = m_addresses;
        property.SetFactory(&m_address); // IMP
        return true;
    }

    return false; // this property does not exist
}
//-----------------------------------------------------------

// Used during de-serialization
bool CContact::SetPropertyValue(const CString& sProperty, 
                                CProperty& property)
{
    if(sProperty == _T("FirstName"))
    {
        m_sFirstName = property;
        return true;
    }
    else if(sProperty == _T("ID"))
    {
        m_sId = property;
        return true;
    }
    else if(sProperty == m_address.GetClassName())
    {
        CAddress* address = (CAddress*)(property.GetObject());
        m_address.SetCity(address->GetCity());
        
        // delete the passed in object if we don't need it
        property.GetFactory()->Destroy(address);
        
        return true;
    }
    else if(sProperty == _T("EmailId"))
    {
        CProperty::CopyStringList(m_emails, property.GetStringList());
        return true;
    }
    else if(sProperty == _T("XAddress"))
    {
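        // first free any existing objects
        POSITION pos = m_addresses.GetHeadPosition();

        while(pos)
        {
            CAddress* pAddress = (CAddress*)m_addresses.GetNext(pos);
            delete pAddress;
        }

        // drop the now-dangling pointers before copying in the new ones
        // (harmless if CopyPtrList already empties the target list)
        m_addresses.RemoveAll();
        
        CProperty::CopyPtrList(m_addresses, property.GetObjectList());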
        return true;
    }
    
    return false; // this property does not exist
}
//--------------------------------------------------------------

bool CContact::HasMultipleInstances()
{
    return true; // we will have more than one contact instance
}
//--------------------------------------------------------------

CString CContact::GetClassName()
{
    return _T("Contact");
}
//--------------------------------------------------------------

CString CContact::GetID()
{
    return m_sId;
}
//--------------------------------------------------------------

// IObjectFactory Interface

ISerializable* CContact::Create()
{
    return new CContact();
}
//--------------------------------------------------------------

void CContact::Destroy(ISerializable* obj)
{
    delete obj;
}
//--------------------------------------------------------------

Another important class is CProperty, which acts as a wrapper over a class's property. It exists for the serialization framework, but you may find it useful in other situations as well.
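To give a feel for what such a wrapper does, here is a minimal sketch of a tagged property holder. It is not the CProperty source: the Property class below is a hypothetical stand-in that uses std::string and std::list instead of CString, CStringList, and CPtrList, and only reuses the enumerator names of the PropertyType enum from Serializable.h.

```cpp
#include <cassert>
#include <list>
#include <string>

// Same enumerators as the PropertyType enum in Serializable.h.
enum PropertyType { Blank, Simple, SimpleList, Complex, ComplexList };

// Minimal stand-in for the framework interface.
class ISerializable
{
public:
    virtual ~ISerializable() {}
};

// Hypothetical entity used only in the usage example below.
class Address : public ISerializable {};

// Hypothetical wrapper: each assignment records what kind of value is held.
class Property
{
public:
    Property() : m_type(Blank), m_pObject(0) {}

    Property& operator=(const std::string& sValue)
    { m_type = Simple; m_sValue = sValue; return *this; }

    Property& operator=(const std::list<std::string>& values)
    { m_type = SimpleList; m_strings = values; return *this; }

    Property& operator=(ISerializable* pObject)
    { m_type = Complex; m_pObject = pObject; return *this; }

    Property& operator=(const std::list<ISerializable*>& objects)
    { m_type = ComplexList; m_objects = objects; return *this; }

    PropertyType GetType() const { return m_type; }

    // Each getter is only meaningful for the matching type.
    const std::string& GetString() const
    { assert(m_type == Simple); return m_sValue; }
    const std::list<std::string>& GetStringList() const
    { assert(m_type == SimpleList); return m_strings; }
    ISerializable* GetObject() const
    { assert(m_type == Complex); return m_pObject; }
    const std::list<ISerializable*>& GetObjectList() const
    { assert(m_type == ComplexList); return m_objects; }

private:
    PropertyType               m_type;
    std::string                m_sValue;
    std::list<std::string>     m_strings;
    ISerializable*             m_pObject;   // not owned
    std::list<ISerializable*>  m_objects;   // pointers not owned
};

// Usage: the same variable can carry a string first and an object later.
void PropertyExample()
{
    Property property;
    property = std::string("Meena");
    assert(property.GetType() == Simple);

    Address address;
    property = &address;
    assert(property.GetType() == Complex);
}
```

A serializer can then switch on GetType() to decide whether to write text, a list of text items, a nested element, or a list of nested elements.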

Important (ISerializable) Methods

  • GetProperties(CStringList& properties)

    This method is invoked on the entity class (CContact) by the framework. The method should simply add the names of the properties to be serialized to the list and return the number of entries.

  • GetPropertyValue(const CString& sProperty, CProperty& property)

    This method is invoked on the entity class by the framework to find out the value of a property. E.g.:

    C++
    if(sProperty == _T("FirstName"))
    {
        property = m_sFirstName;
        return true;
    }
    else
    {
        return false;
    }

    The first parameter tells us which property the framework is asking for. If the property is "FirstName", we store the value of the first name in the property (the second parameter). It is important that we return true if the property name matched. If the framework asks for a property which we do not support, we return false. This should never really happen, since the class (CContact) itself reports the list of properties it supports in the GetProperties method.

  • SetPropertyValue(const CString& sProperty, CProperty& property)

    This method is invoked on the entity class by the framework during de-serialization. After creating a new object, the framework has to apply the property values. To do so, it invokes this method:

    C++
    bool CContact::SetPropertyValue(const CString& sProperty, 
                                    CProperty& property)
    {
        if(sProperty == _T("FirstName"))
        {
            m_sFirstName = property;
            return true;
        }
        else if(sProperty == m_address.GetClassName())
        {
            CAddress* address = (CAddress*)(property.GetObject());
            m_address.SetCity(address->GetCity());

            // delete the passed in object if we don't need it
            property.GetFactory()->Destroy(address);

            return true;
        }

        return false; // this property does not exist
    }

    The framework passes in the property name as the first parameter and the actual value in the second parameter. Here, we see how the first name is being stored. For properties which are complex types (i.e., user-defined types or UDTs), we are passed a pointer to the actual de-serialized object. You may want to hold on to this object or delete it. Here, we see how the "Address" property is being treated: we copy what we need from the de-serialized CAddress object and then delete it through its factory. If we wanted, we could hold on to this object and use it as we see fit. Memory management of the passed in object is not the responsibility of the serialization framework.

  • HasMultipleInstances()

    This method should return true if multiple instances of the class will be persisted, and false otherwise. In our case, we want to serialize many instances of CContact, therefore we return true.

  • GetClassName()

    This method should return a name for the class. Friendly names like "Contact", as used in the example, are usually not a problem within a single application because the class names will not clash. If clashes are a concern, you may want to use GUIDs as names instead.

  • GetID()

    This method is used to associate an ID string value with an object. This is not really used by the framework but has been added for future use. If you return a non-empty string, you will see something like <contact id="001"> in the XML file after serialization. If you return an empty string, you will see only <contact>.
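How a serializer can drive these methods is sketched below. This is not the CXMLSerializer implementation (which uses MSXML); it is a simplified illustration that builds the XML as a string, handles string-valued properties only, and uses std::string and std::list in place of the MFC types. The ISimpleSerializable, EscapeXml, and ToXml names are hypothetical, and the element layout is an assumption.

```cpp
#include <cassert>
#include <list>
#include <string>

// Simplified stand-in for ISerializable: string-valued properties only.
class ISimpleSerializable
{
public:
    virtual ~ISimpleSerializable() {}
    virtual int  GetProperties(std::list<std::string>& properties) = 0;
    virtual bool GetPropertyValue(const std::string& sProperty,
                                  std::string& sValue) = 0;
    virtual std::string GetClassName() = 0;
    virtual std::string GetID() = 0;
};

// Escape the characters that are special in XML text and attribute values.
std::string EscapeXml(const std::string& s)
{
    std::string out;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        switch (s[i])
        {
        case '&': out += "&amp;";  break;
        case '<': out += "&lt;";   break;
        case '>': out += "&gt;";   break;
        case '"': out += "&quot;"; break;
        default:  out += s[i];     break;
        }
    }
    return out;
}

// Ask the object for its property names, then for each value in turn.
std::string ToXml(ISimpleSerializable& obj)
{
    std::string xml = "<" + obj.GetClassName();
    const std::string sId = obj.GetID();
    if (!sId.empty())
        xml += " id=\"" + EscapeXml(sId) + "\"";   // only written when non-empty
    xml += ">";

    std::list<std::string> names;
    obj.GetProperties(names);
    for (std::list<std::string>::const_iterator it = names.begin();
         it != names.end(); ++it)
    {
        std::string sValue;
        if (obj.GetPropertyValue(*it, sValue))      // skip names the object rejects
            xml += "<" + *it + ">" + EscapeXml(sValue) + "</" + *it + ">";
    }
    return xml + "</" + obj.GetClassName() + ">";
}

// Hypothetical entity with two simple properties.
class Contact : public ISimpleSerializable
{
public:
    Contact(const std::string& sId, const std::string& sFirstName,
            const std::string& sCity)
        : m_sId(sId), m_sFirstName(sFirstName), m_sCity(sCity) {}

    virtual int GetProperties(std::list<std::string>& properties)
    {
        properties.push_back("FirstName");
        properties.push_back("City");
        return static_cast<int>(properties.size());
    }
    virtual bool GetPropertyValue(const std::string& sProperty,
                                  std::string& sValue)
    {
        if (sProperty == "FirstName") { sValue = m_sFirstName; return true; }
        if (sProperty == "City")      { sValue = m_sCity;      return true; }
        return false; // this property does not exist
    }
    virtual std::string GetClassName() { return "Contact"; }
    virtual std::string GetID()        { return m_sId; }

private:
    std::string m_sId, m_sFirstName, m_sCity;
};

// Usage: one contact becomes one element with an id attribute.
void ToXmlExample()
{
    Contact contact("001", "Meena", "Paris");
    assert(ToXml(contact) ==
        "<Contact id=\"001\"><FirstName>Meena</FirstName><City>Paris</City></Contact>");
}
```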

Factory Methods

  • Create()

    This method should create a new instance of the class and return the newly created object. See Rule 5. E.g.:

    C++
    ISerializable* CContact::Create()
    {
        // write any extra stuff but ultimately return an object
        return new CContact();
    }

  • Destroy(ISerializable* obj)

    This method is responsible for deleting an object. E.g.:

    C++
    void CContact::Destroy(ISerializable* obj)
    {
        delete obj;
    }
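How the factory and SetPropertyValue work together during de-serialization can be sketched as follows. This is an illustration under simplifying assumptions, not the framework's code: the XML is taken to be already parsed into name/value pairs, values are plain std::string instead of CProperty, and ISimpleSerializable, ISimpleFactory, PropertyBag, and BuildObject are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins: values are plain strings instead of CProperty.
class ISimpleSerializable
{
public:
    virtual ~ISimpleSerializable() {}
    virtual bool SetPropertyValue(const std::string& sProperty,
                                  const std::string& sValue) = 0;
};

class ISimpleFactory
{
public:
    virtual ~ISimpleFactory() {}
    virtual ISimpleSerializable* Create() = 0;
    virtual void Destroy(ISimpleSerializable* obj) = 0;
};

typedef std::vector<std::pair<std::string, std::string> > PropertyBag;

// Create one object through the factory and apply already-parsed
// name/value pairs to it. The caller owns the result and should release
// it with factory.Destroy(). Names the object rejects are skipped.
ISimpleSerializable* BuildObject(ISimpleFactory& factory,
                                 const PropertyBag& values)
{
    ISimpleSerializable* obj = factory.Create();
    if (obj == 0)
        return 0;

    for (PropertyBag::const_iterator it = values.begin();
         it != values.end(); ++it)
    {
        obj->SetPropertyValue(it->first, it->second);
    }
    return obj;
}

// Like CContact in the demo, this class acts as its own factory.
class Contact : public ISimpleSerializable, public ISimpleFactory
{
public:
    virtual bool SetPropertyValue(const std::string& sProperty,
                                  const std::string& sValue)
    {
        if (sProperty == "FirstName") { m_sFirstName = sValue; return true; }
        return false; // this property does not exist
    }
    virtual ISimpleSerializable* Create()           { return new Contact(); }
    virtual void Destroy(ISimpleSerializable* obj)  { delete obj; }

    const std::string& GetFirstName() const { return m_sFirstName; }

private:
    std::string m_sFirstName;
};

// Usage: a throwaway Contact instance serves purely as the factory.
void BuildExample()
{
    Contact prototype;
    PropertyBag values;
    values.push_back(std::make_pair(std::string("FirstName"),
                                    std::string("Meena")));

    ISimpleSerializable* obj = BuildObject(prototype, values);
    assert(obj != 0);
    assert(static_cast<Contact*>(obj)->GetFirstName() == "Meena");
    prototype.Destroy(obj);
}
```

The builder never names a concrete class, which is the point of the factory rule: it only needs something that can create and destroy objects.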

Serializing an Object

In the file XMLSerialization.cpp, we have the Serialize method, which simply creates a CContact object and serializes it to the file c:\temp\contacts.xml.

Note how the CXMLSerializer object is created. The first argument is the file name, the second is the name of the application, and the third specifies whether the XML file needs to be read (false for serialization, true for de-serialization).

C++
void Serialize()
{
    CXMLSerializer ser(_T("c:\\temp\\contacts.xml"), 
           _T("TestApp"), false); // IMP: true for deserialization

    CContact contact;
    contact.SetFirstName(_T("Meena"));
    contact.GetAddress()->SetCity(_T("Paris"));
    contact.GetEmailIds()->AddHead(_T("meena@hotmail.com"));
    contact.GetEmailIds()->AddHead(_T("meena@yahoo.com"));
    ser.Serialize(&contact);
}

De-Serializing an Object

In the file XMLSerialization.cpp, we have the Deserialize method. It creates the 'ser' object, and we see that an instance of CContact is created as well. This is because the IObjectFactory interface is implemented by CContact: since we need the factory, we create an instance of it. The Deserialize method of CXMLSerializer takes the factory as its first parameter and expects a CPtrList object as the second parameter. All objects created as a result of de-serialization are stored in this CPtrList object.

C++
void Deserialize()
{
    CXMLSerializer ser(_T("c:\\temp\\contacts.xml"), _T("TestApp"), true);
    // IMP: 'true' for deserialization

    CContact contact;
    CPtrList objects;
    int         nObjects;

    nObjects = ser.Deserialize(&contact, objects);

    for(int n = 0; n < nObjects; n++)
    {
        CContact* obj = (CContact*)objects.GetAt(objects.FindIndex(n));

        // Pass the text as an argument, never as the format string itself:
        // a '%' in the data would otherwise be read as a format specifier.
        _tprintf(_T("%s\n"), (LPCTSTR)obj->GetFirstName());
        _tprintf(_T("%s\n"), (LPCTSTR)obj->GetAddress()->GetCity());

        POSITION pos = obj->GetEmailIds()->GetHeadPosition();
        CString sEmail;
        while(pos)
        {
            sEmail = obj->GetEmailIds()->GetNext(pos);
            _tprintf(_T("%s\n"), (LPCTSTR)sEmail);
        }

        pos = obj->GetAddressList()->GetHeadPosition();
        while(pos)
        {
            CAddress* address = 
                      (CAddress*)obj->GetAddressList()->GetNext(pos);
            _tprintf(_T("%s\n"), (LPCTSTR)address->GetCity());
        }

        delete obj;
    }    
}

CContact has a CString first-name property, a CStringList property for email IDs, and a CPtrList property holding CAddress objects.

This demonstrates how the framework handles complex objects (classes implementing ISerializable), lists of complex objects, and lists of strings (CStringList) as properties.

To run the demo application, open a command prompt and go to the bin folder.

First, we need to serialize a CContact object. Type in c:\demo\bin>XMLSerialization.exe -S c:\contacts.xml. This will create a file called contacts.xml in c:\.

To de-serialize, type in: c:\demo\bin>XMLSerialization -D c:\contacts.xml.

History

This framework makes serializing C++ objects as XML a snap. All you need to do is follow a couple of rules and you are on your way. Based on feedback, I would like to support a few more collection classes like CMaps and CArrays.

Please leave your comments below. I'd love to hear from you.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).


Written By
Software Developer (Senior)
United States
My personal website is at http://sbytestream.pythonanywhere.com
