A Serialization Primer - Part 2

17 Feb 2002
This tutorial describes how to handle invalid data stores and support versioning during serialization.

This article is the second of a three-part tutorial on serialization.

  • Part 1 introduces the basics of serialization.
  • Part 2 explains how to gracefully handle reading invalid data stores and support versioning.
  • Part 3 describes how to serialize complex objects.

In Part 1, we saw how to serialize a simple object via a CArchive using a serialize() method like this:

C++
int CFoo::serialize
  (CArchive* pArchive)
{
  int nStatus = SUCCESS;

  // Serialize the object ...
  ASSERT (pArchive != NULL);
  TRY
  {
    if (pArchive->IsStoring()) {
       // Write employee name and id
       (*pArchive) << m_strName;
       (*pArchive) << m_nId;
    }
    else {
       // Read employee name and id
       (*pArchive) >> m_strName;
       (*pArchive) >> m_nId;
    }
  }
  CATCH_ALL (pException)
  {
    nStatus = ERROR;
  }
  END_CATCH_ALL

  return (nStatus);
}

There's a problem with this code. What if we mistakenly read a datafile that doesn't contain the expected information? If the datafile doesn't contain a CString followed by an int, our serialize() method will most likely return ERROR (or, if the bytes happen to parse, quietly load garbage). Returning ERROR is nice, but it would be better if we could recognize the situation and return a more specific status code like INVALID_DATAFILE. We can check that we're reading a valid datafile (i.e., one that contains a CFoo object) by using an object signature.

Object Signatures

An object signature is just a character string (e.g.: "FooObject") that identifies an object. We add a signature to CFoo by modifying the class definition:

C++
class CFoo
{
  ...

  // Methods
  public:
    ...
    CString getSignature();

  // Data members
    ...
  protected:
    static const CString  Signature;  // object signature
};

The signature is defined in Foo.cpp:

C++
// Static constants
const CString CFoo::Signature = "FooObject";

Next, we modify the serialize() method to serialize the signature before serializing the object's data members. If an invalid signature is encountered, or if the signature is missing, it's likely that we're attempting to read a data store that doesn't contain a CFoo object. Here's the logic for reading a signed object:

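[Figure: Using a signature to validate a data store]

  • Read the signature. If it is missing or cannot be read, return INVALID_DATAFILE.
  • If it doesn't match CFoo's signature, return INVALID_DATAFILE.
  • Otherwise, read the data members. If that fails, return ERROR.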

And here's the code:

C++
int CFoo::serialize
  (CArchive* pArchive)
{
  int nStatus = SUCCESS;
  bool bSignatureRead = false;

  // Serialize the object ...
  ASSERT (pArchive != NULL);
  TRY
  {
    if (pArchive->IsStoring()) {
       // Write signature
       (*pArchive) << getSignature();

       // Write employee name and id
       (*pArchive) << m_strName;
       (*pArchive) << m_nId;
    }
    else {
       // Read signature - complain if invalid
       CString strSignature;
       (*pArchive) >> strSignature;
       bSignatureRead = true;
       if (strSignature.Compare (getSignature()) != 0) {
          return (INVALID_DATAFILE);
       }

       // Read employee name and id
       (*pArchive) >> m_strName;
       (*pArchive) >> m_nId;
    }
  }
  CATCH_ALL (pException)
  {
    nStatus = bSignatureRead ? ERROR : INVALID_DATAFILE;
  }
  END_CATCH_ALL

  return (nStatus);
}
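
The signature check isn't tied to CArchive. As a minimal sketch, assuming a plain std::iostream and a line-based text format in place of the binary archive, the same read-side logic might look like this. Employee, Status, kSignature, save() and load() are illustrative names and are not part of CFoo:

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical status codes standing in for SUCCESS, ERROR and INVALID_DATAFILE.
enum class Status { Success, Error, InvalidDatafile };

// Hypothetical stand-in for CFoo's two data members.
struct Employee {
    std::string name;
    int id = 0;
};

const std::string kSignature = "FooObject";

// Write the signature first, then the data members, one per line.
void save(std::ostream& out, const Employee& e) {
    assert(e.name.find('\n') == std::string::npos);  // one field per line
    out << kSignature << '\n' << e.name << '\n' << e.id << '\n';
}

// Read the signature first; only touch the data members if it matches.
Status load(std::istream& in, Employee& e) {
    std::string signature;
    if (!std::getline(in, signature) || signature != kSignature) {
        return Status::InvalidDatafile;  // signature missing or wrong
    }
    Employee tmp;
    if (!std::getline(in, tmp.name) || !(in >> tmp.id)) {
        return Status::Error;            // signature fine, data damaged
    }
    e = tmp;                             // commit only after a full read
    return Status::Success;
}
```

Saving an Employee to a std::stringstream and loading it back should yield Status::Success, while a stream that starts with any other first line should yield Status::InvalidDatafile. Reading into a temporary and copying at the end is a design choice of this sketch: a failed load leaves the caller's object untouched.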

You should ensure that all your objects have unique signatures. It's less important what the actual signature is. If you're developing a suite of products, it's helpful to have a process for registering object signatures companywide. That way, developers won't mistakenly use the same signature for different objects. If you want to make it harder to reverse engineer your datafiles, you should use signatures that have no obvious connection to object names.
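
Short of a companywide registry, one lightweight safeguard is a duplicate check that runs in a unit test or at startup. The sketch below assumes the signatures of all serializable classes have been collected into one list; findDuplicateSignatures() is a hypothetical helper, not an MFC facility:

```cpp
#include <set>
#include <string>
#include <vector>

// Returns every signature that appears more than once in `registered`,
// in sorted order and without repeats.
std::vector<std::string> findDuplicateSignatures(
    const std::vector<std::string>& registered) {
    std::set<std::string> seen;
    std::set<std::string> duplicates;
    for (const std::string& signature : registered) {
        // insert() reports via .second whether the value was new.
        if (!seen.insert(signature).second) {
            duplicates.insert(signature);
        }
    }
    return std::vector<std::string>(duplicates.begin(), duplicates.end());
}
```

A test could pass in the list of signatures and fail the build if the returned vector is not empty.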

Versioning

As you upgrade your product during its lifecycle, you may find it necessary to modify the structure of CFoo by adding or removing data members. If you simply released a new version of CFoo, attempts to read old versions of the object from a data store would fail. This is obviously not acceptable. Any version of CFoo should be able to restore itself from an older serialized version. In other words, CFoo's serialization method should always be backward compatible. This is easily accomplished by versioning the object. Just as we added an object signature, we add an integer constant that specifies the object's version number.

C++
class CFoo
{
  ...

  // Methods
  public:
    ...
    CString getSignature();
    int     getVersion();

  // Data members
    ...
  protected:
    static const CString  Signature;  // object signature
    static const int      Version;    // object version
};

The object's version is defined in Foo.cpp.

C++
// Static constants
const CString CFoo::Signature = "FooObject";
const int     CFoo::Version = 1;

Next, we modify the serialize() method to serialize the version after serializing the signature, and before serializing the object's data members. If a newer version is encountered, we're attempting to read an unsupported version of the object. In this case, we simply return the status UNSUPPORTED_VERSION.

C++
int CFoo::serialize
  (CArchive* pArchive)
{
  int nStatus = SUCCESS;
  bool bSignatureRead = false;
  bool bVersionRead = false;

  // Serialize the object ...
  ASSERT (pArchive != NULL);
  TRY
  {
    if (pArchive->IsStoring()) {
       // Write signature and version
       (*pArchive) << getSignature();
       (*pArchive) << getVersion();

       // Write employee name and id
       (*pArchive) << m_strName;
       (*pArchive) << m_nId;
    }
    else {
       // Read signature - complain if invalid
       CString strSignature;
       (*pArchive) >> strSignature;
       bSignatureRead = true;
       if (strSignature.Compare (getSignature()) != 0) {
          return (INVALID_DATAFILE);
       }

       // Read version - complain if unsupported
       int nVersion;
       (*pArchive) >> nVersion;
       bVersionRead = true;
       if (nVersion > getVersion()) {
          return (UNSUPPORTED_VERSION);
       }

       // Read employee name and id
       (*pArchive) >> m_strName;
       (*pArchive) >> m_nId;
    }
  }
  CATCH_ALL (pException)
  {
    nStatus = bSignatureRead && bVersionRead ? ERROR : INVALID_DATAFILE;
  }
  END_CATCH_ALL

  return (nStatus);
}
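
The order of the checks matters: the signature decides whether this is a CFoo at all, and only then does the version decide whether this release can read it. A minimal sketch of just that header logic, assuming a std::istream and a line-based text format rather than CArchive, is shown below. Status, kSignature, kCurrentVersion and readHeader() are illustrative names, and rejecting versions below 1 is an extra check that the code above does not perform:

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical status codes mirroring the ones used in this article.
enum class Status { Success, InvalidDatafile, UnsupportedVersion };

const std::string kSignature = "FooObject";
const int kCurrentVersion = 1;

// Reads "<signature>\n<version>" and reports which check failed, if any.
// On success, `version` holds the version number found in the stream.
Status readHeader(std::istream& in, int& version) {
    std::string signature;
    if (!std::getline(in, signature) || signature != kSignature) {
        return Status::InvalidDatafile;     // not a FooObject store
    }
    if (!(in >> version) || version < 1) {
        return Status::InvalidDatafile;     // version missing or nonsensical
    }
    if (version > kCurrentVersion) {
        return Status::UnsupportedVersion;  // written by a newer release
    }
    return Status::Success;
}
```

Feeding it a std::istringstream containing "FooObject" followed by a version greater than kCurrentVersion should return Status::UnsupportedVersion, which lets the caller tell the user to upgrade instead of reporting a corrupt file.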

Version 1 of our CFoo contained 2 data members - a CString (m_strName) and an int (m_nId). If we add a third member (e.g.: int m_nDept) in version 2, we need to decide what m_nDept should be initialized to when reading an older version of the object. In this example, we'll initialize m_nDept to -1 implying that the employee's department code is "Unknown".

C++
class CFoo
{
  ...
  // Data members
  public:
    CString  m_strName;  // employee name
    int      m_nId;      // employee id
    int      m_nDept;    // department code (-1 = unknown)
};

We also need to increase the object's version number in Foo.cpp to 2.

C++
const int CFoo::Version = 2;

Finally, we modify the part of serialize() that reads the object so that m_nDept is initialized to -1 if we're reading an older version of the datafile. Note that the file is always saved as the latest version.

C++
int CFoo::serialize
  (CArchive* pArchive)
{
  ...
  // Serialize the object ...
  ASSERT (pArchive != NULL);
  TRY
  {
    if (pArchive->IsStoring()) {
       ...
       // Write employee name, id and department code
       (*pArchive) << m_strName;
       (*pArchive) << m_nId;
       (*pArchive) << m_nDept;
    }
    else {
       ...
       // Read employee name and id
       (*pArchive) >> m_strName;
       (*pArchive) >> m_nId;

       // Read department code (new in version 2)
       if (nVersion >= 2) {
          (*pArchive) >> m_nDept;
       }
       else {
          m_nDept = -1; // unknown
       }
    }
  }
  CATCH_ALL (pException)
  {
    nStatus = bSignatureRead && bVersionRead ? ERROR : INVALID_DATAFILE;
  }
  END_CATCH_ALL

  return (nStatus);
}
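
The version-dependent read can be exercised without MFC. The sketch below assumes the header has already been validated and that the data members follow as lines of text on a std::istream. Employee and readPayload() are hypothetical stand-ins for CFoo and the reading half of serialize():

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for version 2 of CFoo.
struct Employee {
    std::string name;
    int id = 0;
    int dept = -1;  // -1 = unknown
};

// Reads the data members that follow an already-validated header.
// `version` is the version number found in that header.
bool readPayload(std::istream& in, int version, Employee& e) {
    assert(version >= 1);  // the header check should have rejected anything else

    // Members present in every version.
    if (!std::getline(in, e.name) || !(in >> e.id)) {
        return false;
    }

    // Department code (new in version 2).
    if (version >= 2) {
        return static_cast<bool>(in >> e.dept);
    }
    e.dept = -1;  // older store: no department was saved
    return true;
}
```

Reading a version 1 payload should leave dept at -1, while a version 2 payload should fill it from the stream. Each later version would add another `if (version >= N)` block with its own default, so the defaults for old files stay next to the fields they belong to.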

Conclusion

So far, we've dealt with providing robust support for serializing simple objects - i.e., those that contain readily serializable data types. In Part 3, we'll see how to serialize any kind of object.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).


Written By
Technical Lead
Canada
Ravi Bhavnani is an ardent fan of Microsoft technologies who loves building Windows apps, especially PIMs, system utilities, and things that go bump on the Internet. During his career, Ravi has developed expert systems, desktop imaging apps, marketing automation software, EDA tools, a platform to help people find, analyze and understand information, trading software for institutional investors and advanced data visualization solutions. He currently works for a company that provides enterprise workforce management solutions to large clients.

His interests include the .NET framework, reasoning systems, financial analysis and algorithmic trading, NLP, HCI and UI design. Ravi holds a BS in Physics and Math and an MS in Computer Science and was a Microsoft MVP (C++ and C# in 2006 and 2007). He is also the co-inventor of 3 patents on software security and generating data visualization dashboards. His claim to fame is that he crafted CodeProject's "joke" forum post icon.

Ravi's biggest fear is that one day he might actually get a life, although the chances of that happening seem extremely remote.

Comments and Discussions

 
Re: Dangers of serialization
Zac Howland, 3-Feb-03 8:30
Navin wrote:
Not necessarily. This is a perfect example of the real world, where program requirements can change.

You can have the best design in the world, but there's NO WAY you can anticipate every possible future requirement. The danger with serialization here is that you may try to jimmy new functionality into existing classes/code when in reality, you need to re-work some of your stuff. My point is that serialization can make changing the class structure next to impossible.


I too program in the real world, and ran into problems when people designed applications poorly before me. Requirements do change, but if you have a design that is not flexible enough to accommodate change, you are screwed from the beginning. And the example you gave is just an example of someone not looking at what classes were there before adding a new one. If they were similar, he probably should have just added it to the previous class, or left them as separate entities.

Changing the class structure is only done when someone royally screwed up in the design process (or some marketing genius decides to get a case of feature-creep).


Navin wrote:
How is a complete overhaul of an app going to change whether or not you need to support old versions?

My point still stands - if you have to have backwards and/or forwards compatibility, MFC serialization is probably not a good choice.


I probably should have explained what I meant by "complete overhaul". I meant that you should talk to your boss, tell him that you want to completely revamp your application and that previous datafiles will no longer be supported in it. To accommodate previous versions, you can write a component (or set of components) to convert any of the required data from previous versions to the format your application will use. (I would suggest using XML for the conversion output so that you can pick and choose what data you want more easily.) MFC's serialization has no problem with backwards-compatibility as long as you implement it correctly. Forwards-compatibility is really an implementation problem (everything you write should be forwards-compatible).

My point is that if you properly design your application, changing data that is serialized to a file will not cause the problems you are talking about. There are many other solutions that are highly flexible ways of storing data (database, XML, INI, etc). All of these solutions have their place.

Zac

"If I create everything new, why would I want to delete anything?"