Persistence is the Key

24 Jan 2013, Ms-PL, 16 min read
Tutorial for using the Calvin C++ persistence library.
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<meta content="HTML Tidy for Windows (vers 1st February 2003), see www.w3.org"
name="generator">
<title>Persistence is the Key</title>
<style type="text/css">
            BODY, P, TD { font-family: Verdana, Arial, Helvetica, sans-serif; font-size:
            10pt } 
            H2,H3,H4,H5 { color: #ff9900; font-weight: bold; }
            H2 { font-size: 13pt; }
            H3 { font-size: 12pt; }
            H4 { font-size: 10pt; color: black; }
            PRE { BACKGROUND-COLOR: #FBEDBB; FONT-FAMILY: "Courier New", Courier, mono;
                 WHITE-SPACE: pre; } 
            CODE { COLOR: #990000; FONT-FAMILY: "Courier New", Courier,
                 mono; }
            p.c1 {text-align: center}
            p.r1 {text-align: right}
</style>
</head>
<body>
<pre>
Title: Persistence is the Key
Author: Jay Kint
Email: icosahedron@gmail.com
Environment: VC++ 7.1
Keywords: C++, Persistence, STL
Level: Intermediate
Description: Instruction on using the Calvin persistence library
Section C++
SubSection General
</pre>
<ul class="download">
<li><a href="http://www.hobbit-hole.org/calvin/Calvin.zip">Download source and test project - 162 Kb</a></li>
</ul>
<p><img src="calvin.jpg" alt="President Calvin Coolidge">
</p>
<h1>Persistence is the Key</h1>
<h2>Introduction</h2>
<p>
<em>"Nothing in the world can take the place of persistence. Talent will
not; nothing is more common than unsuccessful men with talent. Genius will not;
unrewarded genius is almost a proverb. Education will not; the world is full of
educated derelicts. Persistence and determination alone are omnipotent. The
slogan 'Press On' has solved and always will solve the problems of the human
race."</em><br>
<br>
</p><p class="r1">-- Calvin Coolidge</p>
<p>
Calvin is a C++ persistence library or framework that allows programmers to
easily save and load objects using keys. Objects are associated with a user
configurable type of key that can be used to name objects specifically and save
them or load them by that name.
</p>
<p>
This feature is the primary difference between Calvin and the many other
persistence libraries.
</p>
<p>
A quick snippet shows what this means:
</p>
<pre>
// archives hold named objects like a database
filesys_archive ar( "../data" );

// name of the variable c is "c"
boost::shared_ptr&lt;C&gt; c( new C( "c" ));

// save the object
ar.save( c );

// delete the object c
c.reset();

// load it from the archive
c = ar.load&lt;C&gt;( "c" );
</pre>
<p>
Calvin has most of the features you would expect in a persistence
library. It's relatively painless to add persistence to your objects, with the
only overhead being the addition of the name member to your objects.  The above
snippet uses a string, but virtually any value type<sup>1</sup> can be used as a
key.
</p>
<p>
This article assumes a working knowledge of C++ and how to use templates.  The
accompanying code has only been tested on VC 7.1, though I suspect it would work
on either the Comeau or gcc 3.3+ compilers.
</p>
<h2>Background</h2>
<p>
There are many persistence libraries available for C++ programmers, so why
write another one? The short answer is that no library had the features I needed
and to revise any of them would probably have taken more time than rolling my 
own. Calvin isn't a large library.
</p>
<p>
That's not to say that I don't owe those other libraries an intellectual
debt.  Calvin builds on their ideas to enable the features that I needed.
Perhaps you might need these features as well.
</p>
<p>
As you can see in the above snippet, Calvin allows a programmer to name
instances and then save and load them. Of what value is this feature? To see
the difference, consider what most other persistence frameworks do (or don't do
as the case may be).
</p>
<p>
Other persistence frameworks are little more than serialization of an object
into an alternate form. This is good for allowing an object to be sent across
the wire or simply dumped to disk. But what if that object contains references
to other objects? Usually these contained objects are serialized as well within
the original object. Say you have objects A, B and C. B contains a pointer to
A, as does C.
</p>
<p class="c1"><img src="BandCshareA.gif" alt=
"Figure 1 A is shared by B and C">
</p>
<p>
You serialize B, it in turn serializes A and both are dumped to their store.
You do likewise for C, and all is good with the world.
</p>
<p class="c1"><img src="BandConDisk.gif" alt=
"Figure 2 A is stored with B and C on disk">
</p>
<p>
But what happens when you load B and C from the store? In most libraries,
you would end up with two objects of type A that B and C each reference now.
How to get around this duplication problem?
</p>
<p class="c1"><img src="BandCownA.gif" alt=
"Figure 3 A is duplicated when B and C are restored">
</p>
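<p>
The duplication can be reproduced in a few lines. The sketch below is not Calvin code;
<code>A</code>, <code>Holder</code>, <code>naive_save</code> and <code>naive_load</code> are
hypothetical names, and <code>std::shared_ptr</code> stands in for whatever pointer type a
given library uses. It assumes the naive scheme described above, where the shared object is
written inline with each owner.
</p>

```cpp
#include <memory>
#include <sstream>
#include <string>

// Hypothetical types for illustration only; none of this is Calvin code.
struct A { int value; };

struct Holder {                 // stands in for both B and C
    std::shared_ptr<A> a;
};

// Naive approach: the shared A is written inline with its owner.
void naive_save(const Holder& h, std::ostream& out) {
    out << h.a->value << ' ';
}

// Loading cannot tell that the A was shared, so it always makes a new one.
Holder naive_load(std::istream& in) {
    Holder h;
    h.a = std::make_shared<A>();
    in >> h.a->value;
    return h;
}

// Returns true if B and C still share one A after a save/load round trip.
bool naive_roundtrip_keeps_sharing() {
    std::shared_ptr<A> shared = std::make_shared<A>();
    shared->value = 7;
    Holder b, c;
    b.a = shared;
    c.a = shared;

    std::stringstream store;
    naive_save(b, store);
    naive_save(c, store);

    Holder b2 = naive_load(store);
    Holder c2 = naive_load(store);
    return b2.a == c2.a;        // the two holders now own separate copies
}
```

<p>
After the round trip each holder has its own copy of <code>A</code>, so a change made through
one is invisible to the other.
</p>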
<p>
There are two commonly implemented solutions to this problem.
</p>
<p>
The first and most common is to require a root object that contains all the
objects, such as a document or a 3d scene. This root object is the only object
that may start a persistence operation, thereby eliminating the duplicate
references outside of a single archive. This really isn't a solution but a
constraint to the problem above.
</p>
<p>
Another solution is to allocate all objects to be persisted from a special
pool, and then the pool is what is saved. This really doesn't eliminate the
root object, but simply shifts it, making the pool the mandatory root object to
be persisted.
</p>
<p>
Many applications don't find this prohibitive and operate well within these
constraints. Some applications, such as the ones I write, share data across documents
quite a bit, and if one piece of shared data is updated or changed, that change
should be reflected in all the other documents. Calvin's solution is to allow
shared instances to be named, and to then save each instance in its own record
within an archive.
</p>
<p>
Therefore, in our scenario above, when B is saved, so is A, but A is stored
in its own record and a reference to A is saved with B. Likewise, when C is saved, A
is saved and C saves a reference to A. When C is loaded, it loads A. When B is
loaded, Calvin notices that A has already been loaded and a reference to A is
returned to B, so that B and C once again reference the same object.
</p>
<p class="c1"><img src="BandCandAonDisk.gif" alt=
"Figure 4 A is stored separately from B and C">
</p>
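<p>
The idea of one record per name can be sketched without the library. Everything below is
an assumption made for illustration: the in-memory <code>Store</code> map replaces files on
disk, the <code>Loaded</code> map plays the role of a registry, and the function names are
invented. It shows only the technique of saving a reference by name and reusing an instance
that is already loaded, not how Calvin implements it.
</p>

```cpp
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical stand-ins; the real library's types and record layout may differ.
struct A {
    std::string name;
    int value;
};

struct Holder {                 // plays the role of B or C
    std::string name;
    std::shared_ptr<A> a;
};

// One record per name, kept in memory here instead of on disk.
typedef std::map<std::string, std::string> Store;
// Objects already loaded, looked up by name.
typedef std::map<std::string, std::shared_ptr<A> > Loaded;

void save_holder(const Holder& h, Store& store) {
    std::ostringstream a_record;
    a_record << h.a->value;
    store[h.a->name] = a_record.str();   // A gets its own record
    store[h.name] = h.a->name;           // the holder stores only A's name
}

std::shared_ptr<A> load_a(const std::string& name, const Store& store, Loaded& loaded) {
    Loaded::iterator hit = loaded.find(name);
    if (hit != loaded.end()) return hit->second;   // already in memory: reuse it

    std::shared_ptr<A> a = std::make_shared<A>();
    a->name = name;
    std::istringstream in(store.at(name));
    in >> a->value;
    loaded[name] = a;
    return a;
}

Holder load_holder(const std::string& name, const Store& store, Loaded& loaded) {
    Holder h;
    h.name = name;
    h.a = load_a(store.at(name), store, loaded);   // follow the stored reference
    return h;
}

// Returns true if B and C share one A again after being saved and reloaded.
bool named_roundtrip_keeps_sharing() {
    std::shared_ptr<A> shared = std::make_shared<A>();
    shared->name = "A";
    shared->value = 7;
    Holder b; b.name = "B"; b.a = shared;
    Holder c; c.name = "C"; c.a = shared;

    Store store;
    save_holder(b, store);
    save_holder(c, store);

    Loaded loaded;
    Holder c2 = load_holder("C", store, loaded);
    Holder b2 = load_holder("B", store, loaded);
    return b2.a == c2.a && b2.a->value == 7;
}
```

<p>
For brevity the <code>Loaded</code> map holds strong references, which keeps every loaded
object alive for as long as the map exists.
</p>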
<p>
This type of late binding allows me to write utilities that operate on certain
objects, such as textures, without having to know which 3d models use them.
</p>
<h2>Using Calvin</h2>
<p>
Using Calvin in your own applications requires three steps. First, outfit your
classes so they may be persisted. Second, name your instances when they are
created. Last, write the code to load and save your objects in
the appropriate places in your application. Let's look at how these are done in
that order.
</p>
<h3>A Persistent Class</h3>
<p>
As nice as it would be to gain persistence with no additional code, C++ just
doesn't have the facilities necessary to do it. Some scaffolding is necessary
to achieve persistence.
</p>
<p>
The first question to ask yourself is whether the class must be made persistent
in the first place. While Calvin is lightweight, there is still some
overhead involved.  If so, will the class need its own name? If it is an object
potentially referenced by more than one object, then yes, it will need a name.
Otherwise, it may be persisted inside a containing object (as in Figure 2
above).
</p>
<p>
But what type to use for naming? <code>std::string</code> is my own
preferred type, but by virtue of template parameters, a name may be any
value type that may be converted to a string via the <code>&lt;&lt;</code> operator.
Integers meet these requirements and only add 4 bytes of overhead. And if you
keep your integer names as an <code>enum</code> within a single .h file, it is 
very quick and convenient too.  (The test/example program in the accompanying
source code gives an example of strings and ints as keys.)
</p>
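<p>
The requirement that a key be convertible to a string via <code>&lt;&lt;</code> can be
pictured with a small helper. <code>key_to_string</code> and the <code>TextureId</code>
enum are hypothetical names for this sketch; Calvin's own conversion code may look different.
</p>

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: turns any streamable key into the text used to name a record.
template <typename Key>
std::string key_to_string(const Key& key) {
    std::ostringstream out;
    out << key;                 // works for any type with an operator<<
    return out.str();
}

// Integer names collected in one place, as suggested above.
enum TextureId { kGrass = 1, kStone = 2 };
```

<p>
With this helper, <code>key_to_string(kStone)</code> yields <code>"2"</code> and a
<code>std::string</code> key passes through unchanged.
</p>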
<p>
If you're satisfied with your answers to these questions, then comes the
relatively painless part of making the class potentially persistent.
</p>
<pre>
// an example persistent class
// 1. include calvin.h
#include "calvin.h"

// 2. Inherit from calvin::persistent&lt;key&gt;
class A : public calvin::persistent&lt;std::string&gt; {
    int a;
    float a2;
    double a3;
    struct Aa {
        int a4;
        int a5;

        Aa(void) : a4(4), a5(5) {}
    };

    Aa a6;
public:
    // 3. Default constructor and constructor with the parameter of the key
    A( void ) : a(1), a2(2.0f), a3(3.0), a6(Aa())
    {
    }
    A( const std::string&amp; name ) :
        persistent&lt;std::string&gt;(name), a(1), a2(2.0f), a3(3.0), a6(Aa())
    {
    }
    virtual ~A( void )
    {
    }
protected:
    // 4. serialize method (used for both reading and writing)
    template &lt;typename Stream&gt;
    Stream&amp; serialize( Stream&amp; s, const unsigned int version )
    {
        return s ^ a ^ a2 ^ a3 ^ a6.a4 ^ a6.a5;
    }
private:
    // 5. friendship of allow_persistence
    friend struct calvin::allow_persistence&lt;A&gt;;
    // 6. version of the class
    static const int version_ = 1;
};
</pre>
<p>
As you can see, making a class persistent is simple. Six
alterations and your class is ready to be saved to or loaded from an
<code>archive</code> (which we will discuss further below).
</p>
<ol>
<li><code>calvin.h</code> contains all the code necessary to make a class
persistent.<br></li>
<li>The class itself must then inherit from <code>calvin::persistent</code>
with the template parameter being the key type that names the object. Requiring
inheritance rather than just using convention does two things; it provides the
necessary base variable <code>_name</code> , as well as type information used
by the library to know how to handle objects, whether they should be written as
part of their containing object or handled as independently named objects.</li>
<li>The class is required to have a default constructor (one that takes no
arguments) and a constructor that takes a single parameter
of the type used as a key. These are used by the library when restoring objects
from persisted state.</li>
<li>The workhorse of persistence is the <code>serialize</code> method. The same
method may be used for saving and loading through the use of the overloaded
<code>^</code> operator. As well, the version parameter allows a class to
update and change while maintaining backward compatibility.  <i>Order is 
important in this method.</i></li>
<li>This friendship works around many issues and provides the Calvin library
with necessary access to members even if they are private. It's merely
for convenience, but is well worth the one line of code.</li>
<li>The version of the class, persisted with the class so that
backwards compatibility may be preserved.</li>
</ol>
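<p>
How a single <code>serialize</code> method can serve both directions is easier to see in
isolation. The sketch below is an assumption about the general technique, not Calvin's
implementation: <code>Writer</code>, <code>Reader</code> and <code>Point</code> are invented,
and the version check is one possible way to use the <code>version</code> parameter.
</p>

```cpp
#include <sstream>
#include <string>

// Hypothetical stream wrappers; Calvin's real stream types are not shown in the article.
struct Writer {
    std::ostringstream out;
};
struct Reader {
    std::istringstream in;
    explicit Reader(const std::string& data) : in(data) {}
};

// The same operator writes when given a Writer...
template <typename T>
Writer& operator^(Writer& w, const T& value) {
    w.out << value << ' ';
    return w;
}
// ...and reads when given a Reader.
template <typename T>
Reader& operator^(Reader& r, T& value) {
    r.in >> value;
    return r;
}

struct Point {
    int x;
    double y;
    Point() : x(0), y(0.0) {}

    // One method for both directions. Fields are read back in the order written.
    template <typename Stream>
    Stream& serialize(Stream& s, unsigned int version) {
        s ^ x;
        if (version >= 2) s ^ y;   // y was added in version 2 of this example class
        return s;
    }
};

std::string save_point(Point& p, unsigned int version) {
    Writer w;
    p.serialize(w, version);
    return w.out.str();
}

Point load_point(const std::string& data, unsigned int version) {
    Reader r(data);
    Point p;
    p.serialize(r, version);
    return p;
}
```

<p>
Because the compiler picks the overload from the stream type, the member list is written
once and cannot drift between the save path and the load path.
</p>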
<p>
There are some additional features of the library that can be used in place
of the steps above, usually to solve specific problems that the more general
features might not allow.
</p>
<p>
It is a matter of convenience that the library allows you to use a single
method for saving and loading objects. Sometimes this is not practical or even
possible. In this case, the <code>serialize</code> method may be split into two,
as follows:
</p>
<pre>
std::ostream&amp; serialize( std::ostream&amp;, unsigned int );
std::istream&amp; serialize( std::istream&amp;, unsigned int );
</pre>
<p>
Though it probably goes without saying, the method taking a
<code>std::ostream&amp;</code> as its parameter is the one called for saving,
and the method taking a <code>std::istream&amp;</code> parameter is the one
used for loading. Also note the lack of the template parameter. Each may
perform the operations necessary to save or restore the object. These methods
should use the <code>^</code> operator as used in the generic method example
rather than the normal <code>&gt;&gt;</code> and <code>&lt;&lt;</code> operators.
</p>
<p>
To demonstrate the other features, let's examine a class that builds on the
class A above.
</p>
<pre>
struct made_of_prims {
    int i;
    float f;
    double d;
};

// 1. An unnamed yet persisted class
class persistent_void_test : public calvin::persistent&lt;void&gt; {
      std::string msg;
friend struct calvin::allow_persistence&lt;void&gt;;
public:
    persistent_void_test( void ) : msg( "I'm a calvin::persistent&lt;void&gt;" ) {}
    template&lt;typename Stream&gt;
    Stream&amp; serialize( Stream&amp; s, unsigned int version )
    {
        return s ^ msg;
    }
};

// 2. Subclass of a persisted class
class B : public A {
public:
    B( void ) : A(), b1(0), b2(1), stupid(NULL) {}
    B( const std::string&amp; name ) :  A(name), b1(0), b2(1), stupid(NULL) {}
    B( const std::string&amp; name, const char* stupid ) :
    A(name), b1(0), b2(1), stupid(stupid)
    {
        return;
    }
    virtual ~B( void ) {}
    void add( made_of_prims&amp; p)
    {
        vec_of_prims.push_back( p );
    }
private:
    int b1;
    unsigned int b2;
    std::vector&lt;made_of_prims&gt; vec_of_prims;
    persistent_void_test sm;
    const char* stupid;
    template &lt;typename Stream&gt;
    Stream&amp; serialize( Stream&amp; s, const unsigned int version )
    {
        // 3. call base class serialize method directly
        return A::serialize( s, version ) ^
        b1 ^
        b2 ^
        // 4. STL containers supported directly
        vec_of_prims ^
        sm ^
        // 5. PtrArray used to persist arrays
        PtrArray&lt;const char&gt;( stupid, (stupid == NULL) ?
        0 : (unsigned int) strlen(stupid)+1 /*null terminator*/);
    }
// 6. Friendship still needs to be granted even in subclasses
friend struct calvin::allow_persistence&lt;B&gt;;
};
</pre>
<p>
The above example highlights some additional features and requirements of the
library when making your class persistent.
</p>
<ol>
<li>The type <code>calvin::persistent&lt;void&gt;</code> is a special type that
allows a class to not have a name, but still have a <code>serialize</code> method be called.
Since the object does not have a name, it is stored inside its containing
object, and therefore may not be the base class for a shared object.</li>
<li>Calvin works with all the objects in your hierarchy no matter how
deeply subclassed.</li>
<li>To make this work, call each base class's <code>serialize</code> method
directly, before serializing your own members.</li>
<li>Some of the STL containers are automatically handled. As of this
writing, Calvin knows about <code>vectors</code> , <code>lists</code> , and
<code>deques</code>. Other container types should be trivial to add and will
be as necessity dictates.</li>
<li>Use the helper <code>PtrArray</code> to handle arrays. This is to give
Calvin a size to use since it can't be known from the array itself.  STL
containers such as <code>vector</code> are preferred by Calvin. </li>
<li>Even in subclasses, friendship must be explicitly granted since
friendship is not transferred to subclasses.</li>
<li>An additional note, not illustrated above, is that
<code>boost::shared_ptr</code> is supported. <code>boost::shared_array</code>
is not yet supported.</li>
</ol>
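<p>
The reason a raw array needs a helper is that a pointer carries no length. A minimal
sketch of the idea follows; <code>counted_array</code>, <code>write_chars</code> and
<code>write_c_string</code> are hypothetical and only imitate the role that
<code>PtrArray</code> plays, with characters written as integers to keep the output readable.
</p>

```cpp
#include <cstddef>
#include <cstring>
#include <sstream>
#include <string>

// Hypothetical look-alike of a pointer-plus-count wrapper; not Calvin's PtrArray.
template <typename T>
struct counted_array {
    const T* data;
    std::size_t count;
    counted_array(const T* d, std::size_t n) : data(d), count(n) {}
};

// Write the count first so that a reader knows how many elements follow.
void write_chars(std::ostream& out, const counted_array<char>& a) {
    out << a.count;
    for (std::size_t i = 0; i < a.count; ++i) {
        out << ' ' << static_cast<int>(a.data[i]);
    }
}

std::string write_c_string(const char* s) {
    std::ostringstream out;
    // A null pointer becomes an empty array; otherwise the terminator is included.
    std::size_t n = (s == NULL) ? 0 : std::strlen(s) + 1;
    write_chars(out, counted_array<char>(s, n));
    return out.str();
}
```

<p>
Writing the count ahead of the elements is what lets the loading side allocate or check
space before reading.
</p>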
<h4>Calvin and Types</h4>
<p>
The examples above demonstrate how most types of data are supported, but here
is a more thorough reference to how Calvin handles members of different data 
types in a <code>serialize</code> method.
</p>
<ul>
<li>
A value type, such as plain old data (meaning a built-in data type) or a
structure of value types, is stored in the record of the named object (see
<code>calvin::persistent&lt;void&gt;</code> below for structures that are not
collections of value types).
</li>
<li>
A pointer is dereferenced and the value stored in the record as is.  Upon loading,
if the pointer is NULL, it is allocated with <code>new</code> and then restored,
otherwise it is assumed to point to an allocation of enough size to take the value.
</li>
<li>
A pointer to a pointer (to a pointer, etc.) is dereferenced ad nauseam until the
value type is reached.  Calvin must assume that space has already
been allocated.  This can be done in the <code>serialize</code> method before reading the
pointer if necessary.
</li>
<li>
An array of known size (declared as <code>T[#]</code>) is stored in the record
of the object, its count stored with it.  On a load, if the number in the stream
is greater than the declared size, an error is signaled.
</li>
<li>
A <code>boost::shared_ptr</code> to a non-Calvin object is treated as a pointer.
</li>
<li>
<code>string</code> is handled as a value type rather than just as an 
array of characters.
</li>
<li>Standard C++ containers are handled element by element.  How and where 
each element is handled is determined by its type.  For now only
vectors, lists, and deques are supported, though using the code in <code>calvin.h</code> as
a template (no pun intended), any standard container should be simple to persist.
</li>
<li>
<code>calvin::persistent</code> value objects with a key equal to a default value
(using the special template construct T()) have their serialize method
invoked with the same stream as their containing object so that they are persisted
in the same record.
</li>
<li>
A structure or class that contains a pointer or container cannot be considered
a value type and therefore cannot be persisted without inheriting from
<code>calvin::persistent</code>.   It is not always desirable for a persistent
class to carry the overhead of a key though.  The <code>calvin::persistent&lt;void&gt;</code>
class was created for this purpose.  A structure that inherits from 
<code>calvin::persistent&lt;void&gt;</code> has a <code>serialize</code> method that is 
called to handle its members, but doesn't have the overhead of a key.  Such structures are
persisted to and from the store of the containing object.
</li>
<li>
Named objects must be stored using <code>boost::shared_ptr</code>.  This is so that
Calvin can return pointers to already loaded objects and track them to know
when an object's lifetime is expired.
</li>
</ul>
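<p>
The rule for plain pointer members in the list above (allocate when NULL, otherwise reuse
the existing storage) can be sketched as follows. <code>load_through_pointer</code> is a
hypothetical function written for this illustration and handles only <code>int</code>.
</p>

```cpp
#include <cstddef>
#include <sstream>

// Hypothetical loader showing the pointer rule described above.
// If p is null, storage is allocated; otherwise the existing storage is reused.
// Returns true when a new allocation was made (the caller then owns it).
bool load_through_pointer(std::istream& in, int*& p) {
    bool allocated = false;
    if (p == NULL) {
        p = new int(0);
        allocated = true;
    }
    in >> *p;                   // restore the value into whatever p points at
    return allocated;
}
```

<p>
A pointer that is neither NULL nor valid cannot be detected this way, which is why
initializing pointer members in the default constructor matters.
</p>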
<p>
Congratulations on having made it this far.  Enabling classes for persistence is
the most complicated part of using the Calvin library.  Just persist a little
more.
</p>
<h3>"What's in a name?"</h3>
<p>
Instances to be persisted need to be named.  Naming your instances is as 
straightforward as the snippet at the beginning of
the article shows. Simply invoke the object's constructor that takes a name when
the object is created, or use the <code>set_name</code> function later to give 
it a name.  That's it.
</p>
<p>
What are legal names for instances?  As mentioned above, it can be any value
type that can be converted to a string via the &lt;&lt; operator.  Also, the
name must be compatible with the archive selected.  See the documentation for
an <code>archive</code> type for what is considered legal.
</p>
<p>
Names must be unique for each instance.  They distinguish instances
within the store and the program.  As persistent objects are created or
loaded, their names are added to a registry.  When a second attempt to load
an object by its name is attempted, the original instance is returned instead
of a new instance.  Calvin reports an error when an object is created
with a name already used (see Error Handling in Calvin below).
</p>
<p>To make unique instance names possible, Calvin works with <code>boost::shared_ptr</code>
exclusively.  This way named objects have an automatic tracking mechanism, so
that once an object has been deleted, Calvin knows to load it again at the next request.
Why doesn't Calvin just keep a permanent reference?  Memory issues mostly, but
I think that a library shouldn't do things such as that behind the curtains when
the facility of another excellent library can already do it with minimal fuss.
</p>
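<p>
One way to get this behaviour is a registry of weak pointers. The sketch below is an
assumption about the technique rather than Calvin's code: <code>Registry</code> and
<code>Texture</code> are invented, <code>std::shared_ptr</code> and <code>std::weak_ptr</code>
stand in for the Boost types, and the "load" simply constructs an object where a real archive
would read a record.
</p>

```cpp
#include <map>
#include <memory>
#include <stdexcept>
#include <string>

// Hypothetical named object.
struct Texture {
    std::string name;
    explicit Texture(const std::string& n) : name(n) {}
};

// Hypothetical registry that tracks live objects without keeping them alive.
class Registry {
    std::map<std::string, std::weak_ptr<Texture> > live_;
    int loads_;
public:
    Registry() : loads_(0) {}

    // Creating a second live object with a name already in use is an error.
    std::shared_ptr<Texture> create(const std::string& name) {
        if (!live_[name].expired()) {
            throw std::runtime_error("name already in use: " + name);
        }
        std::shared_ptr<Texture> t = std::make_shared<Texture>(name);
        live_[name] = t;
        return t;
    }

    // Return the live instance if one exists, otherwise "load" a fresh one.
    std::shared_ptr<Texture> load(const std::string& name) {
        std::shared_ptr<Texture> t = live_[name].lock();
        if (!t) {
            t = std::make_shared<Texture>(name);  // a real archive would read the record here
            live_[name] = t;
            ++loads_;
        }
        return t;
    }

    int loads() const { return loads_; }
};
```

<p>
Because the registry holds only weak references, an object disappears as soon as its last
owner lets go, and the next <code>load</code> for that name goes back to the store.
</p>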
<h3>"Where do you want this?"</h3>
<p>
Where do these objects go when they are persisted?  They go into archives.
Archives are collections of named objects.  Archives can conceivably be the
front end for any type of store such as files or a database.  For now, the
only implemented archive is the <code>file_archive</code>.  <code>file_archive</code>
has a template parameter that must match the key type of the objects to be 
written and read using it.  A convenient <code>filesys_archive</code> is declared
as a <code>file_archive&lt;string&gt;</code>.
</p>
<p>
As its name implies, the <code>filesys_archive</code> simply stores each object in a file
named after the name given to it in the program.  For this reason, when using a 
<code>filesys_archive</code>, names must consist of only valid filename characters, which 
depend on your particular operating system.  A good rule of thumb is to stick
with alphanumeric, '.', and '_' if using .
</p>
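<p>
The rule of thumb can be checked before an object is ever named. <code>is_portable_name</code>
is a hypothetical helper, not something Calvin provides, and it is deliberately stricter than
most file systems require.
</p>

```cpp
#include <cctype>
#include <string>

// Hypothetical check, not part of Calvin: accept only letters, digits, '.' and '_'.
bool is_portable_name(const std::string& name) {
    if (name.empty()) return false;
    for (std::string::size_type i = 0; i < name.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(name[i]);
        if (!std::isalnum(ch) && ch != '.' && ch != '_') return false;
    }
    return true;
}
```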
<p>
To use a <code>filesys_archive</code> (or <code>file_archive</code>), include 
<code>fs_archive.h</code> in the appropriate .cpp file and declare your archive 
with a single string parameter representing the root directory, as follows:
</p>
<pre>
    filesys_archive ar( "../data" );
</pre>
<p>
To initiate a save requires a <code>boost::shared_ptr</code> to a named
object.
</p><pre>    boost::shared_ptr&lt;MyPersistentClass&gt; p( new MyPersistentClass( "p" ) ); // "p" is the name of p
    
    // ... do something here with p ...

    ar.save( p );
</pre>
<p>
After running the above, you should see a file named "p" in the directory "data".
</p>
<p>
To use the object later, an archive loads an object given a name and returns
a <code>boost::shared_ptr</code> to the object.
</p>
<pre>
     boost::shared_ptr&lt;MyPersistentClass&gt; p;
     p = ar.load&lt;MyPersistentClass&gt;( "p" );
</pre>

<h3>Error Handling in Calvin</h3>
<p>
Exceptions or error codes?  That seems to be the question.  Exceptions are the 
standard error reporting mechanism in C++, but there are many valid 
reasons for avoiding them.  In this regard, I took the route that the Boost libraries
did and give the programmer an option. 
</p>
<p>
Depending on the value of the <code>NO_CALVIN_EXCEPTIONS</code> macro at compile time, either 
an exception (of type <code>calvin_exception</code>) is thrown, or the 
function <code>calvin::throw_exception</code> is called.  If the function option is
chosen, then the programmer must define a function by that name to handle any 
errors.  The function should take a single parameter of type <code>calvin_exception</code>.
</p>
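<p>
The pattern is the same one Boost uses for <code>boost::throw_exception</code>. The sketch
below uses invented names (<code>demo_exception</code>, <code>DEMO_NO_EXCEPTIONS</code>, the
<code>demo</code> namespace) so as not to guess at Calvin's actual declarations.
</p>

```cpp
#include <exception>
#include <string>

// Hypothetical names throughout; they mirror the pattern described above
// but are not Calvin's declarations.
class demo_exception : public std::exception {
    std::string what_;
public:
    explicit demo_exception(const std::string& what) : what_(what) {}
    const char* what() const noexcept override { return what_.c_str(); }
};

namespace demo {
#ifdef DEMO_NO_EXCEPTIONS
// With exceptions disabled, the user must define this function somewhere.
void throw_exception(const demo_exception& e);
#else
inline void throw_exception(const demo_exception& e) { throw e; }
#endif

// Library code reports every error through the single hook.
inline void check_name_free(bool already_used) {
    if (already_used) throw_exception(demo_exception("name already in use"));
}
}
```

<p>
Routing every error through one function means the library source never contains a bare
<code>throw</code>, so it still compiles when exceptions are switched off.
</p>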
<p>
<code>calvin_exception</code> is a subclass of <code>std::exception</code>, and
stores the cause in its <code>what()</code> string.  Call <code>what()</code> on
the exception object to see the error string.
</p>
<h2>To the Persistent Go the Spoils!</h2>
<p>
That's it for using Calvin.  I've been using Calvin in my own applications for 
several months now with nary a problem. Then again, I know it and tend to 
perhaps skirt its warts unintentionally.  It should be considered 0.1 software.
I would welcome any bug fixes.
</p>
<p>
Calvin isn't done yet though.  Current plans to extend it are to include a 
zip archive and an xml archive.  The next article will be a tutorial on creating 
new archives by subclassing stream buffers.  If you'd like a deeper explanation
on the inner workings of Calvin, <a href="mailto:articles@hobbit-hole.org">e-mail</a>
me and I might write up another article talking about the template type matching
that makes Calvin work.
</p>
<p>
In the meantime, peruse the source code.  It's compact, simple, and pretty 
straightforward.  I've included it here and written this article so that hopefully
you can use it and extend it to suit your needs.  The example/test program 
included with this article requires that Boost version 1.31 or later be
installed in the calvin subdirectory.  Also, the Boost filesystem library
should be built and available to the test program for linking.
</p>
<p>
Calvin is copyright by myself, Jay Kint, and licensed for use under the <a href="http://www.boost.org/LICENSE_1_0.txt">Boost 
license</a>.  It should be considered "as-is" and no warranty is intended or implied.
</p>
<h2>Footnotes</h2>
<p>
<sup>1</sup>A value type is one that isn't a pointer or a reference.  If you
can assign from one variable to a second variable, assign something else to the
first variable, and the second variable hasn't changed, then the type of those
variables is considered a <em>value type</em>.  For example:
</p><pre>// value type pseudo code
T x, y;
x = something;
y = x;
x = something_else;
print x;
print y;
</pre>
<p>
If T is a value type, then <code>x</code> should print "something_else" and 
<code>y</code> should print "something".  If <code>x</code> and <code>y</code>
 print "something_else", then T is not a value type.
</p>
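<p>
In C++ terms the test looks roughly like this. The function names are invented, and
<code>std::shared_ptr&lt;int&gt;</code> is used as the non-value example, with the change
made to the pointed-to <code>int</code> rather than to the pointer itself.
</p>

```cpp
#include <memory>

// Applies the footnote's test: copy x into y, change x, and see whether y moved too.
bool int_acts_as_value() {
    int x = 1;
    int y = x;
    x = 2;
    return y == 1;              // y kept the old value
}

// A shared_ptr copy refers to the same int, so a change made through x shows up in y.
bool shared_target_acts_as_value() {
    std::shared_ptr<int> x = std::make_shared<int>(1);
    std::shared_ptr<int> y = x;
    *x = 2;
    return *y == 1;             // y sees the new value, so this test fails
}
```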
<h2>References</h2>
<ol>
<li>
<a href="http://www.boost.org/">Boost</a>.  Specifically the serialization library,
type traits, shared pointers, and filesystem libraries.  Lots of other good things are
there as well.  If you don't know the boost libraries, check them out.
</li>
<li>
<a href="http://eternity-it.sf.net/">Eternity</a>.  Great little persistence library
that was my first choice when choosing one, until I started running into the 
limitations that prompted Calvin.
</li>
<li>
<a href="http://www.microsoft.com/msj/archive/S385.aspx">Holub, Allen; MSJ, June 1996</a>.
Every persistence library I've run across has mentioned this article as
a source.  Worth the read just for an idea of what should be in a persistence
library whether you're a user or designer.
</li>
</ol>
<h2>History</h2>
<p>1/29/2005 - first draft posted for review</p>
<p>2/5/2005 - second draft posted for review</p>
<p>2/8/2005 - article submitted to CodeProject</p>
</body></html>


License

This article, along with any associated source code and files, is licensed under The Microsoft Public License (Ms-PL)


Written By
Web Developer
United States
Jay Kint has been a software engineer/hacker for most of his life, starting with 8 bit Ataris. After a lengthy stint in the game industry, he now works at Microsoft in SQL Server.
