|
The serialization scheme described here is independent of MS macros. It happens to use MFC classes like CArchive and CFile but can (and has) easily be ported to other platforms.
The intent was to explain serialization as an independently implementable feature, and show that it can be used for any object.
/ravi
"There is always one more bug..."
http://www.ravib.com
ravib@ravib.com
|
|
|
|
|
Is using nVersion, and the approach you have discussed, a viable option in CDocument-derived classes? I see that CDocument doesn't have the call:
IMPLEMENT_SERIAL (CMyObject, CObject, VERSIONABLE_SCHEMA|1)
So I'm looking for another way to implement versions in my document, and this looks like what I need. Is this correct?
-dz
|
|
|
|
|
Yes. The serialization scheme described here can be used to serialize any kind of object. Remember, serialization is neither magical nor Microsoft-ish. It's just a way to safely save and restore data to and from persistent storage.
/ravi
|
|
|
|
|
Hi,
Your comment made me look deeper into the serialization process provided by MFC. At first I thought, great, VERSIONABLE_SCHEMA is exactly what I need. But after a closer look I came to the conclusion that it is a nice feature that does not help in situations more complex than simple object serialization.
I found a wonderful explanation of the problems I faced when using this feature at
http://archive.devx.com/free/mgznarch/vcdj/1997/nov97/serial5.asp
The point is that the versioning provided by MFC doesn't work in object hierarchies with independent versioning and, second, that you cannot use this feature when you call CMyObject::Serialize directly.
Dirk
|
|
|
|
|
luedi wrote:
I found a wonderful explanation of the problems I faced when using this feature at
http://archive.devx.com/free/mgznarch/vcdj/1997/nov97/serial5.asp
The point is that the versioning provided by MFC doesn't work in object hierarchies with independent versioning and, second, that you cannot use this feature when you call CMyObject::Serialize directly.
I could be wrong, but I believe the issue mentioned in that article was fixed in VC++ 6. As for not being able to call Serialize directly, that is documented and has to do with the process of reading/writing an object. If you serialize an object via
myObject.Serialize(ar)
it only performs the operations in that function. However, if you store the object via
ar << pmyObject;
the WriteObject function is called, which writes the schema for the object to the file before calling Serialize. Loading with ar >> pmyObject goes through ReadObject, which reads that schema back before calling Serialize.
Zac
"If I create everything new, why would I want to delete anything?"
|
|
|
|
|
The versionable schema doesn't work properly in VC6.0. Forget about using it. Using your own schema variable is a much better solution, but you should use it for the first version of the class as well.
For the rest, I stick to the standard VC serialization functions. They work reasonably well, and I don't see any use in rewriting the MFC functions and macros yourself.
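A minimal sketch of the "own schema variable" idea, assuming plain C++ streams rather than CArchive. The class Settings, its members, and the version numbers are hypothetical and chosen only for illustration. The point it shows is that the version is written as the first field from version 1 onward, so a later reader can always tell which layout follows.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <stdexcept>

// Hypothetical class, used only to illustrate a hand-rolled schema number.
struct Settings {
    static constexpr std::uint32_t kCurrentVersion = 2;

    std::int32_t width = 0;    // present since version 1
    std::int32_t height = 0;   // added in version 2

    // Always store the current layout, preceded by its version number.
    void save(std::ostream& out) const {
        const std::uint32_t version = kCurrentVersion;
        write(out, version);
        write(out, width);
        write(out, height);
    }

    // Read the version first, then only the fields that version contains.
    void load(std::istream& in) {
        std::uint32_t version = 0;
        read(in, version);
        if (version < 1 || version > kCurrentVersion)
            throw std::runtime_error("unsupported Settings version");
        read(in, width);
        if (version >= 2)
            read(in, height);
        else
            height = 0;        // default for data written by version 1
    }

private:
    template <typename T>
    static void write(std::ostream& out, const T& value) {
        out.write(reinterpret_cast<const char*>(&value), sizeof value);
    }

    template <typename T>
    static void read(std::istream& in, T& value) {
        in.read(reinterpret_cast<char*>(&value), sizeof value);
        if (!in) throw std::runtime_error("truncated stream");
    }
};
```

Because version 1 already carried the number, a version 2 reader needs no guesswork to recognize old data; that is the reason to add the field before the first release rather than after the first change.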
|
|
|
|
|
This is a good article, and explains well how serialization works. Perhaps what I'm posting is outside the scope of the article (more of a design decision than a technical how-to), but it is something I have run into when serializing complex objects. While this does allow a nice, neat object-oriented way to save/restore objects, it can be inflexible. As you have shown, you can add/remove members from a class and use a form of versioning to allow backwards compatibility. But there are some things that are difficult or impossible:
Forward compatibility. Many people may not care, but serializing in this way essentially changes your file format every time the objects change.
You become locked into an object structure. Changing or adding members here and there works, but what if something major happens - e.g., a base class changes? Or you find you have two classes that didn't derive from a base class before, but now you want to break out a base class?
Perhaps a part 4 to this series: should you use serialization or not? Just my two cents...
The early bird may get the worm, but the second mouse gets the cheese.
|
|
|
|
|
Navin wrote:
Many people may not care, but serializing in this way essentially changes your file format every time the objects change.
Yes. The serialization scheme presented in this series asserts that a properly serialized object will always be stored (according to the most current schema). This is expected behavior.
Navin wrote:
what if something major happens - e.g., a base class changes?
This is gracefully handled, as shown in Part 3 of the tutorial. Part 3 is crying out for source code! I'll update the article in a few weeks.
IMHO, I don't see a danger in structured serialization. If this is not what you want to do, then an alternative approach (e.g., saving fragmented information in an .INI file) may be employed. But if you want to save/restore data without having to worry about backward compatibility of objects, then serialization (as presented here) should work fine.
Aside: This scheme has been used successfully in products (whose class structures have evolved over time) that are distributed in very large quantities (millions). The scheme is very robust (i.e. almost 100% of run-time read/write errors have been caught and signalled by the app).
/ravi
|
|
|
|
|
Ravi Bhavnani wrote:
what if something major happens - e.g., a base class changes?
This is gracefully handled, as shown in Part 3 of the tutorial. Part 3 is crying out for source code! I'll update the article in a few weeks.
The biggest problem is, IMHO, backwards compatibility issues.
I may not have gotten my point across with this one. I am talking about this case:
1. You design a class... say, CFoo. It has a serialize function. All is good.
2. Later on, you design a class, CBar and do its serialization.
3. Somebody discovers that CFoo and CBar have a lot of common functionality, and that a base class, CBaz, should be created for both.
It seems this cannot be done in a graceful manner and still support backwards compatibility. Once you start serializing the CBaz as part of CFoo and CBar, the whole schema is broken. You have to do something odd like serialize CBaz's members from the CFoo and CBar classes, which breaks encapsulation. More complicated scenarios would be more difficult to solve.
I guess my point is that serialization works and is robust, but if you care about backwards and forwards file compatibility, it can be fairly inflexible since file formats are tied to your object structure.
You are right in that it will work great if compatibility with previous/future versions is not a problem. But alas, some of us don't have that luxury...
|
|
|
|
|
You're right. A change of this magnitude will not be automagically supported. There are many workarounds, one of which is defining new objects CFoo2 and CBar2.
When you said a base class changed, I thought you meant the composition of CFoo's base class may have changed. These kinds of mods are supported, but I'm sure you knew that!
/ravi
|
|
|
|
|
Navin wrote:
forwards file compatibility
Yes, only backward compatibility is supported. I don't know of any apps that can read new versions of data stores, although this can be done by using dynamic schemas (eg: XML).
/ravi
|
|
|
|
|
Our product does this just fine.
Of course, there is no magic bullet. Even though build 90 of OmniServer might be able to read a build 98 data file, obviously any data pertaining to new features would be ignored.
Here is how our system works in general.
Each serialization stream generated by an object is encapsulated with a type identifier, a version, and a stream length. When the data is de-serialized (is that even a word?), the encapsulated serialization buffer is passed to the object. Since it is a requirement that any new data be added at the end of a serialization buffer, an older version of the product might only read part of the buffer being de-serialized. Thus, after the object is finished with the serialization buffer, the routine performing the de-serialization of the entire stream just uses the length of the encapsulated serialization buffer to locate the start of the next buffer. Thus, older versions can read newer data files.
But of course, there are SIGNIFICANT limitations. When we did V2 of our product, we decided to rework the serialization buffers and didn't support V1 reading V2 data files.
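The framing described above can be sketched roughly as follows. This is not OmniServer's actual format: the header layout, the names RecordHeader, appendRecord, packFields and readFirstFields, and the use of an in-memory byte buffer are all assumptions made for illustration. It shows an "old" reader that understands only the first field of each record and uses the stored length to jump to the next record, ignoring anything a newer writer appended.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <initializer_list>
#include <stdexcept>
#include <vector>

using Buffer = std::vector<unsigned char>;

// Hypothetical header placed in front of every object's serialized data.
struct RecordHeader {
    std::uint32_t type;     // which kind of object follows
    std::uint32_t version;  // layout version used by the writer
    std::uint32_t length;   // number of payload bytes after the header
};

// Pack 32-bit fields into a payload; new fields are only ever appended.
Buffer packFields(std::initializer_list<std::uint32_t> fields) {
    Buffer payload;
    for (std::uint32_t f : fields) {
        const unsigned char* p = reinterpret_cast<const unsigned char*>(&f);
        payload.insert(payload.end(), p, p + sizeof f);
    }
    return payload;
}

// Append one framed record (header followed by payload) to the stream.
void appendRecord(Buffer& out, std::uint32_t type, std::uint32_t version,
                  const Buffer& payload) {
    RecordHeader h{type, version, static_cast<std::uint32_t>(payload.size())};
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&h);
    out.insert(out.end(), p, p + sizeof h);
    out.insert(out.end(), payload.begin(), payload.end());
}

// An "old" reader: it knows only the first field of each payload and
// relies on the stored length to find the start of the next record.
std::vector<std::uint32_t> readFirstFields(const Buffer& in) {
    std::vector<std::uint32_t> values;
    std::size_t pos = 0;
    while (pos < in.size()) {
        if (in.size() - pos < sizeof(RecordHeader))
            throw std::runtime_error("truncated header");
        RecordHeader h;
        std::memcpy(&h, in.data() + pos, sizeof h);
        pos += sizeof h;
        if (in.size() - pos < h.length)
            throw std::runtime_error("truncated payload");
        if (h.length >= sizeof(std::uint32_t)) {
            std::uint32_t first;
            std::memcpy(&first, in.data() + pos, sizeof first);
            values.push_back(first);
        }
        pos += h.length;  // skip any bytes this reader did not understand
    }
    return values;
}
```

The reader never needs to know how many fields a newer version added, only that they come after the ones it understands, which is why the append-only rule matters.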
Tim Smith
I know what you're thinking punk, you're thinking did he spell check this document? Well, to tell you the truth I kinda forgot myself in all this excitement. But being this here's CodeProject, the most powerful forums in the world and would blow your head clean off, you've got to ask yourself one question, Do I feel lucky? Well do ya punk?
|
|
|
|
|
Yep, that's a dynamic schema like XML, which will only read what it can and safely ignore the rest.
Tim Smith wrote:
de-serialized (is that even a word?),
Sure is!
/ravi
|
|
|
|
|
Damn, we should have patented it way back in 1994. Then we would sue and collect money from everyone using XML!!!!
RIIIIIIIIIIIIIIGHT.....
Tim Smith
|
|
|
|
|
How do you store object references in such a scheme? Obviously, you can't just in-line the referenced object like MFC serialization does the first time a reference is seen. If you did, you'd be intermingling data from multiple objects. So, you either store a reference ID and add the object to a queue, or use a multi-pass serialization technique. In either case your data format would not end up being very nice for streaming over a thin pipe or progressive/partial loading.
My point is that there are tradeoffs in all approaches (as someone mentioned, "no silver bullets") and the greatest strength of the one described in the article is simplicity. Simple is good. Personally, I like a multi-pass approach since it provides an opportunity for all kinds of interesting optimizations.
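The "reference ID plus queue" option might look something like the sketch below. The types Node and FlatRecord and the function flatten are hypothetical, and actually writing the records to a file is left out. Each object is emitted as its own record, references are replaced by numeric IDs, and newly seen objects are queued so that they are written later instead of being in-lined.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical object type that refers to other objects by pointer.
struct Node {
    std::string name;
    std::vector<const Node*> refs;
};

// One object's data with every reference replaced by a numeric ID.
struct FlatRecord {
    std::string name;
    std::vector<std::size_t> refIds;
};

// Walk the object graph from root. records[i] describes the object with ID i.
std::vector<FlatRecord> flatten(const Node& root) {
    std::unordered_map<const Node*, std::size_t> ids;  // object -> assigned ID
    std::deque<const Node*> pending;                   // seen but not yet written
    std::vector<FlatRecord> records;

    ids[&root] = 0;
    pending.push_back(&root);

    while (!pending.empty()) {
        const Node* current = pending.front();
        pending.pop_front();

        FlatRecord rec;
        rec.name = current->name;
        for (const Node* ref : current->refs) {
            auto it = ids.find(ref);
            if (it == ids.end()) {
                // First time this object is seen: give it the next ID and
                // queue it, rather than in-lining its data here.
                it = ids.emplace(ref, ids.size()).first;
                pending.push_back(ref);
            }
            rec.refIds.push_back(it->second);
        }
        records.push_back(std::move(rec));
    }
    return records;
}
```

Cycles are harmless because an object already in the ID map is never queued twice. The cost is the one noted above: a reader cannot resolve a reference until the record with that ID has arrived.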
|
|
|
|
|
Navin wrote:
I may not have gotten my point across with this one. I am talking about this case:
1. You design a class... say, CFoo. It has a serialize function. All is good.
2. Later on, you design a class, CBar and do its serialization.
3. Somebody discovers that CFoo and CBar have a lot of common functionality, and that a base class, CBaz, should be created for both.
This is a perfect example of poor design. You will run into far more dangerous problems than serialization issues with this type of activity.
Navin wrote:
You are right in that it will work great if compatibility with previous/future versions is not a problem. But alas, some of us don't have that luxury...
If you don't have this luxury, it is time to go to your boss and suggest a complete overhaul of the software application. Implement a REAL design process and document the code, and you will not run into this issue.
Zac
|
|
|
|
|
Zac Howland wrote:
This is a perfect example of poor design. You will run into far more dangerous problems than serialization issues with this type of activity.
Not necessarily. This is a perfect example of the real world, where program requirements can change.
You can have the best design in the world, but there's NO WAY you can anticipate every possible future requirement. The danger with serialization here is that you may try to jimmy new functionality into existing classes/code when in reality, you need to re-work some of your stuff. My point is that serialization can make changing the class structure next to impossible.
Zac Howland wrote:
If you don't have this luxury, it is time to go to your boss and suggest a complete overhaul of the software application. Implement a REAL design process and document the code, and you will not run into this issue.
How is a complete overhaul of an app going to change whether or not you need to support old versions?
My point still stands - if you have to have backwards and/or forwards compatibility, MFC serialization is probably not a good choice.
You can pick your friends, and you can pick your nose, but you can't pick your friend's nose.
|
|
|
|
|
Navin wrote:
Not necessarily. This is a perfect example of the real world, where program requirements can change.
You can have the best design in the world, but there's NO WAY you can anticipate every possible future requirement. The danger with serialization here is that you may try to jimmy new functionality into existing classes/code when in reality, you need to re-work some of your stuff. My point is that serialization can make changing the class structure next to impossible.
I too program in the real world, and have run into problems when people before me designed applications poorly. Requirements do change, but if you have a design that is not flexible enough to accommodate change, you are screwed from the beginning. And the example you gave is just an example of someone not looking at what classes were there before adding a new one. If they were similar, he probably should have just added it to the previous class, or left them as separate entities.
Changing the class structure is only done when someone royally screwed up in the design process (or some marketing genius decides to get a case of feature-creep).
Navin wrote:
How is a complete overhaul of an app going to change whether or not you need to support old versions?
My point still stands - if you have to have backwards and/or forwards compatibility, MFC serialization is probably not a good choice.
I probably should have explained what I meant by "complete overhaul". I meant that you should talk to your boss, tell him that you want to completely revamp your application and that previous data files will no longer be supported in it. To accommodate previous versions, you can write a component (or set of components) to convert any of the required data from previous versions to the format your application will use. (I would suggest using XML for the conversion output so that you can pick and choose what data you want more easily.) MFC's serialization has no problem with backwards compatibility as long as you implement it correctly. Forwards compatibility is really an implementation problem (everything you write should be forwards-compatible).
My point is that if you properly design your application, changing data that is serialized to a file will not cause the problems you are talking about. There are many other solutions that are highly flexible ways of storing data (database, XML, INI, etc). All of these solutions have their place.
Zac
|
|
|
|
|
Zac Howland wrote:
Changing the class structure is only done when someone royally screwed up in the design process (or some marketing genius decides to get a case of feature-creep).
This is where we disagree. Changing the class structure is something that can - and should - happen as your program changes and evolves.
Zac Howland wrote:
My point is that if you properly design your application, changing data that is serialized to a file will not cause the problems you are talking about. There are many other solutions that are highly flexible ways of storing data (database, XML, INI, etc). All of these solutions have their place.
But we program in the real world. I'd rather use a more flexible format from the beginning (e.g., not serialization) than screw over all the customers every so often by not supporting their old formats.
|
|
|
|
|
Navin wrote:
This is where we disagree. Changing the class structure is something that can - and should - happen as your program changes and evolves.
I completely disagree here. You should be able to add to/remove data from a class, add a class etc. But as for changing the base class for existing classes? That falls under 1 of 2 categories 1) poor design 2) poor documentation for the programmer implementing the objects.
Navin wrote:
But we program in the real world. I'd rather use a more flexible format from the beginning (e.g., not serialization) than screw over all the customers every so often by not supporting their old formats.
Making sure you have conversion routines does not screw over your customers. Keeping around old "legacy" stuff is the reason we had DOS as the backbone for Windows all the way through Windows 98! In my book, that has always been a poor argument made by sales and marketing people. Storing data in a binary format until you completely rewrite the application to take advantage of new technologies or add new features is a perfectly viable option. At the same time, when you completely rewrite the app, it is perfectly acceptable to distribute conversion routines to allow the new app to store the data in whatever format it uses. Even if you use XML or INI, if you change the names or tags, or whether something is stored as an element or an attribute, you will be "breaking" previous versions. The point is to make sure that you design the application from the start so that you don't run into that problem, and to code properly when making changes to avoid this problem. The fact that data is serialized or stored in a database, or in an XML file, or an INI file makes no difference.
Zac
|
|
|
|
|
Zac Howland wrote:
That falls under 1 of 2 categories 1) poor design 2) poor documentation for the programmer implementing the objects.
Or 3, your program requirements changed. Which they always do, in ways you can't possibly imagine or anticipate in the future.
Zac Howland wrote:
The fact that data is serialized or stored in a database, or in an XML file, or an INI file makes no difference.
Yes it does. If you have an INI or XML file, you can easily have a routine that reads "legacy" files and puts the data into whatever objects you feel necessary. With serialization, this is very difficult to do: you would have to either keep around a set of old classes that do nothing but the serialization, or look up the internal binary format and pray that you read in your stuff correctly.
(Stand-alone conversion apps are pains in the butt, btw.)
My point being - if you don't think your program will ever change significantly, or backwards/forwards compatibility is not an issue, then object oriented MFC serialization is fine. Otherwise, use something more flexible.
|
|
|
|
|
Navin wrote:
Or 3, your program requirements changed. Which they always do, in ways you can't possibly imagine or anticipate in the future.
If your program requirements changed so dramatically that it affected your class structure that much, there are definitely some design issues there. You cannot predict the future, but you can plan for probable changes. And if it comes after the product is fully defined, then you are letting your marketing department have too much fun with you.
Navin wrote:
Yes it does. If you have an INI or XML file, you can easily have a routine that reads "legacy" files and puts the data into whatever objects you feel necessary. With serialization, this is very difficult to do: you would have to either keep around a set of old classes that do nothing but the serialization, or look up the internal binary format and pray that you read in your stuff correctly.
If you do it right, you don't have to worry about it. But writing a routine to populate your classes with data completely defeats the purpose of object oriented design.
Navin wrote:
(Stand-alone conversion apps are pains in the butt, btw.)
I usually don't write them as separate applications. I write a few classes to do the conversion and wrap them in an ActiveX control. That way I don't have to recompile it to use it in separate applications.
Navin wrote:
My point being - if you don't think your program will ever change significantly, or backwards/forwards compatibility is not an issue, then object oriented MFC serialization is fine. Otherwise, use something more flexible.
And mine is that all of the ways you can possibly store data have their pros and cons. Deciding which to choose is sometimes a matter of taste or preference, but more than likely, you can use any of them on any given project/product.
Zac
|
|
|
|
|
Navin wrote:
Forward compatibility. Many people may not care, but serializing in this way essentially changes your file format every time the objects change.
You become locked into an object structure. Changing or adding members here and there works, but what if something major happens - e.g., a base class changes? Or you find you have two classes that didn't derive from a base class before, but now you want to break out a base class?
If you have these problems, you really need to evaluate your design process. If you are versioning your objects correctly, you will not run into problem #1. Problem #2 comes with a bad (actually, horrible) design process.
Zac
|
|
|
|
|
Not quite. Problem 2 can come very easily with serialization. Serialization is what locks you into an object structure, since it is pretty much impossible to de-serialize *without* using the same object structure. The fact of life is that programs change, requirements change, and your object structure is bound to change as well. Some of the *worst* designs I've seen come about when somebody fails to realize you should break out a base class or change your class structure, and tries to make classes do tons of things they weren't originally designed to do.
As mentioned previously, if forward/backwards compatibility are requirements, a more dynamic method, such as XML, or even a plain INI file, is better suited.
|
|
|
|
|
Navin wrote:
Not quite. Problem 2 can come very easily with serialization. Serialization is what locks you into an object structure, since it is pretty much impossible to de-serialize *without* using the same object structure. The fact of life is that programs change, requirements change, and your object structure is bound to change as well. Some of the *worst* designs I've seen come about when somebody fails to realize you should break out a base class or change your class structure, and tries to make classes do tons of things they weren't originally designed to do.
No more than it can with any other method for saving data. If your design for an application does not allow you to change the object structure without a huge headache, you didn't spend enough time on design (which, in the "real-world" may not have been your fault -- you could have inherited the code, but the cause of the problem is still the same). When you implement a serialization scheme, you have to keep in mind that if you are going to change the objects, you MUST keep the original loading sequence intact and make modifications to a copy of it:
void CMyObject::Serialize(CArchive& ar)
{
    if (ar.IsStoring())
    {
        // Always store using the current (schema 2) layout.
        ar << data1 << data2;
    }
    else
    {
        // Load according to the schema the object was stored with.
        switch (ar.GetObjectSchema())
        {
        case 1:     // original layout: data1 only
            ar >> data1;
            break;
        case 2:     // current layout: data1 followed by data2
            ar >> data1 >> data2;
            break;
        default:    // unknown schema
            ASSERT(!"Problem with serialization!");
            break;
        }
    }
}
If you are changing the class frequently, it is time to look at a more flexible way of storing data. However, if you only need to make changes for major releases, this is acceptable in most cases.
I don't disagree that XML and INI are more flexible; however, properly implemented, Serialization does not suffer from the problems you described (it has other issues, as do XML and INI).
Zac
|
|
|
|
|