
A Fast Serialization Technique

Tim Haynes, 19 May 2006
Transparently boosting serialization performance and shrinking the serialized object's size.

Introduction

Serialization is everywhere in .NET. Every parameter you pass to or from a remoted object, web service, or WCF service gets serialized at one end and deserialized at the other. So why write about fast serialization? Surely, the standard BinaryFormatter and SoapFormatter are pretty quick, aren't they?

Well, no. When passing a reasonably substantial object from one process to another using Remoting, we found that performance was topping out at 300 calls per second. Investigation showed that each serialization/deserialization cycle was taking about 360 microseconds. That would be fine, except that at 300 calls per second it adds up to 108 ms of every second: 11% of the CPU consumed by serialization alone!

Background

Some form of custom serialization would be an option. An object knows exactly which fields it wants to serialize and what their types are. It doesn't need the general-purpose overheads and Reflection to work this out and extract the data; it can do it all by itself, much more efficiently. The result is generally much more compact, too. There is an example in .Shoaib's article, which demonstrates these benefits.

The problem with custom serialization is that the interface is different, requiring the calling code to be changed. It also doesn't help the automated serialization in .NET's remote access mechanisms, unless you manually serialize to a byte array and then pass this as a parameter. This isn't very type-safe!
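
To make the drawback concrete, here is a minimal sketch of the manual byte-array approach. The Customer class, its ToBytes/FromBytes methods and the ICustomerService interface are hypothetical names invented for this illustration; they are not part of the article's download.

```csharp
using System;
using System.IO;

// Hypothetical payload type, used only for this illustration.
public class Customer {
  public long   id;
  public string name;

  // Pure custom serialization: the object writes its own fields.
  public byte[] ToBytes () {
    using (MemoryStream ms = new MemoryStream ())
    using (BinaryWriter bw = new BinaryWriter (ms)) {
      bw.Write (id);
      bw.Write (name);   // BinaryWriter.Write(string) throws if name is null
      bw.Flush ();
      return ms.ToArray ();
    }
  }

  // The caller has to know that these bytes really hold a Customer.
  public static Customer FromBytes (byte[] data) {
    using (BinaryReader br = new BinaryReader (new MemoryStream (data))) {
      Customer c = new Customer ();
      c.id   = br.ReadInt64 ();
      c.name = br.ReadString ();
      return c;
    }
  }
}

// Hypothetical remoted interface: the parameter is just byte[], so the
// compiler cannot tell a Customer payload from any other byte array.
public interface ICustomerService {
  void Save (byte[] customerBytes);
}
```

Every caller now has to remember to call ToBytes and FromBytes, and nothing stops the wrong buffer being passed to Save.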

What I cover below is a simple way to keep the benefits of custom serialization while retaining the standard serialization interface and all the benefits that confers.

Using the code

As is often the case in matters of complex serialization, the solution lies in implementing the ISerializable interface (see here for a primer). Here's a much simplified version of the object we are using:

[Serializable]
public class TestObject : ISerializable {
  public long     id1;
  public long     id2;
  public long     id3;
  public string   s1;
  public string   s2;
  public string   s3;
  public string   s4;
  public DateTime dt1;
  public DateTime dt2;
  public bool     b1;
  public bool     b2;
  public bool     b3;
  public byte     e1;
  public IDictionary<string,object> d1;
}

To serialize an object, ISerializable requires us to implement GetObjectData to define the set of data to be serialized. The trick here is to use custom serialization to merge all the fields into a single buffer, then to add this buffer to the SerializationInfo parameter to be serialized by the standard formatters. This is how it's done:

// Serialize the object. Write each field to the SerializationWriter
// then add this to the SerializationInfo parameter

public void GetObjectData (SerializationInfo info, StreamingContext ctxt) {
  SerializationWriter sw = SerializationWriter.GetWriter ();
  sw.Write (id1);
  sw.Write (id2);
  sw.Write (id3);
  sw.Write (s1);
  sw.Write (s2);
  sw.Write (s3);
  sw.Write (s4);
  sw.Write (dt1);
  sw.Write (dt2);
  sw.Write (b1);
  sw.Write (b2);
  sw.Write (b3);
  sw.Write (e1);
  sw.Write<string,object> (d1);
  sw.AddToInfo (info);
}

The SerializationWriter class extends BinaryWriter to add support for additional data types (DateTime and Dictionary) and to simplify the interface to SerializationInfo. It also overrides BinaryWriter's Write(string) method to allow for null strings. I won't go into the implementation detail here. There is lots of explanation in the code for those who are interested.
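
For readers who don't want to open the download, the idea can be sketched in a few lines. This is not the article's SerializationWriter: SketchWriter, its flag-byte encoding for null strings, and the "data" key are all assumptions made for illustration, and dictionary support is left out.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical minimal writer; the real class in the download does more.
public class SketchWriter : BinaryWriter {
  public SketchWriter () : base (new MemoryStream ()) {}

  // Null-safe string: a flag byte says whether a value follows.
  public override void Write (string s) {
    Write (s != null);               // resolves to BinaryWriter.Write(bool)
    if (s != null) base.Write (s);
  }

  // DateTime travels as its 64-bit binary form (ticks plus Kind).
  public void Write (DateTime dt) {
    Write (dt.ToBinary ());          // resolves to BinaryWriter.Write(long)
  }

  // Everything written so far, as one buffer.
  public byte[] ToArray () {
    Flush ();
    return ((MemoryStream) BaseStream).ToArray ();
  }

  // Hand the merged buffer to the standard formatter as a single value.
  // The key name "data" is an arbitrary choice for this sketch.
  public void AddToInfo (SerializationInfo info) {
    info.AddValue ("data", ToArray ());
  }
}
```

The point of the design is that the formatter only ever sees one byte[] entry, instead of one named, typed entry per field.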

ISerializable also requires us to define a constructor to deserialize a stream to a new object. The process here is just as simple as that above:

// Deserialization constructor. Create a SerializationReader from
// the SerializationInfo then extract each field from it in turn.

public TestObject (SerializationInfo info, StreamingContext ctxt) {
  SerializationReader sr = SerializationReader.GetReader (info);
  id1 = sr.ReadInt64 ();
  id2 = sr.ReadInt64 ();
  id3 = sr.ReadInt64 ();
  s1  = sr.ReadString ();
  s2  = sr.ReadString ();
  s3  = sr.ReadString ();
  s4  = sr.ReadString ();
  dt1 = sr.ReadDateTime ();
  dt2 = sr.ReadDateTime ();
  b1  = sr.ReadBoolean ();
  b2  = sr.ReadBoolean ();
  b3  = sr.ReadBoolean ();
  e1  = sr.ReadByte ();
  d1  = sr.ReadDictionary<string,object> ();
}

Similarly, SerializationReader extends BinaryReader for the same reasons as above.
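
The reading side can be sketched the same way. Again, this is an illustration rather than the article's SerializationReader: SketchReader assumes a writer that prefixes every string with a one-byte "has value" flag, stores DateTime via ToBinary(), and puts the buffer into the SerializationInfo under the key "data".

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;

// Hypothetical minimal reader, the mirror image of a writer that flags
// null strings and stores DateTime values in their 64-bit binary form.
public class SketchReader : BinaryReader {
  public SketchReader (byte[] data) : base (new MemoryStream (data)) {}

  // Pull the single merged buffer back out of the SerializationInfo.
  // The key name "data" is an assumption; it must match the writer's.
  public static SketchReader GetReader (SerializationInfo info) {
    byte[] data = (byte[]) info.GetValue ("data", typeof (byte[]));
    return new SketchReader (data);
  }

  // Null-safe string: read the flag first, then the value if present.
  public override string ReadString () {
    return ReadBoolean () ? base.ReadString () : null;
  }

  public DateTime ReadDateTime () {
    return DateTime.FromBinary (ReadInt64 ());
  }
}
```

Fields must be read in exactly the order they were written, since the buffer carries no field names.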

Over time, I'll probably be extending the set of types which the writer and reader can handle efficiently. There are already WriteObject() and ReadObject() methods which will handle any arbitrary type, but these simply fall back to standard binary serialization unless the type is one of the supported fast types.
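
One common way to write an arbitrary object is to put a one-byte type tag in front of each value. The sketch below shows that idea only; the Tag enum and the TaggedIO class are hypothetical, and where the article's code falls back to standard binary serialization for unsupported types, this sketch simply throws so that it stays self-contained.

```csharp
using System;
using System.IO;

// Hypothetical type tags; the real code defines its own set.
public enum Tag : byte { Null = 0, Int32 = 1, Int64 = 2, String = 3 }

public static class TaggedIO {
  // Write a one-byte tag, then the value in its compact form.
  public static void WriteObject (BinaryWriter w, object o) {
    if (o == null)        { w.Write ((byte) Tag.Null); }
    else if (o is int)    { w.Write ((byte) Tag.Int32);  w.Write ((int) o); }
    else if (o is long)   { w.Write ((byte) Tag.Int64);  w.Write ((long) o); }
    else if (o is string) { w.Write ((byte) Tag.String); w.Write ((string) o); }
    else {
      // Not one of the fast types. The article's code would hand the
      // object to the standard formatter here; this sketch refuses.
      throw new NotSupportedException (o.GetType ().FullName);
    }
  }

  // Read the tag back and dispatch on it.
  public static object ReadObject (BinaryReader r) {
    switch ((Tag) r.ReadByte ()) {
      case Tag.Null:   return null;
      case Tag.Int32:  return r.ReadInt32 ();
      case Tag.Int64:  return r.ReadInt64 ();
      case Tag.String: return r.ReadString ();
      default: throw new NotSupportedException ("unknown tag");
    }
  }
}
```

The tag costs one byte per value, which is why writing a field through its typed Write overload is preferable whenever the type is known in advance.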

Results

The test program included in the download simply creates and populates the TestObject, and times its serialization and deserialization, in microseconds per cycle, averaged over 250K cycles. All timings are done on a 1.5GHz Pentium M laptop. The results are:
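
A measurement of this kind can be sketched with Stopwatch. This is not the test program from the download: CycleTimer and MicrosecondsPerCycle are hypothetical names, and the non-generic Action delegate and lambda syntax assume a framework newer than the one the article was written against.

```csharp
using System;
using System.Diagnostics;

public static class CycleTimer {
  // Run the cycle 'cycles' times and return the mean cost in microseconds.
  public static double MicrosecondsPerCycle (Action cycle, int cycles) {
    cycle ();                               // warm-up run, not timed
    Stopwatch sw = Stopwatch.StartNew ();
    for (int i = 0; i < cycles; i++) cycle ();
    sw.Stop ();
    return sw.Elapsed.TotalMilliseconds * 1000.0 / cycles;
  }
}
```

A caller would pass a delegate that serializes and then deserializes the object once, together with the cycle count (250,000 in the article's tests).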

Technique               Formatter  Size (bytes)  Time (µs)
Standard serialization  Binary     2080          364
Fast serialization      Binary     421           74
Fast serialization      SOAP       1086          308

So, the fast serialization technique described above cuts both the size and the serialization-deserialization time to about a fifth of the out-of-the-box figures (421 vs. 2080 bytes, 74 vs. 364 microseconds). Even with the SoapFormatter (normally 2 to 3 times slower than binary), it is faster than standard binary serialization.

Summary

Combining custom serialization with ISerializable in this way delivers major performance gains without any change to the handling of the objects in question. It allows fast serialization to be transparently added to specific objects where a performance issue has been identified.

In our own case, throughput increased from 300 Remoting calls per second to over 700, just by changing this for one key object. No other changes were necessary.

There is also one unexpected benefit. You'll notice that the results above include no figures for standard serialization with the SoapFormatter. That is because Microsoft has not equipped the SoapFormatter to handle generic types, so it cannot serialize TestObject's dictionary field directly. With the technique above, the SoapFormatter never sees the generic type, which has already been custom serialized into a byte array, so this restriction is removed.

Combining custom serialization with ISerializable is never going to be as fast as pure custom serialization alone. However, the added benefit of remaining within the standard serialization framework makes this a useful technique for boosting performance without impacting other code.

History

  • First version - 19 May 2006.

This is my first post on CodeProject - so please be gentle!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Tim Haynes
Web Developer
United Kingdom

Comments and Discussions

 
Great! Another performance suggestion (posted by rdehar, 22-Sep-08 9:35)
It is a great solution!
 
Also, I have a suggestion: I am not sure about the performance of switch(string), but it is probably not the best. So, would it be possible to provide overloaded methods that accept strongly typed reference/value types? At least for some, such as int, char, bool and long, which are the most commonly used:
 
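public void Write(int i) {
  Write ((byte)ObjType.int32Type);
  // Call BinaryWriter's Write(int) explicitly: inside a class derived from
  // BinaryWriter, a bare Write (i) would call this overload again and recurse forever.
  base.Write (i);
}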
 
This way the switch(string) and null check are both avoided.
 
Overall, great!!

Last Updated 19 May 2006
Article Copyright 2006 by Tim Haynes
Everything else Copyright © CodeProject, 1999-2014