Sometimes we have to serialize objects, e.g. to send them over a network or to store and restore them locally. After deserialization, it is useful to know whether the object has been restored correctly, especially if the object has internal state or if you must manage multiple instances of a class. One possible solution is to use the System.Guid struct to identify the objects. But a GUID alone cannot tell you whether the internal state was deserialized correctly (see Background for explanation).
A technique commonly used on the Internet is to publish an MD5 hash string alongside a file, so that the receiver can check whether the file has been transmitted without any modifications.
Background
The .NET Framework gives us a struct to uniquely identify our objects: the System.Guid struct in mscorlib.dll. This struct can be used to give each class its own identifier. And that's the crux of the matter: what we need is not an identifier for the class but an identifier for each instance of the class. Implicitly, this identifier must also reflect internal values such as state. Otherwise the recipient of the object cannot be sure that the object received and deserialized is the same one that was sent. Also, the recipient cannot "create" the GUID independently. Once it is created by the sender, it is not reproducible.
We must therefore provide functionality that both sender and recipient can execute to identify an object. The identifier must implicitly take into account the fields that are relevant for this object. And these relevant fields can be different for each class!
The idea I had was to use MD5 hashes for that. Each object has a built-in method called GetHashCode(). This method returns an int, although from the name of the method you might expect a string. That's because these hash values are intended to be used as keys in, e.g., a hash table. They are neither guaranteed to be unique nor guaranteed to be stable across processes, so they cannot serve as a fingerprint of an object. But fortunately, there exists a class named MD5CryptoServiceProvider in the System.Security.Cryptography namespace. Unfortunately, this class is not easy to use. The main problem for most programmers could be that the class only accepts a byte array as input and not a reference to an object. So I decided to wrap all the needed functionality into a generator class. This class can then generate the hash for me, and I have to write just one line of code.
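To make the byte-array requirement concrete, here is a minimal sketch that hashes a string by first encoding it as bytes. It obtains the hasher through MD5.Create() from the same namespace instead of constructing MD5CryptoServiceProvider directly. The names Md5Sketch and HashText are hypothetical and are not part of the article's code file.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper, for illustration only.
public static class Md5Sketch
{
    // MD5 works on bytes, so the text is encoded as UTF-8 first.
    public static string HashText(string text)
    {
        byte[] input = Encoding.UTF8.GetBytes(text);
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(input);
            // BitConverter yields "90-01-50-..."; strip the dashes.
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }
}
```

Calling Md5Sketch.HashText("abc") returns 900150983CD24FB0D6963F7D28E17F72, which is the MD5 test vector for "abc" from RFC 1321.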
Using the Code
The accompanying code file contains a class called MD5HashGenerator. This class has a static method, GenerateKey(Object sourceObject), which does the "magic" for you. Include the class in your project, and use it as follows:
To use the class (as a publisher), you have to do the following things:
- Mark the class with the [Serializable] attribute. Mark all fields which should not be serialized with [NonSerialized].
- Call MD5HashGenerator.GenerateKey(Object sourceObject). You get the MD5 hash for the object as a string.
- Serialize the object, then publish / store it together with the hash.
If you are the receiver, then:
- Deserialize the received object.
- Call MD5HashGenerator.GenerateKey(Object sourceObject) on the deserialized object.
- Compare the hashes.
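The publisher and receiver steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it swaps in System.Text.Json for the BinaryFormatter, because the BinaryFormatter is obsolete in current .NET versions, and System.Text.Json only sees public properties rather than private fields. The names Message, RoundTripSketch, Fingerprint, Publish and Verify are hypothetical.

```csharp
using System;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical message type; public properties so System.Text.Json can see them.
public class Message
{
    public string Text { get; set; } = "";
    public int Number { get; set; }
}

// Hypothetical helper, for illustration only.
public static class RoundTripSketch
{
    // MD5 hash of the object's serialized form, as an uppercase hex string.
    public static string Fingerprint(Message m)
    {
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(m);
        using (MD5 md5 = MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(bytes)).Replace("-", "");
        }
    }

    // Publisher side: returns the payload and hands back the hash to send along.
    public static byte[] Publish(Message m, out string hash)
    {
        hash = Fingerprint(m);
        return JsonSerializer.SerializeToUtf8Bytes(m);
    }

    // Receiver side: deserialize, hash again, compare with the published hash.
    public static bool Verify(byte[] payload, string expectedHash)
    {
        var received = JsonSerializer.Deserialize<Message>(payload);
        return string.Equals(Fingerprint(received), expectedHash, StringComparison.Ordinal);
    }
}
```

The receiver never needs the sender's object, only the payload and the hash string. If the payload was altered on the way, the recomputed hash no longer matches and Verify returns false.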
We want to serialize a class which has a string, an int and a DateTime member. The DateTime member is set at creation time, so it differs between instances created at different moments. As mentioned above, the class must be tagged as serializable. It could look like this:
using System;
[Serializable] // required by the BinaryFormatter
public class SimpleClass
{
    private string justAString;
    private int justAnInt;
    private DateTime justATime;
    public SimpleClass()
    {
        justAString = "Some useless text";
        justAnInt = 345678912;
        justATime = DateTime.Now;
    }
}
Because we use DateTime.Now to initialize the field justATime, instances created at different moments differ from each other. It is important to mark the class as Serializable, because the BinaryFormatter requires this.
The generator class uses the BinaryFormatter for serialization, so all fields (whether they are private or not) are automatically included in the serialization process. But exclude handles and pointers, if you are using them. See  for details.
The class which "publishes" the object must then do the following things:
SimpleClass simpleObject = new SimpleClass();
string simpleObjectHash = MD5HashGenerator.GenerateKey(simpleObject);
Now the "consumer" can deserialize the SimpleClass instance and also call MD5HashGenerator.GenerateKey(simpleObject) on the deserialized object. The consumer can then compare the hash strings and decide whether it is the same object.
How It Works
The code of the MD5HashGenerator.GenerateKey(Object sourceObject) method looks like this:
public static String GenerateKey(Object sourceObject)
{
    String hashString;
    if (sourceObject == null) // first argument is the parameter name
        throw new ArgumentNullException("sourceObject", "Null as parameter is not allowed");
    try
    {
        hashString = ComputeHash(ObjectToByteArray(sourceObject));
    }
    catch (AmbiguousMatchException ame)
    {
        throw new ApplicationException("Could not definitely decide if object is serializable. Message: " + ame.Message);
    }
    return hashString;
}
Let's have a deeper look at the following line of code:
hashString = ComputeHash(ObjectToByteArray(sourceObject));
As mentioned above, I used the MD5CryptoServiceProvider class to generate the hash string. I encapsulated its use in the ComputeHash(byte[] objectAsBytes) method. Here's the implementation:
private static string ComputeHash(byte[] objectAsBytes)
{
    MD5 md5 = new MD5CryptoServiceProvider();
    try
    {
        byte[] result = md5.ComputeHash(objectAsBytes);
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < result.Length; i++)
            sb.Append(result[i].ToString("X2")); // two hex digits per byte
        return sb.ToString();
    }
    catch (ArgumentNullException)
    {
        Console.WriteLine("Hash has not been generated.");
        return null;
    }
}
As you can see, the MD5CryptoServiceProvider class wants a byte array as input. It does not accept an object directly. What you get out of it is not a string, as we would like to have, but a byte array. Therefore I added the conversion from byte array to hex. The conversion is done with the Byte.ToString() method, which accepts a format string as input. "X2" here means that each byte is converted into a two-character uppercase hexadecimal string (e.g. 01011100 => 5C or 00000111 => 07).
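The effect of the "X2" format string can be tried in isolation with a small sketch like the following. The names HexSketch and ToHex are hypothetical and only serve this illustration.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class HexSketch
{
    // Formats every byte as two uppercase hex digits and concatenates them.
    public static string ToHex(byte[] bytes)
    {
        StringBuilder sb = new StringBuilder(bytes.Length * 2);
        foreach (byte b in bytes)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }
}
```

HexSketch.ToHex(new byte[] { 0x5C, 0x07 }) returns "5C07". With the format string "X" instead of "X2", the second byte would be written as "7" without the leading zero, so the resulting string would no longer have a fixed length of two characters per byte.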
Now there is still the question of how to convert an object into a byte array. We know that our object is serializable, so we can serialize it into memory (using a MemoryStream and a BinaryFormatter) and then read the needed byte array back out of the stream. Because the whole thing should be thread-safe, we lock the serialization of the object.
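private static readonly Object locker = new Object();
private static byte[] ObjectToByteArray(Object objectToSerialize)
{
    MemoryStream fs = new MemoryStream();
    BinaryFormatter formatter = new BinaryFormatter();
    try
    {
        // Lock so that only one thread serializes at a time.
        lock (locker)
        {
            formatter.Serialize(fs, objectToSerialize);
        }
        return fs.ToArray();
    }
    catch (SerializationException se)
    {
        Console.WriteLine("Error occurred during serialization. Message: " + se.Message);
        return null;
    }
}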
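For readers on current .NET versions, where the BinaryFormatter is obsolete, the same pattern (serialize into a MemoryStream under a lock, then take the bytes) can be sketched with a BinaryWriter as a stand-in. This is an assumption made for illustration, not the article's implementation: the fields are written by hand instead of being discovered automatically, and the names ByteArraySketch and ToBytes are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for the BinaryFormatter step, for illustration only.
public static class ByteArraySketch
{
    private static readonly object locker = new object();

    // Writes three values into a MemoryStream and returns the stream contents.
    public static byte[] ToBytes(string text, int number, DateTime time)
    {
        lock (locker) // only one thread serializes at a time
        {
            using (MemoryStream ms = new MemoryStream())
            using (BinaryWriter writer = new BinaryWriter(ms))
            {
                writer.Write(text);            // length-prefixed UTF-8
                writer.Write(number);          // 4 bytes
                writer.Write(time.ToBinary()); // 8 bytes
                writer.Flush();
                return ms.ToArray();
            }
        }
    }
}
```

Equal field values always produce the same byte array, which is the property the hash comparison relies on.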
Generating MD5 hashes can be useful if you need a procedure that both sides can execute to verify that an object has been serialized and deserialized unchanged. The most difficult part for me was converting an object into a byte array and converting a byte array into a hex string. Using Guids is also a possibility. But the Guid is created when the object is initialized, and the consumer cannot "recreate" the Guid to ensure that no changes were made to the object. The consumer only knows that it has received the object the producer created.
What I did not cover are the security issues. An MD5 hash on its own detects accidental changes, but it is not reliable enough to protect against deliberate tampering. If you need strong security, use RSA-encrypted channels or other encryption methods.
- V1.2 -- 28.07.2008 -- Refactored the article, after some discussions
- V1.1 -- 25.07.2008 -- Added some modifications according to the post of Adam Tibi
- V1.0 -- 15.11.2007 -- First version of article
Bachelor in Computer Science.
Works as a Software Engineer (C#) for Mettler Toledo in Switzerland.
Develops software for formulation and SQC.