Generating MD5 Hash out of C# Objects

25 Jul 2008, Apache license, 5 min read
This article describes how to generate the MD5 hash string for a common C# object.

Introduction

Sometimes we have to serialize objects, e.g. to send them over a network or to store and restore them locally. After deserializing, it is useful to know whether the object has been restored correctly, especially for objects that have internal state or when you must manage multiple instances of a class. One possible solution is to identify the objects with the System.Guid struct, but a GUID cannot tell you whether the internal state was deserialized correctly (see Background for an explanation).

A commonly used technique on the Internet is to publish an MD5 hash string alongside a file, so that the receiver can check whether the file has been transmitted without any modifications.

Background

The .NET Framework gives us a struct to uniquely identify our objects: System.Guid, in mscorlib.dll. This struct can be used to give each class its own identifier, and that is the crux of the matter. What we need is not an identifier for the class but an identifier for each instance of the class. Implicitly, this identifier must also represent internal values such as state; otherwise the recipient of the object cannot be sure that they have received and deserialized the same object. The recipient also cannot "create" the GUID on their own: once it has been created by the sender, it is not reproducible.
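To illustrate the reproducibility problem, here is a minimal sketch. The class name GuidDemo and its method are made up for this illustration. Each call to Guid.NewGuid() produces a new value that does not depend on any object's content, so a recipient has no way to regenerate the sender's GUID.

```csharp
using System;

static class GuidDemo
{
    // Returns true when two freshly created GUIDs differ,
    // which is what you should expect in practice.
    public static bool FreshGuidsDiffer()
    {
        Guid senderId = Guid.NewGuid();     // created on the sending side
        Guid recipientId = Guid.NewGuid();  // an attempt to "recreate" it
        return senderId != recipientId;
    }

    static void Main()
    {
        Console.WriteLine(FreshGuidsDiffer());
    }
}
```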

We must therefore provide functionality that both sender and recipient can execute to identify an object. The resulting identifier must implicitly take into account the fields that are relevant for the object, and these relevant fields can be different for each class!

The idea I had was to use MD5 hashes for that. Every object has a built-in method called GetHashCode(). Despite what the name might suggest, it returns an int rather than a string, because these hash values are intended for placing keys into the buckets of, e.g., a Hashtable. They are not guaranteed to be unique or stable across processes, so they cannot serve as the identifier we need.
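A small sketch of what GetHashCode() is meant for. The class name GetHashCodeDemo is hypothetical. It shows that the return type is an int and that a Hashtable relies on the value internally when looking up a key.

```csharp
using System;
using System.Collections;

static class GetHashCodeDemo
{
    // Stores a key in a Hashtable and looks it up again with an equal string.
    public static bool LookupWorks(string key)
    {
        Hashtable table = new Hashtable();
        table[key] = 1; // the Hashtable calls key.GetHashCode() to pick a bucket
        return table.ContainsKey(string.Copy(key));
    }

    static void Main()
    {
        object value = "Some useless text";
        int hashCode = value.GetHashCode();          // an int, not a string
        Console.WriteLine(hashCode.GetType().Name);  // Int32
        Console.WriteLine(LookupWorks("Some useless text")); // True
    }
}
```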

Fortunately, there is a class named MD5CryptoServiceProvider in the System.Security.Cryptography namespace. Unfortunately, this class is not easy to use. The main problem for most programmers is probably that it accepts only raw bytes (a byte array or a stream) as input, not a reference to an object. So I decided to wrap all the needed functionality into a generator class. This class generates the hash for me, and I have to write just one line of code.
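For orientation, this is roughly what hashing raw bytes looks like without any wrapper. It is a minimal sketch, not the article's generator class: Md5Demo and HexMd5 are hypothetical names, and it uses the MD5.Create() factory method on the base class rather than constructing MD5CryptoServiceProvider directly.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

static class Md5Demo
{
    // Hashes a byte array with MD5 and returns the digest as upper-case hex.
    public static string HexMd5(byte[] input)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] digest = md5.ComputeHash(input);
            // BitConverter.ToString yields "5D-41-..."; strip the dashes.
            return BitConverter.ToString(digest).Replace("-", "");
        }
    }

    static void Main()
    {
        byte[] bytes = Encoding.UTF8.GetBytes("hello");
        Console.WriteLine(HexMd5(bytes)); // 5D41402ABC4B2A76B9719D911017C592
    }
}
```

Note that the caller has to produce the byte array first, which is exactly the step that is awkward for arbitrary objects.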

Using the Code

The code file accompanying this article contains a class called MD5HashGenerator. This class has a static method GenerateKey(Object sourceObject), which does the "magic" for you. Include the class in your project, and use it as follows:

To use the class (as a publisher), you have to do the following things:

  1. Mark the class as [Serializable]. Mark all fields which should not be serialized as [NonSerialized].
  2. Call the static method MD5HashGenerator.GenerateKey(Object sourceObject). You get the MD5 hash of the object as a string.
  3. Serialize the object, then publish or store it together with the hash.

If you are the receiver, then:

  1. Deserialize the received object.
  2. Call the static method MD5HashGenerator.GenerateKey(Object sourceObject) on the deserialized object.
  3. Compare the hashes.
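The publisher and receiver steps above can be sketched end to end. This is an illustration under stated assumptions, not the article's implementation: Message, RoundTripDemo and HashOf are hypothetical names, and System.Text.Json stands in for the serializer so that the sketch also runs on current .NET versions. Because System.Text.Json only sees public properties by default, the sample type exposes its data that way.

```csharp
using System;
using System.Security.Cryptography;
using System.Text.Json;

// Hypothetical message type; public properties so System.Text.Json can see them.
public class Message
{
    public string Text { get; set; }
    public int Number { get; set; }
}

static class RoundTripDemo
{
    // Stand-in for MD5HashGenerator.GenerateKey: serialize, hash, hex-encode.
    public static string HashOf(Message message)
    {
        byte[] bytes = JsonSerializer.SerializeToUtf8Bytes(message);
        using (MD5 md5 = MD5.Create())
        {
            return BitConverter.ToString(md5.ComputeHash(bytes)).Replace("-", "");
        }
    }

    // Publisher hashes and serializes; receiver deserializes, re-hashes, compares.
    public static bool RoundTripMatches(Message original)
    {
        // Publisher side: steps 2 and 3.
        string publishedHash = HashOf(original);
        byte[] wire = JsonSerializer.SerializeToUtf8Bytes(original);

        // Receiver side: steps 1 to 3.
        Message received = JsonSerializer.Deserialize<Message>(wire);
        string receivedHash = HashOf(received);
        return string.Equals(publishedHash, receivedHash, StringComparison.Ordinal);
    }

    static void Main()
    {
        var message = new Message { Text = "Some useless text", Number = 345678912 };
        Console.WriteLine(RoundTripMatches(message)); // True
    }
}
```

The comparison only works if both sides serialize in exactly the same way; any difference in serializer or settings changes the bytes and therefore the hash.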

Example

We want to serialize a class which has a string, an int and a DateTime. The DateTime member is set at creation time, so it is different for each instance of the class. As mentioned above, the class must be marked as serializable. It could look like this:

C#
using System;
using System.Runtime.Serialization;

[Serializable]
public class SimpleClass
{
  private string justAString;
  private int justAnInt;
  private DateTime justATime;

  /// <summary>
  /// Default Constructor. The fields are filled with some standard values.
  /// </summary>
  public SimpleClass()
  {
    justAString = "Some useless text";
    justAnInt = 345678912;
    justATime = DateTime.Now;
  }
}

Because we use DateTime.Now to initialize the field justATime, each instance of the class should be different. It is important to mark the class as [Serializable], because the MD5HashGenerator class requires it.

The generator class uses the BinaryFormatter for serialization, so all fields, whether they are private or not, are automatically included in the serialization process. Exclude handles and pointers, however, if you are using them. See [1] for details.

The class which "publishes" the object must then do the following things:

C#
...
    SimpleClass simpleObject = new SimpleClass();
    string simpleObjectHash = MD5HashGenerator.GenerateKey(simpleObject);
    //Now serialize the simpleObject, e.g. with a BinaryFormatter (which also
    //preserves the private fields), and store the hash somewhere
...

Now the "consumer" can deserialize the SimpleClass instance and call MD5HashGenerator.GenerateKey(simpleObject) on the deserialized object. The consumer can then compare the hash strings and decide whether it is the same object.

How It Works

The code of the MD5HashGenerator.GenerateKey(Object sourceObject) method looks like this:

C#
public static String GenerateKey(Object sourceObject)
{
    String hashString;

    // Catch unusable parameter values.
    // The first argument is the parameter name, the second the message.
    if (sourceObject == null)
    {
        throw new ArgumentNullException("sourceObject", "Null as parameter is not allowed");
    }

    try
    {
        // Now we begin to do the real work.
        hashString = ComputeHash(ObjectToByteArray(sourceObject));
        return hashString;
    }
    catch (AmbiguousMatchException ame)
    {
        // AmbiguousMatchException is defined in System.Reflection.
        throw new ApplicationException(
            "Could not definitely decide if object is serializable. Message: " + ame.Message);
    }
}

Let's have a deeper look at the following line of code:

C#
hashString = ComputeHash(ObjectToByteArray(sourceObject));

As mentioned above, I used the MD5CryptoServiceProvider class to generate the hash string. I encapsulated its use in the ComputeHash(byte[] objectAsBytes) method. Here's the implementation:

C#
private static string ComputeHash(byte[] objectAsBytes)
{
    MD5 md5 = new MD5CryptoServiceProvider();
    try
    {
        byte[] result = md5.ComputeHash(objectAsBytes);

        // Build the final string by converting each byte
        // into hex and appending it to a StringBuilder
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < result.Length; i++)
        {
            sb.Append(result[i].ToString("X2"));
        }

        // And return it
        return sb.ToString();
    }
    catch (ArgumentNullException)
    {
        // If something went wrong during serialization,
        // this method is called with a null argument.
        Console.WriteLine("Hash has not been generated.");
        return null;
    }
}

As you can see, the MD5CryptoServiceProvider class wants a byte array as input; it does not accept an object directly. What you get out of it is not a string, as we would like, but another byte array. Therefore I added the conversion from byte array to hex. The conversion uses the Byte.ToString() method, which accepts a format string as input. Here "X2" means that each byte is converted into a two-character uppercase hexadecimal string (e.g. 01011100 => 5C or 00000111 => 07).
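The effect of the "X2" format string can be checked in isolation. A minimal sketch follows; HexFormatDemo and ToHex are hypothetical names used only here.

```csharp
using System;

static class HexFormatDemo
{
    // Formats one byte as exactly two upper-case hexadecimal digits.
    public static string ToHex(byte value)
    {
        return value.ToString("X2");
    }

    static void Main()
    {
        Console.WriteLine(ToHex(0x5C));                 // 5C (binary 01011100)
        Console.WriteLine(ToHex(0x07));                 // 07 (binary 00000111)
        Console.WriteLine(((byte)0x07).ToString("X"));  // 7, no zero padding
    }
}
```

The padding matters: without the "2", small byte values would produce a single character and the resulting hash string would no longer have a fixed length of 32 characters.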

Now there is still the question of how to convert an object into a byte array. We know that our object is serializable, so we can serialize it into memory (using a MemoryStream and a BinaryFormatter) and read the needed byte array back out. Because the whole thing should be thread-safe, we lock the serialization of the object.

C#
private static readonly Object locker = new Object();

private static byte[] ObjectToByteArray(Object objectToSerialize)
    {
        MemoryStream fs = new MemoryStream();
        BinaryFormatter formatter = new BinaryFormatter();
        try
        {
            //Here's the core functionality! One Line!
            //To be thread-safe we lock the object
            lock (locker)
            {
                formatter.Serialize(fs, objectToSerialize);
            }
            return fs.ToArray();
        }
        catch (SerializationException se)
        {
            Console.WriteLine("Error occurred during serialization. Message: " +
			se.Message);
            return null;
        }
        finally
        {
            fs.Close();
        }
    }

Conclusion

Generating MD5 hashes can be useful when both sides need a procedure they can execute to verify that an object has survived serialization and deserialization unchanged. The most difficult parts for me were converting an object into a byte array and converting a byte array into a hex string. Using GUIDs is also a possibility, but a GUID is created when the object is initialized, and the consumer cannot "recreate" it to verify that the object has not been changed. The consumer only knows that it has received the object the producer created.

What I did not address are security issues. MD5 hashes alone are not reliable enough for that. If you need strong security, use RSA-encrypted channels or other encryption methods.

References

History

  • V1.2 -- 28.07.2008 -- Refactored the article, after some discussions
  • V1.1 -- 25.07.2008 -- Added some modifications according to the post of Adam Tibi
  • V1.0 -- 15.11.2007 -- First version of article

License

This article, along with any associated source code and files, is licensed under The Apache License, Version 2.0


Written By
Business Analyst Die Mobiliar
Switzerland
Bachelor in Computer Science.
Works as a Software Engineer (C#) for Mettler Toledo in Switzerland.
Develops software for formulation and SQC.

Comments and Discussions

 
Re: Not the perfect code... (Hasler Thomas, 28-Jul-08 6:11)
I've done some rework on the article. Basically I cut the ISerialization - thing and the custom serialization. Hope the whole thing improved a little bit.

It's still not yet "The perfect code". Probably it will never be.

Regards

Thomas