Bend the .NET Object to your Will!

14 Jan 2009 | CPOL | 4 min read
Clone, serialize, and deep-compare any .NET object, regardless of type

Introduction

Have you ever had to implement ICloneable on a complex type? Gets out of hand in a hurry, doesn't it? How about IEquatable<T>? Here's a good one: what happens when you need to serialize an object graph using BinaryFormatter (so it can be transmitted or stored) and somewhere in the tree there's a type you don't control that isn't serializable? XML to the rescue, right? But when you punt the object over to the XmlSerializer, there are read-only properties you don't control that aren't participating. Now what? Create your own surrogate type and handle the marshalling operations in some utility? Sounds like a pain in the butt to me. Which is why I decided to do it one more time, and then never again. :)

In order to clone an object, what do you really need? Ultimately, all you need is the structure of the object, and its simple values. If you know those two things, a new copy of the object can be constructed.

What about deep comparison between objects? Same thing. If an object's structure and each of its simple values equal another object's, then those objects are value-equivalent.

And wouldn't you know it, the process of serializing an unknown type requires that we store the structure of the object and its simple (implicitly serializable) values in a new structure that can be serialized.

Since all three features depend on the same thing happening to your objects, all of the extension methods delivering these features depend on the same class: ObjectGraph.

Background

This article focuses on a few small extension methods that all make use of a new class called ObjectGraph. This class decomposes objects down to their simplest values while maintaining member association. This enables objects to be analyzed and manipulated in fine-grained ways, regardless of type.

This article makes use of .NET Framework 3.5.

Using the Code

The code is extremely easy to use:

C#
var instance = new ComplexType // this object could be anything at all
{
    Id = 47,
    Name = "My Complex Type",
    ArbitraryValue = ArbitraryEnum.Foo,
    Values = new List<string>(new[]{"Value1", "Value2", "Value3"})
};
 
// extension method:  Clone
ComplexType clone = instance.Clone(); // a true deep copy
 
// extension method:  ToBinaryString
string serializedInstance = instance.ToBinaryString(); // a base-64 encoded byte array 
 
// extension method:  ToObject<T>
var deserializedInstance = serializedInstance.ToObject<ComplexType>(); // another clone! 
 
// extension method : ValueEquals
bool isCloneEqual = instance.ValueEquals(clone); // true
bool isRoundTripEqual = instance.ValueEquals(deserializedInstance); // also true :)
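
For a sense of what a check like ValueEquals has to do, here is a minimal sketch of a field-by-field comparison. It is not the code from the download: the names DeepCompareSketch and FieldsEqual are hypothetical, the list of "simple" types is an assumption, and the visited list is scanned linearly to keep the sample short.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sketch; not the ValueEquals implementation shipped with the article.
public static class DeepCompareSketch
{
    private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.Public |
                                       BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    public static bool FieldsEqual(object a, object b)
    {
        return FieldsEqual(a, b, new List<object[]>());
    }

    private static bool FieldsEqual(object a, object b, List<object[]> visited)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null) return false;

        Type type = a.GetType();
        if (type != b.GetType()) return false;

        // Assumption: these types count as "simple" and compare by their own Equals.
        if (type.IsPrimitive || type.IsEnum || type == typeof(string))
            return a.Equals(b);

        // A pair that was already seen is treated as equal so cycles terminate.
        // Any real mismatch returns false all the way up before we get back here.
        foreach (object[] pair in visited)
            if (ReferenceEquals(pair[0], a) && ReferenceEquals(pair[1], b)) return true;
        visited.Add(new[] { a, b });

        Array arrayA = a as Array;
        if (arrayA != null)
        {
            Array arrayB = (Array)b;
            if (arrayA.Length != arrayB.Length) return false;

            // Enumerators walk items in order; the shape of
            // multi-dimensional arrays is not checked in this sketch.
            IEnumerator itemsA = arrayA.GetEnumerator();
            IEnumerator itemsB = arrayB.GetEnumerator();
            while (itemsA.MoveNext() && itemsB.MoveNext())
                if (!FieldsEqual(itemsA.Current, itemsB.Current, visited)) return false;
            return true;
        }

        // Complex type: compare every instance field, including inherited private ones.
        for (Type t = type; t != null; t = t.BaseType)
            foreach (FieldInfo field in t.GetFields(Flags))
                if (!FieldsEqual(field.GetValue(a), field.GetValue(b), visited)) return false;

        return true;
    }
}
```

One consequence of comparing raw fields: two List<T> instances holding the same items can still differ in internal bookkeeping (capacity, version counter), and a comparison written this way would report them as unequal.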

How It Works

The biggest convention breaker here is the idea of being able to serialize any object using the BinaryFormatter, even ones that aren't decorated with [Serializable]. It's a simple trick: the object being serialized isn't your object. It's actually a wrapper class (ObjectGraph) that is 100% serializable, and stores enough information to completely rehydrate your object after being deserialized.

When ObjectGraph wraps an object, what happens depends on the object being wrapped. If it is a simple type, i.e., one that the code recognizes as directly serializable, the raw value is stored and the wrapping operation is complete. If the object has already been wrapped in the current graph, a pointer to the original wrapper is stored. If the object is an array, its items are individually wrapped and stored. If the object is a complex type, each of its member variables is wrapped and stored in a name-keyed dictionary.

Why member variables? This is the key. No matter what the public interface of your class, if the class holds state information at all it will be in a member variable. Automatic properties get their variables generated for them, but it's all the same. Once I have the value of all of an object's variables, I can use Reflection-based instantiation to create an exact copy of the object, or compare them to any other object with a matching type.
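
To make the "member variables" idea concrete, here is a rough sketch of reading every instance field of an object through Reflection. FieldScanSketch and ReadFields are hypothetical names, and the "DeclaringType.fieldName" key format is an assumption chosen to keep inherited fields from colliding; the article's own dictionary may be keyed differently.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical sketch; names and key format are not from the article's source.
public static class FieldScanSketch
{
    private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.Public |
                                       BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Returns the value of every instance field, keyed by "DeclaringType.fieldName".
    public static Dictionary<string, object> ReadFields(object instance)
    {
        if (instance == null) throw new ArgumentNullException("instance");

        var values = new Dictionary<string, object>();

        // GetFields on a derived type does not return private fields declared
        // on its base types, so each level of the hierarchy is scanned in turn.
        for (Type t = instance.GetType(); t != null; t = t.BaseType)
            foreach (FieldInfo field in t.GetFields(Flags))
                values[t.FullName + "." + field.Name] = field.GetValue(instance);

        return values;
    }
}
```

Automatic properties show up here through their compiler-generated backing fields, whose names are a compiler detail rather than something to rely on.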

Most of ObjectGraph's code loses its meaning if you try to read individual methods out of context, so I apologize if this doesn't make enough sense, but here's the private ObjectGraph constructor; it should give some clue as to how the ObjectGraph analyzes the object it wraps.

C#
private ObjectGraph(object data, GraphRegistry registry, bool isRootGraph)
{
	// make sure to unhook all pointers created during scan
	using (new DisposableContext(() => { if (isRootGraph) registry.Clear(); }))
	{
		_isValueBased = data.IsValueBased();
 
		if (_isValueBased) _value = data;
		else
		{
			_pointer = registry.Register(data, this);
 
			if (_pointer == null)
			{
				_isArray = data is Array;
 
				if (_isArray)
				{
					_arrayItems = GetItems((Array) data, registry).ToList();
 
					// The CLR names array types {elementTypeName}[] (e.g. System.String[]);
					// strip the brackets so that _type holds the element type's name.
					_type = Regex.Replace(
						data.GetType().AssemblyQualifiedName,
						@"\[\d*\]", string.Empty);
 
					return;
				}
 
				_state = GetValues(data, registry);
			}
		}
 
		_type = data != null ? data.GetType().AssemblyQualifiedName : string.Empty;
	}
}
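
The constructor only scans the object when registry.Register returns null, which suggests that Register hands back a pointer for anything it has seen before. GraphRegistry itself is not shown here, so the following is only a guess at that contract: a hypothetical ReferenceRegistrySketch that tracks objects by reference identity rather than by Equals.

```csharp
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Hypothetical stand-in for the idea behind GraphRegistry; not the article's class.
public sealed class ReferenceRegistrySketch<TWrapper> where TWrapper : class
{
    // Compares keys by identity so that two distinct objects that happen to be
    // Equals() are still wrapped separately, and overridden hash codes are ignored.
    private sealed class ByReference : IEqualityComparer<object>
    {
        bool IEqualityComparer<object>.Equals(object x, object y) { return ReferenceEquals(x, y); }
        int IEqualityComparer<object>.GetHashCode(object obj) { return RuntimeHelpers.GetHashCode(obj); }
    }

    private readonly Dictionary<object, TWrapper> _seen =
        new Dictionary<object, TWrapper>(new ByReference());

    // Returns null the first time an object is registered (the caller should scan it),
    // and the previously stored wrapper on every later call.
    public TWrapper Register(object data, TWrapper wrapper)
    {
        TWrapper existing;
        if (_seen.TryGetValue(data, out existing)) return existing;

        _seen.Add(data, wrapper);
        return null;
    }

    // Drops all references once the root scan has finished.
    public void Clear()
    {
        _seen.Clear();
    }
}
```

A hash-based lookup like this also keeps each registration at roughly constant cost, instead of searching a list of everything seen so far.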

Points of Interest

Check out the unit tests in the source. They show that in addition to CLR types & your custom types, anonymous types will also participate happily with the ObjectGraph class.

Speaking of which, the unit tests included in the source are not really unit tests; they are integration tests with BDD naming semantics, all of which is completely improper. The only reason they are present is so that I (and you) can quickly debug the code. Please do not think that this article is attempting to address the proper way to implement TDD or BDD. In fact, here's a disclaimer: THIS ARTICLE DEMONSTRATES POOR TESTING HABITS.

Also, since indirect recursion is used in both object scanning and rehydration, I have concerns that a graph of sufficient depth could cause a StackOverflowException to occur. I have not been able to make this happen in practical use, so it may be okay for most scenarios. Fair warning.
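
If stack depth ever did become a problem, one common way around it is to replace the recursion with an explicit work stack. The sketch below is not part of ObjectGraph; IterativeScanSketch, Node, and CountVisited are hypothetical names, and it only counts the objects it reaches rather than wrapping them.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;
using System.Runtime.CompilerServices;

// Hypothetical sketch; none of these names come from the article's source.
public static class IterativeScanSketch
{
    // Minimal linked type, used only to build a deep graph for demonstration.
    public sealed class Node
    {
        public Node Next;
        public int Value;
    }

    private sealed class ByReference : IEqualityComparer<object>
    {
        bool IEqualityComparer<object>.Equals(object x, object y) { return ReferenceEquals(x, y); }
        int IEqualityComparer<object>.GetHashCode(object obj) { return RuntimeHelpers.GetHashCode(obj); }
    }

    private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.Public |
                                       BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Counts the distinct non-simple objects reachable from root without recursing.
    public static int CountVisited(object root)
    {
        var visited = new HashSet<object>(new ByReference());
        var pending = new Stack<object>();
        pending.Push(root);

        while (pending.Count > 0)
        {
            object current = pending.Pop();
            if (current == null) continue;

            // Assumption: primitives, enums and strings are leaf values.
            Type type = current.GetType();
            if (type.IsPrimitive || type.IsEnum || type == typeof(string)) continue;

            // Already scanned: a cycle or a shared reference.
            if (!visited.Add(current)) continue;

            Array array = current as Array;
            if (array != null)
            {
                foreach (object item in array) pending.Push(item);
                continue;
            }

            // Queue every field value instead of descending into it immediately.
            for (Type t = type; t != null; t = t.BaseType)
                foreach (FieldInfo field in t.GetFields(Flags))
                    pending.Push(field.GetValue(current));
        }

        return visited.Count;
    }
}
```

The pending work lives on the heap, so the depth of the graph is limited by available memory rather than by the size of the call stack.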

Finally, I would like to thank the members who quickly responded with some critical feedback that led to this component's current status. Your input is much appreciated!

Enjoy :)

History

  • 12/19/08: Submitted first draft
  • 12/20/08: Submitted second draft & code revision for cyclic reference support
  • 1/11/09: Submitted final draft

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


Written By
Architect
United States
I'm a professional developer with over 9 years of experience in advanced C# development. I've worked extensively with every phase of the SDLC and have developed and deployed many enterprise solutions using the latest .NET technologies. Please let me know if you have any suggestions or questions.

Comments and Discussions

 
Somewhat presumptuous to assume such cloning/comparison will work (supercat9, 16-Jan-09 7:59)
Re: Somewhat presumptuous to assume such cloning/comparison will work (John Batte, 4-Jun-09 8:59)
First of all, I apologize for taking so long to reply. My day job and my girlfriend keep me very busy :-D

What you're describing is a business object, in the middle of a business process. It could be small and algorithmic in nature, or it could be large and transactional in nature. Either way, the process of cloning or serializing does not concern business logic, and thus should not impact the logic you've defined. Just remember that when other objects already carry references to your first instance, this means nothing for your second instance. It has simply skipped over the initialization process as you would've normally defined it, and is now ready for consumption in other parts of your code. In no way is it referentially tied to the first instance.

In your first example, you have an invalid use case. My code performs a deep clone rather than a shallow one. The goal is to obtain a reference to a second instance that is completely independent of the first, but which also recursively contains all of the field-level values that the first instance had at the time of the snapshot. After that, they are disconnected from one another. Your array updating process will continue, and the second instance (the clone) will remain unaltered. This (in my opinion) is the desired behavior.

In your second example, you have another invalid use case. Thingie2 contains no logic to maintain the data integrity between field ID and property St. Clearly, another class maintains this relationship for us, or no such relationship will exist. Cloning Thingie2 takes you from having one class that depends on something else for its integrity, to having two instances of that class, both in the same non-functional boat.

Sorry for the C# here, I can tell you're a VB guy...

public class Thingie2
{
  protected string _id = Guid.NewGuid().ToString();

  public string St{ get{ return _id; } }
}


NOW your scenario works. The value of _id is captured and preserved in the second instance. The behavior of the type (which has nothing to do with its shape and/or values) is to mirror the protected field onto the public property. All cloned / deserialized instances now exhibit the behavior you've requested.
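
As a rough illustration of that point, the sketch below builds a second instance without running its constructor or field initializers and then copies the field values across. It is a one-level (shallow) copy with hypothetical names (Thingie2Sketch, RehydrateSketch, CopyFields), and it assumes FormatterServices.GetUninitializedObject as the instantiation mechanism. Newer .NET versions flag that API as obsolete and offer RuntimeHelpers.GetUninitializedObject instead.

```csharp
using System;
using System.Reflection;
using System.Runtime.Serialization;

// Same shape as the Thingie2 example above, renamed to mark it as a sketch.
public class Thingie2Sketch
{
    protected string _id = Guid.NewGuid().ToString();

    public string St { get { return _id; } }
}

// Hypothetical helper; a shallow field copy, not the article's deep Clone.
public static class RehydrateSketch
{
    private const BindingFlags Flags = BindingFlags.Instance | BindingFlags.Public |
                                       BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    public static T CopyFields<T>(T source) where T : class
    {
        Type type = source.GetType();

        // Allocates the object without running any constructor or field
        // initializer, so _id is NOT given a fresh Guid at this point.
        object copy = FormatterServices.GetUninitializedObject(type);

        // Copy each field value, including inherited private fields.
        for (Type t = type; t != null; t = t.BaseType)
            foreach (FieldInfo field in t.GetFields(Flags))
                field.SetValue(copy, field.GetValue(source));

        return (T)copy;
    }
}
```

Because the initializer never runs on the copy, St on the second instance returns the captured _id of the first rather than a newly generated one.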

---if at first you don't succeed, skydiving is not for you.---

Re: Somewhat presumptuous to assume such cloning/comparison will work (supercat9, 4-Jun-09 11:29)
Re: Somewhat presumptuous to assume such cloning/comparison will work (John Batte, 4-Jun-09 17:02)
Doesn't work with public fields (andrewducker, 23-Dec-08 9:16)
Re: Doesn't work with public fields (John Batte, 23-Dec-08 9:28)
Needlessly O(n^2) (Ed Brey, 22-Dec-08 15:10)
Re: Needlessly O(n^2) (John Batte, 22-Dec-08 19:32)
recursion / multiple visitations (sprucely, 19-Dec-08 9:35)
Re: recursion / multiple visitations (John Batte, 19-Dec-08 10:43)
A hole??????? It's more like the Grand Canyon (tonyt, 20-Dec-08 22:44)
Re: A hole??????? It's more like the Grand Canyon (John Batte, 21-Dec-08 1:02)
Re: A hole??????? It's more like the Grand Canyon (tonyt, 21-Dec-08 3:02)
Nice (cdkisa, 19-Dec-08 6:33)
Huh? (PIEBALDconsult, 19-Dec-08 3:01)
Re: Huh? (John Batte, 19-Dec-08 3:26)
Re: Huh? (PIEBALDconsult, 19-Dec-08 5:54)
Just a thought (rcollina, 19-Dec-08 2:48)
Re: Just a thought (John Batte, 19-Dec-08 3:55)
