|
|
Comments and Discussions
|
|
 |

|
Dear Tim,
Thank you for this fast serialization technique. It came in handy when 50% of my CPU usage was being consumed by the default de/serialization methods. Now the application is much faster, thanks to you.
Since my application is a Windows service that relies on data exchange, most of its time is spent in your serialization code. I have noticed a great increase in memory usage and I am investigating almost all of my code for memory leaks; I have used 'using' even on the DataTables.
I have noticed that SerializationWriter and SerializationReader are disposable and need a 'using' wrapper to dispose of the objects properly. I have made this change in my code but haven't yet tested whether this is the root of the memory leak; I think you should use it in your examples too.
Thank you,
Menelaos
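A minimal sketch of the 'using' pattern described above. It uses the framework's BinaryWriter over a MemoryStream as a stand-in, on the assumption that SerializationWriter derives from BinaryWriter; DisposeSketch and WriteSample are hypothetical names, not part of the article's code.

```csharp
using System;
using System.IO;

public static class DisposeSketch
{
    // Serializes two values and returns the resulting bytes.
    // Both the stream and the writer are disposed when the using
    // blocks exit, even if an exception is thrown while writing.
    public static byte[] WriteSample(long id, string name)
    {
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(ms))
        {
            writer.Write(id);    // 8 bytes
            writer.Write(name);  // length prefix + UTF-8 bytes
            writer.Flush();
            return ms.ToArray();
        }
    }
}
```

Note that a MemoryStream's buffer is ordinary managed memory, so disposing it promptly is good hygiene but may not by itself explain a large increase in memory usage.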
|
|
|
|

|
Hi,
I am already using custom serialization (implementing ISerializable) and it is not clear to me how or why your derived classes (SerializationWriter and SerializationReader) provide any additional benefit.
I've written a simple test object that contains just one field, of type 'byte[]'. Unfortunately, the object's serialized size is smaller when using the normal 'info->GetValue()' and 'info->AddValue()' methods than when using your serialization classes.
Can you explain in which scenarios you get a size reduction? Is it for 'string' fields? Do you know why?
As expected, you get a smaller size if you use shorter field identifiers, even if you lose some maintainability. So do use:
info->AddValue("c", this->Configuration);
instead of
info->AddValue("Configuration", this->Configuration);
Thanks
Vincent
|
|
|
|

|
I have an issue where different assemblies open the same serialized data file. I can handle this just fine with a SerializationBinder within the class being serialized/deserialized, but in your fast serializer's ReadObject, for the ObjType.otherType case, you deserialize the base stream, which throws an exception. I can't find a good way to intercept the binder since you're not returning a class. Any suggestions?
Thanks.
|
|
|
|

|
Is there any way we can serialize and deserialize a class such as MethodInfo?
|
|
|
|
|

|
I've come up against a nasty little issue with reference cycles. If the serialised object, somewhere within its member variables (or their own member variables etc etc) has a reference to itself, the serialisation goes into an infinite loop. I was wondering if anyone has come up against this issue and has a solution. At the moment, I'm thinking of keeping references of already-serialised objects and it's rapidly starting to look like a lot of work.
|
|
|
|

|
It does not have to be a lot of work.
Just throw them in a hash table and check Contains.
The trade off is you are living with the speed of generic code for Hash and for the HashTable.Opacity, the new Transparency.
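A minimal sketch of the "hash table and check Contains" idea, assuming a hypothetical Node type with child references. ReferenceComparer, Node and CycleSafeWalker are illustrative names, not part of the article's code; the set compares by reference identity so that two distinct but equal-valued objects are not mistaken for a cycle.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;

// Compares objects by reference identity, ignoring Equals/GetHashCode overrides.
public sealed class ReferenceComparer : IEqualityComparer<object>
{
    bool IEqualityComparer<object>.Equals(object x, object y)
    {
        return ReferenceEquals(x, y);
    }

    int IEqualityComparer<object>.GetHashCode(object obj)
    {
        return RuntimeHelpers.GetHashCode(obj);
    }
}

// Hypothetical object graph node that may take part in a reference cycle.
public sealed class Node
{
    public string Name;
    public List<Node> Children = new List<Node>();
}

public static class CycleSafeWalker
{
    // Returns the number of distinct nodes reachable from root,
    // terminating even when the graph contains cycles.
    public static int CountReachable(Node root)
    {
        HashSet<object> seen = new HashSet<object>(new ReferenceComparer());
        Visit(root, seen);
        return seen.Count;
    }

    private static void Visit(Node node, HashSet<object> seen)
    {
        // Add returns false when the node was already visited: stop here.
        if (node == null || !seen.Add(node)) return;
        foreach (Node child in node.Children) Visit(child, seen);
    }
}
```

A real serializer would probably go one step further and write a small ID in place of an already-seen object so the reader can restore the back-reference; the sketch only shows how the traversal is stopped.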
|
|
|
|

|
Nice article, and it was very helpful. But I noticed a small bug in the current post: dictionaries with a value type of byte[] or char[] are not properly added/retrieved. In fact, most times retrieving a KeyValuePair<string,char[]> would throw an exception. Changing the call to (this.)Write instead of base.Write in WriteObject for byte[] and char[] fixes this.
-Parag
"Brahmana satyaṃ jagat mithyaa, jiyvo brahmaiva naaparah" -- Adi Shankara
"Civilization is the limitless multiplication of unnecessary necessities." -- Mark Twain
|
|
|
|
|

|
It is a great solution!
Also, I have a suggestion: I am not sure about the performance of switch(string), but it is probably not the best. So, would it be possible to provide overloaded methods that accept strongly typed reference/value types? At least some, such as int, char, bool and long, since these are the most commonly used:
public void Write(int i) {
Write((byte)ObjType.int32Type);
base.Write(i); // call the base class overload; a plain Write(i) here would recurse forever
}
This way the switch(string) and null check are both avoided.
Overall, great!!
|
|
|
|

|
Tim,
I may be missing something.
Person object:
string _FName;
string _LName;
Address _Address;
Address object:
string _City;
string _State;
Does the _Address member get serialized/deserialized without having to explicitly daisy-chain the calls to FromBinary/ToBinary?
Thank you,
Stephen
|
|
|
|

|
My Bad Tim,
This post is really for David!
Stephen
|
|
|
|

|
A little testing on my part has yielded these results..
Normal binary serialized length: 2433
Binary serialized length: 2433
Soap serialized length: 2433
Altered binary serialized length: 569
Altered soap serialized length: 1211
Running normal serialization test for 250000 iterations
Normal serialization done in 271.253472 uS per cycle
Running serialization test for 250000 iterations
Serialization done in 89.2511424 uS per cycle
Running altered serialization test for 250000 iterations
Altered serialization done in 76.0009728 uS per cycle
And to answer my original question, YES, the member of type Address is serialized/deserialized without any issues automatically.
Stephen
|
|
|
|

|
Indeed it is serialized. For non-standard types the binary serializer is invoked. If the type has its own fast serializer then that will do the serialization. If not, the standard serializer will be used.
Tim
|
|
|
|

|
My name is Dung and I live in Vietnam. Hello everybody!
I want to create a structure similar to BinaryFormatter, using reflection to get information such as fields, properties and constructors, serialize it to disk, and then deserialize it back. I don't know how to define the structure. Can anybody help me with this problem as soon as possible? Thank you, everyone.
|
|
|
|

|
Will you ever add support for a uint16[,] for example?
|
|
|
|

|
A while ago I started on an update and this did indeed include a uint16[,] serializer. I've been meaning to dust this off and publish the update for ages so if you can wait a few days I'll do so.
Tim
|
|
|
|

|
Hi, I personally tested the solution you presented in this article and found it really useful. Implementing custom serialization speeds up the process and leaves the resulting binary file full of characters that are not human-readable (user details and passwords, for example). This is sometimes enough and sometimes not, but I won't address that problem here.
What I would like to talk about, however, is interoperability.
I am looking for a way to implement a system that has a UI written in C# on .NET and a Java-based server side. I need to pass information from the .NET UI through a JMS (Java Message Service) queue.
For asynchronous calls, I will use web services, for sure.
But when I need an acknowledgment from the (Java) server side, I need a way to handle custom binary serialization/deserialization across .NET and Java.
Is there a way to handle serialization/deserialization with a tool written for both Java and .NET? That way the binary stream would be uniform and understandable from both sides.
Thanks for your help, and feel free to ask more questions if I've been imprecise.
Searching for universal serialization / deserialization technique for .NET / Java communication.
|
|
|
|

|
Use XML throughout
|
|
|
|

|
Code Monkey wrote: Use XML throughout
I don't want to use XML on its own, as it's too slow. The data needs to be serialized and universally understandable by .NET and Java.
Has anyone ever used such a solution?
|
|
|
|

|
Depending on exactly how far you want to go, you should be able to implement your own custom formatting sinks and inject them into the .NET Remoting pipeline. You can perform custom serialization and deserialization this way. I am not entirely sure how Java's messaging services work, but you might also need to write a custom .NET Remoting channel. Ultimately, though, you should be able to directly connect the two and use them transparently (without needing to hack something onto both ends).
|
|
|
|
|

|
In c# use:
System.Net.IPAddress.HostToNetworkOrder()
System.Net.IPAddress.NetworkToHostOrder()
Java always uses network byte order. (Big Endian)
On two occasions I have been asked [by members of Parliament], 'Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?' I am not able rightly to apprehend the kind of confusion of ideas that could provoke such a question. - Charles Babbage
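A minimal sketch of the byte-order conversion suggested above, assuming a little-endian host (which is what BinaryWriter's own output format matches). NetworkOrderSketch and its method names are hypothetical. The bytes produced are in big-endian order, which is what Java's DataInputStream.readInt() expects.

```csharp
using System;
using System.IO;
using System.Net;

public static class NetworkOrderSketch
{
    // Writes a 32-bit integer in network (big-endian) byte order.
    // Assumption: the host is little-endian, so HostToNetworkOrder swaps
    // the bytes and BinaryWriter then emits them most significant first.
    public static byte[] WriteBigEndian(int value)
    {
        using (MemoryStream ms = new MemoryStream())
        using (BinaryWriter writer = new BinaryWriter(ms))
        {
            writer.Write(IPAddress.HostToNetworkOrder(value));
            writer.Flush();
            return ms.ToArray();
        }
    }

    // Reads a 32-bit integer that was written in network byte order.
    public static int ReadBigEndian(byte[] data)
    {
        using (BinaryReader reader = new BinaryReader(new MemoryStream(data)))
        {
            return IPAddress.NetworkToHostOrder(reader.ReadInt32());
        }
    }
}
```

The same pair of calls has overloads for short and long; strings and floating-point values need their own agreed encoding on both sides.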
|
|
|
|

|
Did you find any solution to the problem you already described?
TIA,
|
|
|
|

|
Hi,
I have read your article and looked at the code. The idea is good and the performance gains look promising. I must admit I haven't done extensive testing yet, so excuse me if the answer to my question is obvious.
My question is: Does it support DataSets?
Best regards,
Michael Hansen
|
|
|
|

|
I modified some of the classes a little to demonstrate how to improve performance even more. These alterations resulted in a reduction in the size of the serialized object by about 21% for the SOAP version and about 28% for the binary version. Additionally, the time required to perform the serialization and deserialization test was reduced by nearly 47%. Here is the test application's output for both versions:
Binary serialized length: 433
Soap serialized length: 1098
Altered binary serialized length: 311
Altered soap serialized length: 867
Running serialization test for 250000 iterations
Serialization done in 135.625 uS per cycle
Running altered serialization test for 250000 iterations
Altered serialization done in 72.125 uS per cycle
I'll try to list all of the alterations here...
- SerializationWriter class: Moved some code from the AddToInfo method into a method named ToBinary. ToBinary is used by the TestObject class' ToBinary method. The updated code is as follows:
public void AddToInfo(SerializationInfo info)
{
byte[] b = ToBinary();
info.AddValue("X", b, typeof(byte[]));
}
public byte[] ToBinary()
{
return ((MemoryStream)BaseStream).ToArray();
}
- SerializationReader class: Added an overload for the GetReader static method and cleaned up duplicate code. Here is the new source:
public static SerializationReader GetReader(SerializationInfo info)
{
byte[] byteArray = (byte[])info.GetValue("X", typeof(byte[]));
return GetReader(byteArray);
}
public static SerializationReader GetReader(byte[] buffer)
{
return new SerializationReader(new MemoryStream(buffer));
}
- TestObject class: Added methods to make use of the modified SerializationReader and SerializationWriter classes.
private TestObject(byte[] buffer)
{
SerializationReader sr = SerializationReader.GetReader(buffer);
id1 = sr.ReadInt64();
id2 = sr.ReadInt64();
id3 = sr.ReadInt64();
s1 = sr.ReadString();
s2 = sr.ReadString();
s3 = sr.ReadString();
s4 = sr.ReadString();
dt1 = sr.ReadDateTime();
dt2 = sr.ReadDateTime();
b1 = sr.ReadBoolean();
b2 = sr.ReadBoolean();
b3 = sr.ReadBoolean();
e1 = sr.ReadByte();
d1 = sr.ReadDictionary();
}
public byte[] ToBinary()
{
SerializationWriter sw = SerializationWriter.GetWriter();
sw.Write(id1);
sw.Write(id2);
sw.Write(id3);
sw.Write(s1);
sw.Write(s2);
sw.Write(s3);
sw.Write(s4);
sw.Write(dt1);
sw.Write(dt2);
sw.Write(b1);
sw.Write(b2);
sw.Write(b3);
sw.Write(e1);
sw.Write(d1);
return sw.ToBinary();
}
public static TestObject FromBinary(byte[] buffer)
{
return new TestObject(buffer);
}
- MainClass class: Added code to the PrintSize method to show the altered serialization sizes. Also added an AlteredPerfTest method that mimics the PerfTest method, using the new serialization method instead. Finally, called the AlteredPerfTest method in the Main of the application. Here are the modified methods:
static void PrintSize()
{
TestObject testObj = new TestObject();
MemoryStream ms = new MemoryStream();
new BinaryFormatter().Serialize(ms, testObj);
Console.WriteLine("Binary serialized length: {0}", ms.Length);
ms.Position = 0;
new SoapFormatter().Serialize(ms, testObj);
Console.WriteLine("Soap serialized length: {0}", ms.Length);
ms.Position = 0;
ms.SetLength(0);
new BinaryFormatter().Serialize(ms, testObj.ToBinary());
Console.WriteLine("Altered binary serialized length: {0}", ms.Length);
ms.Position = 0;
new SoapFormatter().Serialize(ms, testObj.ToBinary());
Console.WriteLine("Altered soap serialized length: {0}", ms.Length);
}
static void AlteredPerfTest(int count)
{
Console.WriteLine("\nRunning altered serialization test for {0} iterations", count);
TestObject obj1 = new TestObject();
DateTime t = DateTime.Now;
for (int i = 0; i < count; i++)
{
MemoryStream ms = new MemoryStream();
BinaryFormatter bf = new BinaryFormatter();
bf.Serialize(ms, obj1.ToBinary());
ms.Position = 0;
TestObject obj2 = TestObject.FromBinary((byte[])bf.Deserialize(ms)); // deserialize again
}
TimeSpan ts = DateTime.Now - t;
Console.WriteLine("Altered serialization done in {0} uS per cycle", ts.TotalMilliseconds * 1000.0 / count);
} // AlteredPerfTest
static void Main(string[] args)
{
PrintSize();
PerfTest(250000);
AlteredPerfTest(250000);
}
Okay. I'm pretty sure I included all of my changes. I tried to leave as much of the original code intact as possible so that the comparison would be as accurate as it could be. The main difference in the new code is that instead of serializing and deserializing the TestObject (a user-defined type) these operations are performed on a byte array (a built-in type.)
I found this article because I was considering posting my first article on the same topic. My method is very similar, utilizing the BinaryFormatter, but with the addition that I included in this response. Hopefully, it will help those of you that need to squeeze out some extra performance.
Happy coding,
David
|
|
|
|

|
Thank you for your class….it’s helping recover from a MarshalByRefObject mixed with Serialization hangover.
What is the best way to handle base classes? Or abstract classes?
I would LIKE to keep serialization of the base class variables in the base class, but I am getting an exception complaining about serializing the same class twice with the AddToInfo call.
thanks again for the class!
Gene
|
|
|
|

|
The exception you're getting is because in AddToInfo I simply add the serialized data to SerializationInfo with the (not very imaginative) key "X". There will be a duplicate key exception on the second call. The simple and quick way to get you running is to add a key parameter to the SerializationWriter's AddToInfo method, e.g.:
public void AddToInfo (SerializationInfo info, string key) {
byte[] b = ((MemoryStream)BaseStream).ToArray();
info.AddValue (key, b, typeof(byte[]));
}
And similarly the SerializationReader's GetReader method, eg:
public static SerializationReader GetReader (SerializationInfo info, string key) {
byte[] byteArray = (byte[])info.GetValue (key, typeof(byte[]));
MemoryStream ms = new MemoryStream (byteArray);
return new SerializationReader (ms);
}
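It doesn't matter what the key string is, as long as it's different between base and derived classes, and the same within each class. Using something like typeof(MyClass).Name in each class would work nicely; this.GetType().Name would not, because it returns the runtime type's name and so yields the same key in both the base and the derived class.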
A better answer may be to do some clever reflection to do this automatically, but I need to think about that.
Good luck!
Tim
|
|
|
|

|
Nice work!
Integral value types are fixed in size and therefore don't need a byte prefix, and the serializer writes them using the base raw writer.
Is there a way to leverage that in the collection serialization/deserialization? Maybe only for a subset T,U : where T is a ValueType or something?
Also, the deserialization does not preallocate the collection size:
public IDictionary<T,U> ReadDictionary<T,U>()
{
int count = ReadInt32();
if (count < 0) return null;
IDictionary<T,U> d = new Dictionary<T,U>(); <=== HERE: new Dictionary<T,U>(count)
for (int i = 0; i < count; i++) d[(T)ReadObject()] = (U)ReadObject();
return d;
}
This will induce excessive GC and could harm performance for any IDictionary of considerable size.
(See Unintentional Excessive Garbage Collection)
Nuri
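A minimal sketch of the preallocation point, using a plain BinaryReader/BinaryWriter and a fixed Dictionary<string,int> rather than the article's generic ReadObject-based version. DictionaryIo is a hypothetical name, and null handling is left out for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class DictionaryIo
{
    // Writes the entry count first so the reader can size the dictionary up front.
    public static void Write(BinaryWriter writer, Dictionary<string, int> source)
    {
        writer.Write(source.Count);
        foreach (KeyValuePair<string, int> pair in source)
        {
            writer.Write(pair.Key);
            writer.Write(pair.Value);
        }
    }

    public static Dictionary<string, int> Read(BinaryReader reader)
    {
        int count = reader.ReadInt32();
        // Passing the count as the initial capacity avoids repeated
        // internal resizing (and the garbage it creates) while filling.
        Dictionary<string, int> result = new Dictionary<string, int>(count);
        for (int i = 0; i < count; i++)
        {
            string key = reader.ReadString();
            int value = reader.ReadInt32();
            result[key] = value;
        }
        return result;
    }
}
```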
|
|
|
|

|
Hi Tim
Excellent work - I was going to do something similar myself but this gave me a great start for a proof of concept. Have you added any further optimizations since the release code?
Here are some optimizations I have found useful:
1) In ReadObject(), put a specific test for the null case at the top of the list rather than letting it drop to the default case. In fact, maybe the cases should be put in order of likely usage.
2) BinaryReader has a protected method for reading ints stored in a compact form. I have made it available with the following code. Very useful for things like counts where the number is usually small but could be large.
public new int Read7BitEncodedInt()
{
return base.Read7BitEncodedInt();
}
3) This one was the killer for me. I am serializing a lot of object[] objects which contained all values and was using WriteObject() but adding this specific code produced amazing results! Note the use of Read7BitEncodedInt as mentioned above - only takes a single byte for all of my usage! Do the same for ReadByteArray() and ReadChar() too.
public object[] ReadObjectArray()
{
int count = base.Read7BitEncodedInt();
object[] result = new object[count];
for(int i = 0; i < count; i++)
{
result[i] = ReadObject();
}
return result;
}
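A minimal sketch of how the 7-bit encoding behaves, assuming small subclasses of BinaryWriter and BinaryReader that expose the framework's Write7BitEncodedInt and Read7BitEncodedInt under wrapper names. CompactWriter, CompactReader and CompactIntDemo are hypothetical names, not part of the article's code.

```csharp
using System;
using System.IO;

public class CompactWriter : BinaryWriter
{
    public CompactWriter(Stream output) : base(output) { }

    // Writes 7 bits per byte, using the high bit as a "more bytes follow" flag.
    public void WriteCompactInt(int value) { Write7BitEncodedInt(value); }
}

public class CompactReader : BinaryReader
{
    public CompactReader(Stream input) : base(input) { }

    public int ReadCompactInt() { return Read7BitEncodedInt(); }
}

public static class CompactIntDemo
{
    // Returns how many bytes the compact encoding needs for a value.
    public static int EncodedLength(int value)
    {
        using (MemoryStream ms = new MemoryStream())
        using (CompactWriter writer = new CompactWriter(ms))
        {
            writer.WriteCompactInt(value);
            writer.Flush();
            return (int)ms.Length;
        }
    }

    // Writes a value and reads it back, to show the round trip.
    public static int RoundTrip(int value)
    {
        MemoryStream ms = new MemoryStream();
        CompactWriter writer = new CompactWriter(ms);
        writer.WriteCompactInt(value);
        writer.Flush();
        ms.Position = 0;
        using (CompactReader reader = new CompactReader(ms))
        {
            return reader.ReadCompactInt();
        }
    }
}
```

Values from 0 to 127 take a single byte, which matches the "only takes a single byte for all of my usage" observation for small counts. Negative values are the worst case and take five bytes, so the encoding suits counts and lengths better than arbitrary ints.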
Future optimizations - I've not tried these yet but will soon.
1) The ObjType enum is only using a fraction of the 255 available entries. I am going to try using some for special casing. ie a ZeroInt32Type/ZeroInt16Type etc. (one for each numeric type). Possibly the same for One/MinValue/MaxValue. String could have EmptyStringType; DateTime could have MinValue/MaxValue and EmptyTrueBool/FalseBool would also save some space. Anything where a type and a 'common' value could be defined really.
2) In the ReadObjectArray() method above, I mentioned I am storing values read from a database and there may be cases where there are 'runs' of null values. By having a specific "NullListType" ObjType and storing a 7BitEncodedInt for runs of 3 or more null values there is potential for a lot of saved space depending on the data.
Cheers
Simon
|
|
|
|

|
Hello,
please be aware that array lengths may exceed the range of Int32.
This (rare) case will mess up your serialization stream thoroughly as you always write the array length as an Int32 to the stream.
Cheers,
Michael
|
|
|
|

|
I've added a bit of code to the code here. I kept forgetting to add the writer to the info, and it turned out the error was a bit hard to track down
(at least until I found out what "__X__" not found meant) :-p
This code will throw an error if you forget to add the writer to the info. The error will be thrown when the object is disposed.
If you're using a using clause, you know when the object is disposed.
protected override void Dispose(bool disposing)
{
if (!Added)
throw new InvalidOperationException("Writer was not added to info");
base.Dispose(disposing);
}
private bool Added;
/// Adds the SerializationWriter buffer to the SerializationInfo at the end of GetObjectData().
public void AddToInfo(SerializationInfo info)
{
byte[] b = ((MemoryStream)BaseStream).ToArray();
info.AddValue(MEMBERNAME, b, typeof(byte[]));
Added = true;
}
|
|
|
|

|
Here is my suggestion for supporting the serialization of struct Guid. Simply treat it as a byte[] array. Let me know if there is a better way to do this.
Add these two methods to FastSerializer.cs:
/// Writes a Guid to the buffer.
public void Write( Guid value ) { Write( value.ToByteArray() ); }
/// Reads a Guid from the buffer.
public Guid ReadGuid() { return new Guid( ReadByteArray() ); }
Note: You may have to rename the "Write( ICollection c )" method to "WriteList" because it is firing instead of the "Write( byte[] b )" method (see previous post on this).
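One possible alternative, sketched with a plain BinaryWriter/BinaryReader: because a Guid is always exactly 16 bytes, the bytes can be written raw and read back with ReadBytes(16), with no length prefix. GuidIo is a hypothetical helper name; in the article's classes the equivalent would be a Write(Guid)/ReadGuid pair that bypasses the length-prefixed byte[] path.

```csharp
using System;
using System.IO;

public static class GuidIo
{
    // BinaryWriter.Write(byte[]) emits the raw bytes with no length prefix,
    // so a Guid costs exactly 16 bytes in the stream.
    public static void Write(BinaryWriter writer, Guid value)
    {
        writer.Write(value.ToByteArray());
    }

    // Reads back the 16 bytes written above.
    public static Guid Read(BinaryReader reader)
    {
        return new Guid(reader.ReadBytes(16));
    }
}
```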
|
|
|
|

|
Thanks to the original author - code is very useful. In addition to the Guid stuff mentioned earlier if you wish to use Guids in a generic list or as indexers for a dictionary you should also do the following :
Quote:
Note: You may have to rename the "Write( ICollection c )" method to "WriteList" because it is firing instead of the "Write( byte[] b )" method (see previous post on this).
1. You do need to make the above change
2. Add a new enum value guidType to ObjType
3. Add the following to WriteObject
case "Guid": Write((byte)ObjType.guidType);
Write((Guid)obj);
break;
4. Add the following to ReadObject
case ObjType.guidType: return ReadGuid();
|
|
|
|

|
I was testing out a byte[] array in TestObject, and it wasn't working properly. Turns out the "Write( ICollection c )" method fires instead of the "Write( byte[] b )" method.
As a workaround, I renamed the first method "WriteList". Hopefully, someone can post a better solution here so we can continue to use "Write" with polymorphism.
Thanks.
|
|
|
|

|
Well spotted! Looks like a framework bug to me. There's no way it should latch onto the Write(ICollection) overload when there is an exact match for Write(byte[]). I'll fire this off to MS and see what they say.
The intention (having implemented the IDictionary overload) was to provide support for IList, but Write(IList) has the same problem. One solution is to use the concrete class, ie change Write(ICollection) to Write(List).
This works fine, but I'm also wondering about handling array types other than just byte[], so I may come up with something better yet.
Cheers,
Tim
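A possible explanation for the overload behaviour discussed in this thread, offered as a sketch rather than a diagnosis of the article's code: C# overload resolution looks at methods declared on the most derived class first, and only falls back to base class methods if none of those is applicable. Assuming Write(byte[]) is inherited from BinaryWriter while Write(ICollection) is declared on the derived writer, the ICollection overload wins even though byte[] is an exact match in the base class. BaseWriter and DerivedWriter below are hypothetical stand-ins.

```csharp
using System;
using System.Collections;

public class BaseWriter
{
    public string Write(byte[] buffer) { return "Write(byte[])"; }
}

public class DerivedWriter : BaseWriter
{
    // byte[] implements ICollection, so this overload is applicable to a
    // byte[] argument and, being declared on the derived class, is preferred.
    public string Write(ICollection collection) { return "Write(ICollection)"; }

    // Calling through base forces the inherited overload.
    public string WriteBytes(byte[] buffer) { return base.Write(buffer); }
}

public static class OverloadDemo
{
    public static string CallOnDerived()
    {
        return new DerivedWriter().Write(new byte[0]);
    }

    public static string CallThroughBase()
    {
        BaseWriter writer = new DerivedWriter();
        return writer.Write(new byte[0]);
    }
}
```

Under that assumption, renaming the collection method (as suggested with "WriteList") or declaring a byte[] overload on the derived class are both reasonable ways to get the expected call.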
|
|
|
|

|
Just a small tip: you can use the TypeCode enumeration instead of redefining ObjType
TypeCode typeCode = Type.GetTypeCode(obj.GetType());
switch(typeCode)
{
case TypeCode.String:
...
}
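A slightly fuller sketch of the TypeCode switch above, with a null check added because obj.GetType() throws on a null reference. TypeCodeSketch and Describe are hypothetical names, and the returned labels stand in for whatever the serializer would actually write.

```csharp
using System;

public static class TypeCodeSketch
{
    // Maps a value to a label using the built-in TypeCode enumeration.
    public static string Describe(object obj)
    {
        if (obj == null) return "null";

        switch (Type.GetTypeCode(obj.GetType()))
        {
            case TypeCode.String: return "string";
            case TypeCode.Int32: return "int32";
            case TypeCode.Int64: return "int64";
            case TypeCode.Boolean: return "bool";
            case TypeCode.DateTime: return "datetime";
            // Arrays, collections and user-defined classes all report
            // TypeCode.Object, so they still need separate handling.
            default: return "other";
        }
    }
}
```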
|
|
|
|

|
I did have a quick look at TypeCodes, but opted for a simple switch on name for ease of extensibility as I anticipate adding dictionaries, lists etc in due course. On second look I see that TypeCodes are extensible via the IConvertible interface. Interesting area - I'll have a play! Thanks for the hint.
Tim
|
|
|
|

|
It would be great to see support for .NET 2.0 nullable types in this library. Also, it seems to me that if SerializationWriter.WriteObject were instead just another overload for SerializationWriter.Write, then one could avoid reflecting on the types of strongly-typed collections and dictionaries as is currently being done, further improving performance. The overload that takes an Object parameter would then be the fallback method if no type-specific overload of Write was found.
|
|
|
|

|
I agree that nullable type support needs adding. I'll get around to that soon.
Having a plain Write(object) certainly looks like a good idea too.
Strongly typed dictionaries/lists can certainly be done without the need to hold the object type, but in my specific case I needed a Dictionary, so only half strongly typed, so I implemented the general case for simplicity. I'll look into the alternatives.
Thanks,
Tim
|
|
|
|

|
We are using CSLA for our business objects, which uses remoting to communicate back to the server. There is an issue with standard serialisation where it searches a collection of objects for duplicates so that it does not have to send duplicate data. When we have more than 300 objects in a collection, performance dies.
This article gives us a technique to solve this problem. We are more concerned with the speed of the serialisation than the amount of data sent, so duplicates are not an issue.
Even better we can apply it to only the objects that need it.
We use code generation to create our business objects so the extra code is simple to add.
Great article.
Scott Ellis
|
|
|
|

|
Hi,
we also use CSLA but we have not done any performance tests so far when
using remoting.
Did you do any performance testing with CSLA objects, or do you have any
recommendations on how to use it?
Thanks for any info
Frank
|
|
|
|

|
We have used CSLA for about 18 months now, the performance is not as high as I had hoped, but has not been bad enough that we have got around to dealing with it - until now.
We are now doing some profiling, and some of the issues are CSLA related (use of reflection in some areas where we do not need to as we code generate our objects) which will be easy to fix, and other issues which are not CSLA at all, but how we have hooked it into our UI.
Many people suggest that you do not try to optimise anything until you have profiled your app, and after doing this I agree. The speed issues are almost never where you think they are. They are often much easier to fix as well!
We did not do performance testing.
Our app runs about 20 clients, who are using it full time to manage building maintenance work.
Scott
|
|
|
|

|
Hi,
thanks for the information.
I agree that profiling the app is necessary before doing
any optimizations.
Greetings
Frank
|
|
|
|

|
Good article.
I think the speed benefits are well worth the cost (implementing 2 short methods).
I will try this in my own work.
Thanks
|
|
|
|
|

|
Tim,
C# != C++ ?
ATB
Jerry
|
|
|
|

|
Absolutely right. Not my doing I'm afraid. The categorisation is almost completely wrong as it's not "C++/MFC", nor is it "ASP.NET", nor "SOAP and XML". I've dropped the site organisers a line and asked for this to be corrected.
Thanks,
Tim
|
|
|
|

|
It's no wonder that .NET serialization is slower. But I prefer it, as I do not want to write all that stuff myself. This would be the last thing I would do during optimization.
I hate deep copies too as it is so much code I have to write for such a simple thing.
|
|
|
|

|
I'm working on IrDA transfers between two Pocket PCs...
I know how hard and boring serialization and deserialization are.
The good thing is that you know how you're serializing...
E.g. if you have a slow internet connection and need to send those objects over the internet, you could use SharpZipLib to compress the byte array before sending.
.NET is a wonderful technology; you can implement such things in an easy-to-use way.
I'm just saying that the serialization stuff isn't a bad thing, and in some projects you really need to use it. Ah, the article got my five.
Sorry for my awful English.
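A minimal sketch of compressing a serialized byte array before sending it. Note the swap: it uses the framework's own GZipStream from System.IO.Compression rather than SharpZipLib, simply to keep the example free of external dependencies. CompressionSketch is a hypothetical name.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class CompressionSketch
{
    public static byte[] Compress(byte[] data)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (GZipStream gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(data, 0, data.Length);
            } // disposing the GZipStream flushes the remaining compressed data

            // ToArray still works after the underlying stream has been closed.
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] data)
    {
        using (MemoryStream input = new MemoryStream(data))
        using (GZipStream gzip = new GZipStream(input, CompressionMode.Decompress))
        using (MemoryStream output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Whether compression pays off depends on the data and the link: very small or already compact payloads can come out larger once the gzip header is added.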
|
|
|
|
Transparently boosting serialization performance and shrinking the serialized object's size.
| Type | Article |
| Licence | |
| First Posted | 19 May 2006 |
| Views | 126,241 |
| Bookmarked | 129 times |
|
|