This article demonstrates how to convert bytes into user-defined data structures using dynamically emitted code.
Sasha Goldshtein wrote an excellent article on this topic, analyzing various ways to read user-defined structs from byte arrays. This article builds on his work and proposes a faster and more generic alternative using code generation. The attached code includes both Sasha's original code and an open source toolkit that helps with the code generation.
The fastest solution shown in Sasha's article uses the fixed keyword for non-generic types:
static unsafe Packet ReadUsingPointer(byte[] data)
{
    fixed (byte* packet = &data[0])
        return *(Packet*)packet;
}
To make this truly useful, we need to have a generic method:
static unsafe T ReadUsingPointer<T>(byte[] data)
{
    fixed (byte* packet = &data[0])
        return *(T*)packet;
}
Unfortunately, due to the limitations of C# 3.0, it is not possible to write a generic method T ReadIntoStruct<T>(byte[] data) this way: replacing Packet with a generic T simply does not compile, even if T is restricted to value types (struct). To compile, T must adhere to a different set of requirements, set forth in §18.2 of the C# language specification v3.0:
An unmanaged-type is any type that isn't a reference-type and doesn't contain reference-type fields at any level of nesting. In other words, an unmanaged-type is one of the following:
• sbyte, byte, short, ushort, int, uint, long, ulong, char, float, double, decimal, or bool.
• Any enum-type.
• Any pointer-type.
• Any user-defined struct-type that contains fields of unmanaged-types only.
Strings are not in that list, even though you can use them in structs. Fixed-size arrays of unmanaged-types are allowed.
The proposed solution is to dynamically generate an identical method for each given type and to expose it through a generic interface, ICall<T>. A static method is also generated, so that the cost of calling static and interface methods can be compared.
To avoid any strange behavior when T does not satisfy the unmanaged-type requirements, we have to validate type T recursively against all of the rules; this is done by TypeExtensions.ThrowIfNotUnmanagedType(). I just hope that some day the Type object will have a simple property to check instead of all the code I had to write, but for now it's an extension method on Type.
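The recursive validation described above can be sketched roughly as follows. This is not the article's TypeExtensions.ThrowIfNotUnmanagedType() implementation; the class name UnmanagedTypeCheck and both method names are hypothetical, and the sketch applies only the four rules quoted from the specification.

```csharp
using System;
using System.Reflection;

// Hypothetical helper, written for illustration only.
public static class UnmanagedTypeCheck
{
    // Returns true when the type satisfies the four unmanaged-type rules quoted above.
    public static bool IsUnmanagedType(Type type)
    {
        // Rules 1-3: primitive types, decimal, enums and pointers qualify directly.
        // Checking IsPrimitive first also stops the recursion, because a
        // primitive such as Int32 contains a field of its own type.
        if (type.IsPrimitive || type == typeof(decimal) || type.IsEnum || type.IsPointer)
            return true;

        // Reference types (string, arrays, classes, interfaces) never qualify.
        if (!type.IsValueType)
            return false;

        // Rule 4: a struct qualifies only if every instance field qualifies.
        const BindingFlags flags =
            BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;
        foreach (FieldInfo field in type.GetFields(flags))
        {
            if (!IsUnmanagedType(field.FieldType))
                return false;
        }
        return true;
    }

    // Extension-method form that throws, similar in spirit to the article's helper.
    public static void ThrowIfNotUnmanaged(this Type type)
    {
        if (!IsUnmanagedType(type))
            throw new ArgumentException(type.FullName + " is not an unmanaged type.");
    }
}
```

With this sketch, typeof(Packet).ThrowIfNotUnmanaged() would run once before any code is emitted for Packet, so an unsuitable type fails early with a clear message instead of producing strange behavior at read time.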
Common Intermediate Language (CIL) is fairly complex, but a deep understanding of it is not needed to accomplish method generation. First, I used Reflector to view the CIL generated for the prototype methods. Then, I adapted an excellent OSS library, Business Logic Toolkit for .NET, to emit CIL identical to the prototype but for a different type. This article gives a good introduction on how to use the toolkit's emit functionality. In my code, I changed all the helper classes into extension methods, making the process much more streamlined.
Here is what the method generation looks like. Note the replacement of ReadingStructureData.Packet with the type of another item.
var emit = methodBuilder.GetILGenerator();
var l0 = emit.DeclareLocal(typeof (byte).MakeByRefType(), true); // pinned byte& local
var l1 = emit.DeclareLocal(itemType);                            // holds the result
var L_0012 = emit.DefineLabel();
emit
    .ldarg(methodBuilder, param).ldc_i4_0().ldelema(typeof (byte)) // address of the first array element
    .stloc(l0)                                                     // pin it
    .ldloc(l0).conv_i().ldobj(itemType)                            // read an itemType value from that address
    .stloc(l1)
    .leave_s(L_0012)
    .MarkLabelExt(L_0012)
    .ldloc(l1)
    .ret();
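For readers without the toolkit at hand, roughly the same instruction sequence can be emitted with the standard System.Reflection.Emit API alone. The sketch below is an assumption-laden illustration, not the article's code: it uses a DynamicMethod instead of a MethodBuilder, the class name StructReader is hypothetical, and it performs no length or unmanaged-type validation.

```csharp
using System;
using System.Reflection.Emit;

// Hypothetical helper, written for illustration only.
public static class StructReader
{
    // Builds a delegate that reads a T from the start of a byte array.
    // Assumes T is an unmanaged type and the array holds at least sizeof(T) bytes;
    // neither assumption is checked here.
    public static Func<byte[], T> Create<T>() where T : struct
    {
        var method = new DynamicMethod(
            "Read_" + typeof(T).Name,
            typeof(T),
            new[] { typeof(byte[]) },
            typeof(StructReader).Module,
            true); // skip visibility checks so non-public structs also work

        ILGenerator il = method.GetILGenerator();
        // Pinned byte& local: keeps the array from moving while it is read.
        LocalBuilder pinned = il.DeclareLocal(typeof(byte).MakeByRefType(), true);

        il.Emit(OpCodes.Ldarg_0);               // the byte[] argument
        il.Emit(OpCodes.Ldc_I4_0);              // index 0
        il.Emit(OpCodes.Ldelema, typeof(byte)); // address of the first element
        il.Emit(OpCodes.Stloc, pinned);         // pin it
        il.Emit(OpCodes.Ldloc, pinned);
        il.Emit(OpCodes.Conv_I);                // managed reference -> native int
        il.Emit(OpCodes.Ldobj, typeof(T));      // copy a T from that address
        il.Emit(OpCodes.Ret);

        return (Func<byte[], T>)method.CreateDelegate(typeof(Func<byte[], T>));
    }
}
```

A caller would create the delegate once per type, for example var readInt = StructReader.Create<int>();, and reuse it for every read, since emitting the method is far more expensive than calling it.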
Using the Code
The sample creates two methods: one exposed through an interface, which requires an instance of an object, and one exposed through a delegate to a static method.
ICall<Packet> interfaceObj;
Func<byte[], Packet> staticDelegate;
WrapperFactory.Instance.CreateDynamicMethods(out interfaceObj, out staticDelegate);
var staticResult = staticDelegate(sourceData);
var interfaceResult = interfaceObj.ReadItem(sourceData);
Even though the numbers change from run to run, the overall result is that the generated code is close in speed to the prototype. Also note the time it takes to emit the new code: even though the per-type cost would be reduced when multiple types are wrapped, it is still significant.
Calling static prototype: 199.00
Calling interface prototype: 214.00
Creating dynamic methods: 14.00
Calling generated static: 213.00 (07% slower than static prototype)
Calling generated interface: 221.00 (03% slower than interface prototype)
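The article does not show how these numbers were collected. One plausible way to time a reader delegate is sketched below, assuming a Stopwatch-based loop; the class name CallTimer, the method name Measure, and the warm-up call are all hypothetical choices.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper, written for illustration only.
public static class CallTimer
{
    // Returns the elapsed milliseconds for `iterations` calls of `read` on `data`.
    public static double Measure<T>(Func<byte[], T> read, byte[] data, int iterations)
    {
        read(data); // warm-up call so JIT compilation is not included in the timing

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            read(data);
        }
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }
}
```

Because the figures vary from run to run, repeating the measurement several times and comparing the spread gives a fairer picture than any single run.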
Points of Interest
Even though the C# specification states that the statements fixed (byte* p = array) and fixed (byte* p = &array[0]) are equivalent, the generated IL tells a completely different story: the first statement produces significantly more IL instructions. An issue has been created at Microsoft Connect, where you can view the IL code difference.
- 2/14/2009 - Initial upload
- 2/19/2009 - Updated to remove external code dependencies