Quick Compression Utility for C# Byte Arrays
A quick but useful utility for compression and decompression of byte arrays

Introduction
While improving network bandwidth utilization for a high-performance application, I found that compressing a large-ish object (6 MB) prior to transmission improved the performance of the network call by roughly a factor of 10. However, the MSDN sample code for the GZipStream class left a lot to be desired. It's arcane and poorly written, and it took me far too long to understand it.
When I had finally figured out what was going on, I was left with, IMHO, a fairly useful little utility for generalized compression and decompression of byte arrays. It uses the GZipStream class that comes standard as part of the System.IO.Compression namespace.
My utility consists of a single class, Compressor, with two static methods, Compress() and Decompress(). Both methods take in a byte array as a parameter, and return a byte array. For Compress(), the parameter is the uncompressed byte array, and the return is the compressed byte array and vice versa for Decompress().
During compression, the compressed bytes are prepended with an Int32 header containing the number of bytes in the uncompressed byte array. This header is used during decompression to allocate the byte array to be returned by Decompress().
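The class itself is not reproduced in this article, so here is a minimal sketch of what the description above implies. It is an assumption about the shape of the code, not the original implementation: the HeaderSize constant, the read loop, and the exception thrown on truncated input are all illustrative choices.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class Compressor
{
    // Size in bytes of the Int32 length header (hypothetical name).
    private const int HeaderSize = sizeof(int);

    public static byte[] Compress(byte[] buffer)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Write the uncompressed length first so Decompress() can size its output.
            ms.Write(BitConverter.GetBytes(buffer.Length), 0, HeaderSize);

            // leaveOpen = true keeps ms usable after the GZipStream is disposed;
            // disposing the GZipStream flushes the final compressed block.
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                gz.Write(buffer, 0, buffer.Length);
            }
            return ms.ToArray();
        }
    }

    public static byte[] Decompress(byte[] gzBuffer)
    {
        // Read the header and allocate the result array up front.
        int length = BitConverter.ToInt32(gzBuffer, 0);
        byte[] buffer = new byte[length];

        using (MemoryStream ms = new MemoryStream(gzBuffer, HeaderSize, gzBuffer.Length - HeaderSize))
        using (GZipStream gz = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Read() may return fewer bytes than requested, so loop until the array is full.
            int offset = 0;
            while (offset < length)
            {
                int read = gz.Read(buffer, offset, length - offset);
                if (read == 0)
                    throw new InvalidDataException("Compressed data ended before the expected length.");
                offset += read;
            }
        }
        return buffer;
    }
}
```

Note that BitConverter writes the header in the byte order of the machine it runs on, so this sketch assumes both ends of the connection share the same endianness. The Int32 header also limits the uncompressed payload to what a single byte array can hold.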
Using the Code
Simply convert the object (or collection of objects) you wish to compress into a byte array. I find that a bit of custom serialization using the BitConverter and/or the Buffer classes can work well for this. For classes with a fixed record size (i.e. those that contain only value types and no strings), you can also dip down into the Marshal class (see example below) to convert an object into a pointer and then copy the memory it points to into your buffer.
Once you have your byte array, simply pass it to Compressor.Compress() to get a compressed array for transmission. On the far end, simply pass the compressed byte array to Compressor.Decompress() to recover the original byte array. Voilà!
//
// Sample Compression - how to send 100,000 stock prices across town in 1 second.
//
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

public struct StockPrice
{
    public int ID;
    public double bidPrice;
    public double askPrice;
    public double lastPrice;

    // Marshaled size of one record in bytes (includes any alignment padding).
    public static readonly int sz = Marshal.SizeOf(typeof(StockPrice));

    // Copies this struct's raw bytes into buffer, starting at startIndex.
    public void CopyToBuffer(byte[] buffer, int startIndex)
    {
        IntPtr ptr = Marshal.AllocHGlobal(sz);
        try
        {
            Marshal.StructureToPtr(this, ptr, false);
            Marshal.Copy(ptr, buffer, startIndex, sz);
        }
        finally
        {
            // Release the unmanaged block even if a copy throws.
            Marshal.FreeHGlobal(ptr);
        }
    }

    // Rebuilds a StockPrice from sz bytes of buffer, starting at startIndex.
    public static StockPrice CopyFromBuffer(byte[] buffer, int startIndex)
    {
        IntPtr ptr = Marshal.AllocHGlobal(sz);
        try
        {
            Marshal.Copy(buffer, startIndex, ptr, sz);
            return (StockPrice)Marshal.PtrToStructure(ptr, typeof(StockPrice));
        }
        finally
        {
            Marshal.FreeHGlobal(ptr);
        }
    }
}
static class Program
{
    // Assume that you are starting with a populated dictionary of StockPrice objects,
    // which is an instance of Dictionary<int, StockPrice> and is keyed by the ID field.
    // (Left empty here; fill it before running.)
    static readonly Dictionary<int, StockPrice> StockPriceDict = new Dictionary<int, StockPrice>();

    static void Main()
    {
        // Pack every record into one contiguous buffer.
        byte[] buffer = new byte[StockPriceDict.Count * StockPrice.sz];
        int startIndex = 0;
        foreach (StockPrice price in StockPriceDict.Values)
        {
            price.CopyToBuffer(buffer, startIndex);
            startIndex += StockPrice.sz;
        }
        byte[] gzBuffer = Compressor.Compress(buffer);

        // now uncompress the bytes and recover the original dictionary.
        // This is *much* faster than
        // using .NET Remoting or similar techniques
        Dictionary<int, StockPrice> newStockPriceDict = new Dictionary<int, StockPrice>();
        byte[] buffer1 = Compressor.Decompress(gzBuffer);
        startIndex = 0;
        while (startIndex < buffer1.Length)
        {
            StockPrice stockPrice = StockPrice.CopyFromBuffer(buffer1, startIndex);
            newStockPriceDict[stockPrice.ID] = stockPrice;
            startIndex += StockPrice.sz; // advance to the next record; without this the loop never ends
        }
    }
}
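The sample above takes the Marshal route. For comparison, the BitConverter and Buffer approach mentioned earlier might look something like the following sketch. PriceSerializer, RecordSize, and the choice to pass the three prices as a double[] are hypothetical, introduced only for illustration.

```csharp
using System;

public static class PriceSerializer
{
    // One record: a 4-byte ID followed by three 8-byte prices, with no padding.
    public const int RecordSize = sizeof(int) + 3 * sizeof(double);

    // Writes the ID and the first three entries of prices into buffer at offset.
    public static void Write(byte[] buffer, int offset, int id, double[] prices)
    {
        // BitConverter turns a single primitive into its byte representation.
        Buffer.BlockCopy(BitConverter.GetBytes(id), 0, buffer, offset, sizeof(int));
        // Buffer.BlockCopy copies raw bytes between primitive arrays;
        // its offsets and count are measured in bytes, not elements.
        Buffer.BlockCopy(prices, 0, buffer, offset + sizeof(int), 3 * sizeof(double));
    }

    // Reads one record from buffer at offset, fills prices, and returns the ID.
    public static int Read(byte[] buffer, int offset, double[] prices)
    {
        int id = BitConverter.ToInt32(buffer, offset);
        Buffer.BlockCopy(buffer, offset + sizeof(int), prices, 0, 3 * sizeof(double));
        return id;
    }
}
```

Because the fields are packed by hand, each record here is 28 bytes. The Marshal version typically reports a larger size for the same fields, since the marshaled layout pads the int so that the doubles stay aligned. Either way, the resulting byte array can be handed straight to Compressor.Compress().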
Points of Interest
If there is one thing I would improve about C#, it is its ability to manipulate objects as byte arrays. This capability is absolutely critical for high-performance computing and doesn't get enough respect from the C# product team. It seems as though the functionality was only included for backwards compatibility with COM. However, it's probably the bit of code I rely on most when working in high-performance areas usually reserved for C++.
History
- v1.0 - 10th January, 2007