Compression and Decompression Snippets





A few extension methods for compressing strings or byte arrays (files, images, objects: basically anything that can be represented as a byte array)
Introduction
In this post, I would like to look at a few extension methods that help with compressing either a string or a plain byte array (which could be a file, an image, an object: basically anything that can be represented as a byte array).
The Backstory
The need for this came up while I was participating in a hackathon in which players code bots for arena-style matches. You can only observe the play and output diagnostic messages (no debugging, since it takes two or more players to run a match), so I opted to write out the current state of the game as a diagnostic message, which I could then copy into my local implementation to replay that particular round.
The problem was that, under certain circumstances, the game state could grow so big that the system would cut off part of the output (which is a pain, but fair: you don't want someone DoS-ing your game, right?). So I ended up writing a quick snippet that compressed my output. I would then paste that output into a file and run a unit test that decompressed the file content so I could debug that round.
The Snippets
One thing to note: the snippets as initially written were more compact and hacky, but for the purpose of this post I rewrote them to be more extensible and configurable, borrowing some ideas from functional programming and Python (bonus points if you can spot them).
Also note that the full snippet is presented at the end.
Compression Snippet
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
public static class DebugExtensions
{
    private static readonly Func<string, byte[]> DefaultEncoding = (s) => Encoding.UTF8.GetBytes(s);
    private static readonly Func<byte[], byte[]> DefaultCompression = (b) => b.Compress();
    public static string Compress(this string instance,
        Func<string, byte[]> encodingFunction = null, Func<byte[], byte[]> compressionFunction = null)
    {
        // set the encoding: use the provided one, or the default if none was provided
        encodingFunction = encodingFunction ?? DefaultEncoding;
        // set the compression algorithm: use the provided one, or the default if none was provided
        compressionFunction = compressionFunction ?? DefaultCompression;
        // encode the provided string into its byte array form
        byte[] buffer = encodingFunction(instance);
        // compress the byte array
        byte[] compressedBuffer = compressionFunction(buffer);
        // return the compressed array as a Base64 string
        return Convert.ToBase64String(compressedBuffer);
    }
    public static byte[] Compress(this byte[] buffer)
    {
        byte[] compressedData;
        using (MemoryStream memoryStream = new MemoryStream())
        {
            // leaveOpen: true keeps memoryStream usable after the GZipStream is disposed (and flushed)
            using (GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
            {
                gZipStream.Write(buffer, 0, buffer.Length);
            }
            compressedData = memoryStream.ToArray();
        }
        // create the buffer that will hold the compressed bytes,
        // plus 4 extra bytes to hold the length of the original array
        byte[] gZipBuffer = new byte[compressedData.Length + 4];
        // copy the length of the original byte array into the header as 4 bytes (the size of an int)
        Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);
        // copy the compressed data after the header
        Buffer.BlockCopy(compressedData, 0, gZipBuffer, 4, compressedData.Length);
        return gZipBuffer;
    }
}
As you can see, this lets you compress a string or a byte array through an extension method, and it also lets you control how the text is encoded or compressed, in case you have a more optimal approach for your situation.
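To make the two hooks a bit more concrete, here is a minimal, self-contained sketch of swapping in a different compression step. The names CustomPipelineSketch, DeflateCompress and Pack are hypothetical and not part of the snippet above; Pack only mirrors the shape of the string overload (encode, compress, Base64), and raw Deflate without a length header is just one possible alternative, not a recommendation.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class CustomPipelineSketch
{
    // Hypothetical alternative compressor: raw Deflate, no 4-byte length header.
    public static byte[] DeflateCompress(byte[] data)
    {
        using (var output = new MemoryStream())
        {
            // Dispose the DeflateStream first so everything is flushed into 'output'.
            using (var deflate = new DeflateStream(output, CompressionMode.Compress, true))
            {
                deflate.Write(data, 0, data.Length);
            }
            return output.ToArray();
        }
    }

    // Same shape as the string overload in the post: encode, compress, Base64.
    public static string Pack(string text,
        Func<string, byte[]> encode = null, Func<byte[], byte[]> compress = null)
    {
        encode = encode ?? (s => Encoding.UTF8.GetBytes(s));
        compress = compress ?? (b => DeflateCompress(b));
        return Convert.ToBase64String(compress(encode(text)));
    }
}
```

With the extension method from the post, the equivalent call would presumably look like text.Compress(s => Encoding.Unicode.GetBytes(s), CustomPipelineSketch.DeflateCompress). Whatever you plug in here needs a matching counterpart on the decompression side.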
Decompression Snippet
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
public static class DebugExtensions
{
    private static readonly Func<byte[], string> DefaultDecoding = (b) => Encoding.UTF8.GetString(b);
    private static readonly Func<byte[], byte[]> DefaultDecompression = (b) => b.Decompress();
    public static string Decompress(this string instance,
        Func<byte[], string> decodingFunction = null, Func<byte[], byte[]> decompressionFunction = null)
    {
        // set the decoding: use the provided one, or the default if none was provided
        decodingFunction = decodingFunction ?? DefaultDecoding;
        // set the decompression algorithm: use the provided one, or the default if none was provided
        decompressionFunction = decompressionFunction ?? DefaultDecompression;
        // transform the Base64 string to a byte array
        byte[] gZipBuffer = Convert.FromBase64String(instance);
        // decompress the byte array
        byte[] buffer = decompressionFunction(gZipBuffer);
        // return the decompressed string
        return decodingFunction(buffer);
    }
    public static byte[] Decompress(this byte[] buffer)
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
            // read the length of the original (uncompressed) data from the 4-byte header
            int dataLength = BitConverter.ToInt32(buffer, 0);
            // write the remaining bytes (the compressed data) into the stream
            memoryStream.Write(buffer, 4, buffer.Length - 4);
            byte[] decompressedBuffer = new byte[dataLength];
            memoryStream.Position = 0;
            using (GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
            {
                // Stream.Read may return fewer bytes than requested, so loop until the buffer is full
                int totalRead = 0;
                while (totalRead < dataLength)
                {
                    int read = gZipStream.Read(decompressedBuffer, totalRead, dataLength - totalRead);
                    if (read == 0) break; // the stream ended before dataLength bytes were read
                    totalRead += read;
                }
            }
            return decompressedBuffer;
        }
    }
}
The same configuration options are available for decompression. One thing to note (if you don't want funky outcomes): encode/decode and compress/decompress in matching pairs. Using one encoding for compression and a different encoding for decompression will garble the text.
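A small sketch of why the pairing matters. The compression step is left out here on the assumption that it is lossless and hands back exactly the bytes it was given, so only the encode/decode pair decides whether the text survives. EncodingPairSketch and RoundTrip are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text;

public static class EncodingPairSketch
{
    // Encodes the text the way the compression side would,
    // then decodes the bytes the way the decompression side would.
    public static string RoundTrip(string text,
        Func<string, byte[]> encode, Func<byte[], string> decode)
    {
        byte[] bytes = encode(text);
        return decode(bytes);
    }

    public static void Demo()
    {
        // Matching pair: UTF-8 in, UTF-8 out. The text comes back unchanged.
        string ok = RoundTrip("héllo",
            s => Encoding.UTF8.GetBytes(s), b => Encoding.UTF8.GetString(b));
        Console.WriteLine(ok == "héllo"); // prints "True"

        // Mismatched pair: UTF-16 in, UTF-8 out. The text does not survive.
        string garbled = RoundTrip("héllo",
            s => Encoding.Unicode.GetBytes(s), b => Encoding.UTF8.GetString(b));
        Console.WriteLine(garbled == "héllo"); // prints "False"
    }
}
```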
The Full Code
using System;
using System.IO;
using System.IO.Compression;
using System.Text;
public static class DebugExtensions
{
    private static readonly Func<string, byte[]> DefaultEncoding = (s) => Encoding.UTF8.GetBytes(s);
    private static readonly Func<byte[], byte[]> DefaultCompression = (b) => b.Compress();
    private static readonly Func<byte[], string> DefaultDecoding = (b) => Encoding.UTF8.GetString(b);
    private static readonly Func<byte[], byte[]> DefaultDecompression = (b) => b.Decompress();
    public static string Compress(this string instance,
        Func<string, byte[]> encodingFunction = null, Func<byte[], byte[]> compressionFunction = null)
    {
        // set the encoding: use the provided one, or the default if none was provided
        encodingFunction = encodingFunction ?? DefaultEncoding;
        // set the compression algorithm: use the provided one, or the default if none was provided
        compressionFunction = compressionFunction ?? DefaultCompression;
        // encode the provided string into its byte array form
        byte[] buffer = encodingFunction(instance);
        // compress the byte array
        byte[] compressedBuffer = compressionFunction(buffer);
        // return the compressed array as a Base64 string
        return Convert.ToBase64String(compressedBuffer);
    }
    public static byte[] Compress(this byte[] buffer)
    {
        byte[] compressedData;
        using (MemoryStream memoryStream = new MemoryStream())
        {
            // leaveOpen: true keeps memoryStream usable after the GZipStream is disposed (and flushed)
            using (GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Compress, true))
            {
                gZipStream.Write(buffer, 0, buffer.Length);
            }
            compressedData = memoryStream.ToArray();
        }
        // create the buffer that will hold the compressed bytes,
        // plus 4 extra bytes to hold the length of the original array
        byte[] gZipBuffer = new byte[compressedData.Length + 4];
        // copy the length of the original byte array into the header as 4 bytes (the size of an int)
        Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gZipBuffer, 0, 4);
        // copy the compressed data after the header
        Buffer.BlockCopy(compressedData, 0, gZipBuffer, 4, compressedData.Length);
        return gZipBuffer;
    }
    public static string Decompress(this string instance,
        Func<byte[], string> decodingFunction = null, Func<byte[], byte[]> decompressionFunction = null)
    {
        // set the decoding: use the provided one, or the default if none was provided
        decodingFunction = decodingFunction ?? DefaultDecoding;
        // set the decompression algorithm: use the provided one, or the default if none was provided
        decompressionFunction = decompressionFunction ?? DefaultDecompression;
        // transform the Base64 string to a byte array
        byte[] gZipBuffer = Convert.FromBase64String(instance);
        // decompress the byte array
        byte[] buffer = decompressionFunction(gZipBuffer);
        // return the decompressed string
        return decodingFunction(buffer);
    }
    public static byte[] Decompress(this byte[] buffer)
    {
        using (MemoryStream memoryStream = new MemoryStream())
        {
            // read the length of the original (uncompressed) data from the 4-byte header
            int dataLength = BitConverter.ToInt32(buffer, 0);
            // write the remaining bytes (the compressed data) into the stream
            memoryStream.Write(buffer, 4, buffer.Length - 4);
            byte[] decompressedBuffer = new byte[dataLength];
            memoryStream.Position = 0;
            using (GZipStream gZipStream = new GZipStream(memoryStream, CompressionMode.Decompress))
            {
                // Stream.Read may return fewer bytes than requested, so loop until the buffer is full
                int totalRead = 0;
                while (totalRead < dataLength)
                {
                    int read = gZipStream.Read(decompressedBuffer, totalRead, dataLength - totalRead);
                    if (read == 0) break; // the stream ended before dataLength bytes were read
                    totalRead += read;
                }
            }
            return decompressedBuffer;
        }
    }
}
Conclusion
As you might notice, this brings a nice benefit in any situation where you would like to have something compressed. Off the top of my head, another time I needed this was on a previous legacy project in which we were sending whole gigantic queries via query strings (encrypted, of course); we ran out of space, and Internet Explorer was causing us grief.
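One caveat for the query string case: standard Base64 output can contain '+', '/' and '=', which have their own meaning inside URLs. A minimal sketch of converting to and from the URL-safe Base64 alphabet follows. UrlSafeBase64Sketch, ToUrlSafe and FromUrlSafe are hypothetical helper names, not part of the snippets above, and this is only one way to handle it (percent-encoding the value is another).

```csharp
using System;

public static class UrlSafeBase64Sketch
{
    // Drops the '=' padding and swaps the two characters that clash with URL syntax.
    public static string ToUrlSafe(string base64)
    {
        return base64.TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    // Restores the standard alphabet and re-adds padding so that
    // Convert.FromBase64String accepts the result.
    public static string FromUrlSafe(string urlSafe)
    {
        string base64 = urlSafe.Replace('-', '+').Replace('_', '/');
        int padding = (4 - base64.Length % 4) % 4;
        return base64 + new string('=', padding);
    }
}
```

For example, the bytes 0xFB 0xFF encode to "+/8=" in standard Base64, and ToUrlSafe turns that into "-_8".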
I hope you enjoyed this. Thank you, and happy coding.