Click here to Skip to main content
Click here to Skip to main content

Better Than Zip Algorithm For Compressing In-Memory Data

, 23 May 2006
Rate this:
Please Sign up or sign in to vote.
An article on the in-memory data compression engine for .NET

Introduction

Sometimes you may need to compress data in memory without creating disk files. For example, it is possible to compress binary data in a CDATA section of an XML file. Also, you may want to compress data transmitted between the client and server machines over a low-bandwidth connection.

The proposed classes implement the original compression algorithm. This method is based on the "DEFLATE Compressed Data Format Specification" (RFC 1951) with some enhancements. It works slightly better than the generally used zip compression. The algorithm is especially effective on Centrino and other platforms with 2 MB L2 cache. The above picture shows results of the mscorlib.xml file compression/decompression on a machine with Pentium M processor (1.8 GHz). Running the same test on a machine with AMD Athlon 2100+ processor (with 256 KB L2 cache) gives 721 ms as the compression time and 40 ms as the decompression time. It is interesting that Intel and AMD platforms have such an evident difference in performance for various operations (here 110 ms on the Pentium M is quite imperfect data, however, because of the big inaccuracy).

You can compare this compression engine with the standard DeflateStream compressor added in .NET 2.0. To do so, please open the source code for the demo project (Form1.cs) and comment/uncomment several code blocks. They are marked in the source code. Then, try the demo project with various data files and compare results with the MemCompressor engine. You will see, the standard DeflateStream engine is rather poor.

Using MemCompressor

The MemCompressor assembly includes two main classes: AcedDeflator and AcedInflator. The AcedDeflator class has the Compress method. This method accepts a byte array and returns new byte array with the same data in the compressed form. For example:

private int _originalSize;
private byte[] _originalData;

private int _originalChecksum;
private byte[] _compData;
private int _compSize;

private void CompressData()
{
    _originalChecksum = AcedUtils.Adler32(_originalData, 0, _originalSize);
    _compData = AcedDeflator.Instance.Compress(_originalData, 0, _originalSize,
        AcedCompressionLevel.Normal, 0, 0);
    _compSize = _compData.Length;
}

The AcedInflator class works quite the contrary. It has the Decompress method which accepts a binary array with compressed data. This method restores the original data in the output array.

private byte[] _decompData;
private int _decompSize;
private int _decompChecksum;

private void DecompressData()
{
    _decompData = AcedInflator.Instance.Decompress(_compData, 0, 0, 0);
    _decompSize = _decompData.Length;
    _decompChecksum = AcedUtils.Adler32(_decompData, 0, _decompSize);
    Debug.Assert(_decompChecksum == _originalChecksum);
}

The AcedDeflator.Compress method accepts, among other, the compressionLevel parameter. This parameter chooses balance between the speed and the compression ratio. Here you may pass one of the following values:

  • Store (simple copying data to the output array without compression)
  • Fastest (poor compression ratio but the best speed)
  • Fast (optimal for less-redundant data)
  • Normal (better choice for higher-redundant data, such as texts)
  • Maximum (slow but has the best compression ratio)

AcedDeflator Class

public class AcedDeflator

Provides methods for compressing binary data with a combination of the LZ77 algorithm and Huffman coding similar to the method described in RFC 1951. You should synchronize calls to the methods of this class in a multithreaded application.

public AcedDeflator()

Creates a new instance of the AcedDeflator class. It is not recommended to call this constructor directly. When possible, please use a single cached instance of the AcedDeflator class which is returned by the Instance static property.

public static AcedDeflator Instance { get; }

Returns a cached instance of the AcedDeflator class. Such an instance occupies a lot of memory (a little more than 2 MB). So, a single instance is created and cached in an internal static field of this class. This instance can be used each time when you want to compress some data. The only exception may be a multithreaded application where you compress data in several threads simultaneously. If so, you should create each instance with the regular constructor.

public byte[] Compress(byte[] sourceBytes, int sourceIndex, int sourceLength,
    AcedCompressionLevel compressionLevel, int beforeGap, int afterGap)

Compresses binary data passed in the sourceBytes array starting from sourceIndex with sourceLength bytes in length. The compression ratio depends on the compressionLevel parameter. This method creates a new byte array which is enough to store the compressed data. You may also reserve beforeGap bytes before the compressed data block and afterGap bytes after the compressed data block. This may be useful if you want to supply a custom header and/or tail to the compressed data and you don't want to reallocate memory unnecessarily. The result byte array consists of (beforeGap + Compressed_Data_Length + afterGap) bytes.

public int Compress(byte[] sourceBytes, int sourceIndex, int sourceLength,
    AcedCompressionLevel compressionLevel, byte[] destinationBytes,
    int destinationIndex)

This method is similar to the previous one but it doesn't allocate the memory for the output array. The output array is passed as the destinationBytes argument. The compressed data will be stored in that array starting from the destinationIndex. The maximum length of the compressed data is (sourceLength + 4) bytes. The output array must have enough space. This method returns the number of bytes stored in the output array. If you pass null in the destinationBytes argument, the method performs "dummy" compression. In such a way, you can find out the length of the output data. However, the "dummy" compression takes almost the same time as the regular compression.

public static void Release()

Call this method to release an internal static reference to the cached instance of the AcedDeflator class. After that, you may call GC.Collect() to free memory which was occupied by that cached instance.

AcedInflator Class

public class AcedInflator

Provides methods for decompressing binary data which was compressed using the AcedDeflator class. You should synchronize calls to the methods of this class in a multithreaded application.

public AcedInflator()

Creates a new instance of the AcedInflator class. It is not recommended to call this constructor directly. When possible, to avoid unnecessary memory reallocation, please use a single cached instance of the AcedInflator class which is returned by the Instance static property.

public static AcedInflator Instance { get; }

Returns a cached instance of the AcedInflator class. Such an instance is created when you access this property for the first time. Then, the instance is cached in an internal static field of this class. You can use it to decompress any data in a single-threaded application. In a multithreaded application, please create an AcedInflator for each the thread with the regular constructor.

public static int GetDecompressedLength(byte[] sourceBytes, int sourceIndex)

Returns the minimum length, in bytes, of the binary array where you can store the decompressed data for the specified compressed binary array (sourceBytes). The sourceIndex argument specifies the start index in the sourceBytes array from which the compressed data begins.

public byte[] Decompress(byte[] sourceBytes, int sourceIndex,
    int beforeGap, int afterGap)

Decompresses binary data passed in the sourceBytes array starting from sourceIndex. This method creates a new byte array which is enough to store the decompressed data. You may also reserve beforeGap bytes before the decompressed data block and afterGap bytes after the decompressed data block. This may be useful if you want to supply a custom header and/or tail to decompressed data and you don't want to reallocate memory unnecessarily. The result array will consist of (beforeGap + Decompressed_Data_Length + afterGap) bytes.

public int Decompress(byte[] sourceBytes, int sourceIndex,
    byte[] destinationBytes, int destinationIndex)

The same as the previous method but the memory for the output array must be allocated before calling this method. You should pass the destination array in the destinationBytes argument. The destinationIndex specifies offset in that array from which the decompressed data will be stored. The output array must have enough space to store all the data. To obtain the decompressed data size, please call the GetDecompressedLength method of this class. The Decompress method returns the number of bytes stored in the output (destinationBytes) array. If you pass null in the destinationBytes argument, this method does nothing, just returns the decompressed data length, in bytes.

public static void Release()

Call this method to release an internal static reference to the cached instance of the AcedInflator class. After that, you may call GC.Collect() to free memory which was occupied by that cached instance.

AcedUtils Class

public class AcedUtils

This class has one public static method to compute the Adler-32 checksum.

public static int Adler32(byte[] bytes, int offset, int length)

Computes the Adler-32 checksum in accordance with RFC 1950 for length bytes of the bytes array starting from the offset index.

Conclusion

Just have these simple classes in mind when you will consider adding some compression capabilities to your application.

History

  • 23rd May, 2006: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Andrey Dryazgov
Software Developer
Russian Federation Russian Federation
No Biography provided

Comments and Discussions

 
QuestionCan I use this Compression in Client and Decompression in WCF service. PinmemberManjunathabe0111-Feb-12 21:06 
First I would like to Thanks for this great article.
 
Can I implement this compression technique in Client and WCF service?. I means Compression in Client side and passing this compressed value to Service in service decompress the data and passing to BAL.
AnswerRe: Can I use this Compression in Client and Decompression in WCF service. PinmemberAndrey Dryazgov1-Feb-12 21:44 
GeneralRe: Can I use this Compression in Client and Decompression in WCF service. PinmemberManjunathabe01112-Feb-12 21:47 
GeneralRe: Can I use this Compression in Client and Decompression in WCF service. PinmemberAndrey Dryazgov12-Feb-12 22:03 
GeneralRe: Can I use this Compression in Client and Decompression in WCF service. PinmemberManjunathabe01112-Feb-12 22:38 
GeneralMy vote of 5 PinmemberDouglas Smallish22-Dec-11 13:21 
QuestionC++ porting Pinmemberbibopp24-Aug-11 21:33 
AnswerRe: C++ porting PinmemberAndrey Dryazgov24-Aug-11 23:00 
QuestionIs the code safe to use in a production environment ? PinmemberMoshe Fishman11-Nov-10 5:23 
AnswerRe: Is the code safe to use in a production environment ? PinmemberAndrey Dryazgov11-Nov-10 5:59 
GeneralRe: Is the code safe to use in a production environment ? PinmemberMoshe Fishman13-Nov-10 21:19 
GeneralRe: Is the code safe to use in a production environment ? PinmemberAndrey Dryazgov14-Nov-10 11:59 
Generalbig problem Pinmembermaorosh27-Oct-10 13:00 
GeneralRe: big problem PinmemberAndrey Dryazgov27-Oct-10 13:22 
Generaldecompress at silverlight Pinmemberreymond10-Aug-10 23:27 
GeneralRe: decompress at silverlight PinmemberAndrey Dryazgov11-Aug-10 1:35 
Generalexception when decompressing large in memory data 4096 bytes PinmemberShalamander19-Jun-10 11:29 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberAndrey Dryazgov19-Jun-10 16:50 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberShalamander20-Jun-10 16:12 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberShalamander20-Jun-10 16:14 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberShalamander20-Jun-10 18:45 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberAndrey Dryazgov20-Jun-10 18:42 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberShalamander20-Jun-10 18:49 
GeneralRe: exception when decompressing large in memory data 4096 bytes PinmemberAndrey Dryazgov20-Jun-10 19:58 
Generalwould be interesting to make this a drop-in replacement to DeflateStream. PinmemberCheeso22-Mar-09 10:37 
GeneralRe: would be interesting to make this a drop-in replacement to DeflateStream. PinmemberTiago Freitas Leal9-Dec-09 15:40 
GeneralLicense PinmemberCaetano88820-Nov-07 14:28 
GeneralRe: License PinmemberAndrey Dryazgov20-Nov-07 23:02 
GeneralRe: License PinmemberCaetano88821-Nov-07 3:44 
GeneralRe: License PinmemberJaime Olivares15-Jul-08 4:53 
GeneralRe: License PinmemberAndrey Dryazgov16-Jul-08 4:31 
QuestionSystem.OutOfMemoryException Pinmemberhemkumarsym5-Jun-07 3:41 
AnswerRe: System.OutOfMemoryException PinmemberAndrey Dryazgov5-Jun-07 3:59 
GeneralRe: System.OutOfMemoryException Pinmemberhemkumarsym5-Jun-07 4:52 
GeneralRe: System.OutOfMemoryException PinmemberAndrey Dryazgov5-Jun-07 5:11 
GeneralRe: System.OutOfMemoryException Pinmemberhemkumarsym5-Jun-07 19:55 
General64bit processor compatibility Pinmemberdfirka23-May-07 6:23 
GeneralRe: 64bit processor compatibility PinmemberAndrey Dryazgov23-May-07 6:34 
GeneralRe: 64bit processor compatibility PinmemberRi Qen-Sin20-Nov-07 16:05 
GeneralAn exception? [modified] Pinmemberschalling23-Mar-07 1:50 
GeneralRe: An exception? PinmemberAndrey Dryazgov23-Mar-07 2:31 
GeneralRe: An exception? PinmemberAndrey Dryazgov26-Mar-07 4:24 
GeneralRe: An exception? PinmemberJungoMungo20-Oct-07 10:07 
GeneralRe: An exception? PinmemberJungoMungo20-Oct-07 11:43 
GeneralRe: An exception? PinmemberAndrey Dryazgov20-Nov-07 23:07 
QuestionPossible to compress Stream? PinmemberrEgEn2k23-Dec-06 22:39 
AnswerRe: Possible to compress Stream? PinmemberAndrey Dryazgov24-Dec-06 4:13 
GeneralRe: Possible to compress Stream? PinmemberrEgEn2k24-Dec-06 10:13 
GeneralRe: Possible to compress Stream? PinmemberAndrey Dryazgov24-Dec-06 12:50 
GeneralRe: Possible to compress Stream? PinmemberrEgEn2k24-Dec-06 22:51 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.

| Advertise | Privacy | Mobile
Web04 | 2.8.140902.1 | Last Updated 24 May 2006
Article Copyright 2006 by Andrey Dryazgov
Everything else Copyright © CodeProject, 1999-2014
Terms of Service
Layout: fixed | fluid