11,928,352 members (45,505 online)
alternative version

171.5K views
37 bookmarked
Posted

# LZW Compression

, 23 Apr 2004
 Rate this:
This will show the simple and useful way to implement a compression algorithm in MFC
<!-- Article image -->

## Introduction

I've looked around the internet for an implementation of dynamic LZW compression in MFC. Unfortunately, since I couldn't find one, I've decided to write one myself. This article will not explain the basics of the LZW algorithm. To find that all you need to do is to look for "LZW algorithm" in google, and you'll find a lot information.

But I will explain what is the dynamic and static compression in the LZW implementations, and some problems that someone might face during implementing LZW algorithm.

The files LZWCompression.h, LZWCompression.cpp, Dictionary.h, Dictionary.cpp are all you need to use, to insert the LZW into your project.

## Background

### what is static compression in LZW?

The static compression is when we use a static number of bits to make the compression. The total number of bits we have to our disposal is 32, which gives us 4,294,967,295 possible entries in our dictionary. In most cases we don't get that far, actually, most of the times we won't get passed the 16 bit barrier (which gives us 65,535 possible entries in the dictionary). So in order to speed up the processing time, we set the number of bits used for compression to a limit (most applications put it on 14). That way every information written to the destination buffer, will be in the same bit size.

So if someone wants to represent the letter 'A' in the 14 bit format it will look like this: 00000001000001. If the dictionary have the value of 245 in its dictionary it will look like so: 00000011110101. As you can see, we have some leading 0 in the number, which means that the file can be compressed even more.

### what is dynamic compression in LZW?

The dynamic compression changes the number of bits used to compress the data. It starts with 9 bits for each new value, and goes up until it reaches 32 or until the file ends.

The number of bits to use in the dynamic compression is set by the size of the dictionary you have. Since the dictionary begins with all the 256 (0-255) ASCII codes. Once the dictionary reached the point were the next entry will have to have another bit to represent her (256, 512, ...) we change the bit size of the compression from that point. This also explains why we start with 9 bit compression.

Since the dynamic compression have only one leading 0 in front of each number, you get a more compressed file then the static method. But you pay in extra calculation time. for each entry you put in the dictionary, you have to calculate the number of bits to use.

So if someone wants to represent the letter 'A' it will look like this: 001000001.

If the dictionary have the value of 245 in its dictionary it will look like so: 011110101.
As you can see, we have one leading 0 in the number, and the number of bits to use is still 9.

If the dictionary have the value of 276 in its dictionary it will look like so: 0100010100. And the limit will now be 512, which means we are now dealing with 10 bits.

### Compressing the file

Compressing the file is simply saving all the bits in a buffer, and then writing them into the destination file in bytes. To accomplish that we can't just write the data into the file (writing the value 245 into the file will cause the number to look like this: 0000000011110101, which we don't want).

So we have to push the buffer data to the left, in order to make room for the new data to be inserted.

The number of bits we push the data is changing according to the number of bits we need for the data. After pushing the data in the buffer we add to it the data we want it to have. After doing so, we write some information into the destination file, and continue with the application. The following function receive the destination file to write in, and the new data to combine:

```void CLZWCompression::CompressData(CFile &dest, long toSave)
{
DWORD writeData = 0;

m_SavedData |= (DWORD) toSave << (32 - m_MaxBits - m_TotalBits);

m_TotalBits += m_MaxBits;

while (m_TotalBits >= 8)
{
//	Get the byte we want to write
writeData = m_SavedData;
writeData >>= 24;
dest.Write(&writeData, 1);

//	remove the byte from the buffer
m_SavedData <<= 8;
//	Remove the byte from the counter
m_TotalBits -= 8;
}
}```

`m_MaxBits` holds the number of bits that represent the number held by `toSave`
`m_SavedData` is the buffer (DWORD) that holds the data to be written to the file, and `m_TotalBits` is the remaining bits in the buffer that represent a number.

Since we have to use bytes when we write in to files, and since the bit size we are using changes, we remain with bits in the buffer of a previous data. This is the main reason for the use of `m_TotalBits` to prevent from us to write information on the previous number information, and to make sure that all the data will reach the file correctly.

### Decompressing the file

Decompressing the file is to take all the data we had from the previous compression, and translate it into sequences of information that we can understand.

In the decompression we take the code we had in the file, and using the dictionary (which we build again while decompressing), we get the sequence of letters that construct this code.

The decode buffer is filled in the opposite way, meaning from the end byte that needs to be written, to the beginning byte of the sequence.

```void CDictionary::GetBytesFromCode(CByteArray *Buffer, DWORD code)
{
dicElement *tmpEl = NULL;

while (code > 255)
{
tmpEl = (dicElement*)GetAt(code - 256);
code = tmpEl->m_Prefix;
}
}```

The buffer then is written into the file when the letters are written from the last to end.

```while (counter > 0)
{
writeData = (BYTE)decodeString.GetAt(--counter);
destination.Write(&writeData, 1);
}

decodeString.RemoveAll();```

Unlike in the compression, we can't stop the decompression process when the file is over, since we still have information inside the buffer that needs to be decoded. So to end the file, in the compression we entered at the end of the file the maximum number possible by the current bit size.

```void CLZWCompression::CloseCompressedFile(CFile &source)
{
CompressData(source, m_MaxCode[m_MaxBits]);
CompressData(source, 0);
}```

We flash the end after the `m_MaxCode[m_MaxBits]` with '0' to make sure that the file will contain all the code. Since the '0' after it will never be used. This value is compared with the currently read data, and if exist means that the end of file arrived, and that we need to end the decompression process.

`while ((data = DecompressData(source)) != m_MaxCode[m_MaxBits])`

## Using the code

All you need to use this code, is just to insert the code into your project. The class creates its own dictionary, and clears it after the use of the compression or decompression process, so using one class to manage all your LZW compression in the project is easy.

Unfortunately, when using this class in Thread, you must make a new instance of the class for every thread you are running, since the class can have only one dictionary at a time.

## Code History

• 25 Apr 2004 - The code was added with `convertASCIIToText `function for showing the code actual character in decompression, as suggested by WREY. The previous log string was commented and a new one is in use.

A list of licenses authors might use can be found here

## Share

 Web Developer United States
No Biography provided

## You may also be interested in...

 First Prev Next
 LZW Compress and Decompress NDL220423-Feb-12 16:40 NDL2204 23-Feb-12 16:40
 Hi. Can you tell me, now I want to contact the author write the program in the illustration above, because I have a problem relating to compress and decompress files such as. Doc file. Xls , file. pdf and some other file. So who is the author of the program on the application for contacts, and help me solve my problem. Please thank the authors and thank everyone. I would send the pictures to show you, this is the error in image decompression process the data file. http://www.mediafire.com/i/?123ec0dcrwqt2hl http://www.mediafire.com/i/?1b3v05i9tdy3751 http://www.mediafire.com/i/?yjw2uhiuyqmj3xu Goodbye and see you. Hope your answer.
 What type of output mosselios4-Feb-12 21:43 mosselios 4-Feb-12 21:43
 LZW compression and decompression in MATLAB. albinbenny15-Sep-09 4:54 albinbenny 15-Sep-09 4:54
 Can it compress a short string into a shorter string? chocm27-Dec-08 2:55 chocm 27-Dec-08 2:55
 Re: Can it compress a short string into a shorter string? apex_tyf21-Apr-11 5:38 apex_tyf 21-Apr-11 5:38
 LZW compression and decompression manekantan9-Dec-08 22:26 manekantan 9-Dec-08 22:26
 help me please sam s. samaan21-Oct-08 19:38 sam s. samaan 21-Oct-08 19:38
 Re: help me please bill_wang15-Mar-09 16:48 bill_wang 15-Mar-09 16:48
 CStringArray variable to CString variable ciiiek9-Jan-08 18:00 ciiiek 9-Jan-08 18:00
 LZW compress file caeBAIK20-Mar-06 4:19 caeBAIK 20-Mar-06 4:19
 Re: LZW compress file Hatem Mostafa5-Jan-07 11:27 Hatem Mostafa 5-Jan-07 11:27
 Small Bug Danoraky7-Mar-06 9:02 Danoraky 7-Mar-06 9:02
 do it with lzw-compressed pdf streams privet25-Nov-05 9:49 privet 25-Nov-05 9:49
 Race Condition? rocksoft16-May-05 6:28 rocksoft 16-May-05 6:28
 how to zip a folder nkshiv17-Mar-05 23:14 nkshiv 17-Mar-05 23:14
 Re: how to zip a folder The Prince of Chaos17-Mar-05 23:23 The Prince of Chaos 17-Mar-05 23:23
 Re: how to zip a folder Anonymous18-Mar-05 20:12 Anonymous 18-Mar-05 20:12
 Re: how to zip a folder nicole_y21-Jul-05 2:08 nicole_y 21-Jul-05 2:08
 God, but very slow leandrobecker28-Dec-04 12:41 leandrobecker 28-Dec-04 12:41
 what would you recomend... Kith_Kahnan19-Nov-04 13:13 Kith_Kahnan 19-Nov-04 13:13
 got some questions raars29-Sep-04 23:19 raars 29-Sep-04 23:19
 Re: got some questions The Prince of Chaos2-Oct-04 9:56 The Prince of Chaos 2-Oct-04 9:56
 How to Compress Image using LZW Compression. pubba20-Jul-04 1:12 pubba 20-Jul-04 1:12
 Re: How to Compress Image using LZW Compression. oseman23-May-05 9:19 oseman 23-May-05 9:19
 Re: How to Compress Image using LZW Compression. pubududilena24-May-05 18:49 pubududilena 24-May-05 18:49
 image article???? moeen ali5-Jun-04 11:31 moeen ali 5-Jun-04 11:31
 Re: image article???? The Prince of Chaos5-Jun-04 18:47 The Prince of Chaos 5-Jun-04 18:47
 Suggestion. WREY25-Apr-04 14:13 WREY 25-Apr-04 14:13
 Re: Suggestion. The Prince of Chaos25-Apr-04 20:23 The Prince of Chaos 25-Apr-04 20:23
 Re: Suggestion. The Prince of Chaos25-Apr-04 20:45 The Prince of Chaos 25-Apr-04 20:45
 Article Image? A. Riazi24-Apr-04 17:55 A. Riazi 24-Apr-04 17:55
 Re: Article Image? John M. Drescher24-Apr-04 19:25 John M. Drescher 24-Apr-04 19:25
 Last Visit: 31-Dec-99 19:00     Last Update: 28-Nov-15 6:20 Refresh 1