Article

# LZW Compression

By , 23 Apr 2004
 Rate this:
<!-- Article image -->

## Introduction

I've looked around the internet for an implementation of dynamic LZW compression in MFC. Unfortunately, since I couldn't find one, I've decided to write one myself. This article will not explain the basics of the LZW algorithm. To find that all you need to do is to look for "LZW algorithm" in google, and you'll find a lot information.

But I will explain what is the dynamic and static compression in the LZW implementations, and some problems that someone might face during implementing LZW algorithm.

The files LZWCompression.h, LZWCompression.cpp, Dictionary.h, Dictionary.cpp are all you need to use, to insert the LZW into your project.

## Background

### what is static compression in LZW?

The static compression is when we use a static number of bits to make the compression. The total number of bits we have to our disposal is 32, which gives us 4,294,967,295 possible entries in our dictionary. In most cases we don't get that far, actually, most of the times we won't get passed the 16 bit barrier (which gives us 65,535 possible entries in the dictionary). So in order to speed up the processing time, we set the number of bits used for compression to a limit (most applications put it on 14). That way every information written to the destination buffer, will be in the same bit size.

So if someone wants to represent the letter 'A' in the 14 bit format it will look like this: 00000001000001. If the dictionary have the value of 245 in its dictionary it will look like so: 00000011110101. As you can see, we have some leading 0 in the number, which means that the file can be compressed even more.

### what is dynamic compression in LZW?

The dynamic compression changes the number of bits used to compress the data. It starts with 9 bits for each new value, and goes up until it reaches 32 or until the file ends.

The number of bits to use in the dynamic compression is set by the size of the dictionary you have. Since the dictionary begins with all the 256 (0-255) ASCII codes. Once the dictionary reached the point were the next entry will have to have another bit to represent her (256, 512, ...) we change the bit size of the compression from that point. This also explains why we start with 9 bit compression.

Since the dynamic compression have only one leading 0 in front of each number, you get a more compressed file then the static method. But you pay in extra calculation time. for each entry you put in the dictionary, you have to calculate the number of bits to use.

So if someone wants to represent the letter 'A' it will look like this: 001000001.

If the dictionary have the value of 245 in its dictionary it will look like so: 011110101.
As you can see, we have one leading 0 in the number, and the number of bits to use is still 9.

If the dictionary have the value of 276 in its dictionary it will look like so: 0100010100. And the limit will now be 512, which means we are now dealing with 10 bits.

### Compressing the file

Compressing the file is simply saving all the bits in a buffer, and then writing them into the destination file in bytes. To accomplish that we can't just write the data into the file (writing the value 245 into the file will cause the number to look like this: 0000000011110101, which we don't want).

So we have to push the buffer data to the left, in order to make room for the new data to be inserted.

The number of bits we push the data is changing according to the number of bits we need for the data. After pushing the data in the buffer we add to it the data we want it to have. After doing so, we write some information into the destination file, and continue with the application. The following function receive the destination file to write in, and the new data to combine:

```void CLZWCompression::CompressData(CFile &dest, long toSave)
{
DWORD writeData = 0;

m_SavedData |= (DWORD) toSave << (32 - m_MaxBits - m_TotalBits);

m_TotalBits += m_MaxBits;

while (m_TotalBits >= 8)
{
//	Get the byte we want to write
writeData = m_SavedData;
writeData >>= 24;
dest.Write(&writeData, 1);

//	remove the byte from the buffer
m_SavedData <<= 8;
//	Remove the byte from the counter
m_TotalBits -= 8;
}
}
```

`m_MaxBits` holds the number of bits that represent the number held by `toSave`
`m_SavedData` is the buffer (DWORD) that holds the data to be written to the file, and `m_TotalBits` is the remaining bits in the buffer that represent a number.

Since we have to use bytes when we write in to files, and since the bit size we are using changes, we remain with bits in the buffer of a previous data. This is the main reason for the use of `m_TotalBits` to prevent from us to write information on the previous number information, and to make sure that all the data will reach the file correctly.

### Decompressing the file

Decompressing the file is to take all the data we had from the previous compression, and translate it into sequences of information that we can understand.

In the decompression we take the code we had in the file, and using the dictionary (which we build again while decompressing), we get the sequence of letters that construct this code.

The decode buffer is filled in the opposite way, meaning from the end byte that needs to be written, to the beginning byte of the sequence.

```void CDictionary::GetBytesFromCode(CByteArray *Buffer, DWORD code)
{
dicElement *tmpEl = NULL;

while (code > 255)
{
tmpEl = (dicElement*)GetAt(code - 256);
code = tmpEl->m_Prefix;
}
}
```

The buffer then is written into the file when the letters are written from the last to end.

```while (counter > 0)
{
writeData = (BYTE)decodeString.GetAt(--counter);
destination.Write(&writeData, 1);
}

decodeString.RemoveAll();
```

Unlike in the compression, we can't stop the decompression process when the file is over, since we still have information inside the buffer that needs to be decoded. So to end the file, in the compression we entered at the end of the file the maximum number possible by the current bit size.

```void CLZWCompression::CloseCompressedFile(CFile &source)
{
CompressData(source, m_MaxCode[m_MaxBits]);
CompressData(source, 0);
}
```

We flash the end after the `m_MaxCode[m_MaxBits]` with '0' to make sure that the file will contain all the code. Since the '0' after it will never be used. This value is compared with the currently read data, and if exist means that the end of file arrived, and that we need to end the decompression process.

```while ((data = DecompressData(source)) != m_MaxCode[m_MaxBits])
```

## Using the code

All you need to use this code, is just to insert the code into your project. The class creates its own dictionary, and clears it after the use of the compression or decompression process, so using one class to manage all your LZW compression in the project is easy.

Unfortunately, when using this class in Thread, you must make a new instance of the class for every thread you are running, since the class can have only one dictionary at a time.

## Code History

• 25 Apr 2004 - The code was added with `convertASCIIToText `function for showing the code actual character in decompression, as suggested by WREY. The previous log string was commented and a new one is in use.

A list of licenses authors might use can be found here

Web Developer
United States
No Biography provided

 First PrevNext
 LZW Compress and Decompress NDL2204 23-Feb-12 15:40
 What type of output mosselios 4-Feb-12 20:43
 I want to ask about the output...what type of output you use? i use this text as input : "Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet. Lorem Ipsum Dolor Sit Amet." in "text.txt" files and compress it, then it create "text.lzw" self that contain this output (if i opened it with microsoft wordpad): txt&ÎFSh€’p9`„Cy² )šN‚     ´Êt (\$†C¢(¤Z1Á`ð‘.r‡ÄbqX¼f(ŽÊå²Œ’i'ŽJ£òé„Žg&›P#ÒÉ¾E2’ÍcršTê›<£Tfô]=£Ô§)Ý¡?©Îi”J|ú‘h±Õ¬¶Ûn«k¯ÿÀ i try to make lzw but the output still in character level (8 bit /char), with the same input my program generate this output : 25:E:11:4:C:3E:22:F:12:14:63:1D:E:B:60:3E:2C:8:13:3E:1A:C:4:13:4C:3E:5F:61:63:65:67:69:6B:6D:6F:71:73:75:77:79:62:64:66:68:3E:6A:6C:11:6E:70:72:74:76:78:60:87:7C:8A:8C:80:90:83:93:86:7B:89:7E:8D:8F:82:92:85:95:9F:7D:8B:7F:8E:81:91:84:94:7A:88:A9:99:AC:9B:A5:B0:96:A0:AA:A2:AD:9C:A6:B1:97:A1:9A:A4:AF:9E:B2:98:AB:A3:AE:9D:A7:C7:C2:B5:C4:CC:C0:BA:B4:CA:BE The idea is every hex code in this output is hex representation from decimal number of dictionary indeks, i'm stuck in this problem, can you give me some advice, what type of output that should i use? then i will search, browse, and learn how to change my lzw output to type of output from your advice... thanks Best regardsmosselios evanInstitut Teknologi Bandung, Indonesia
 LZW compression and decompression in MATLAB. albinbenny 15-Sep-09 3:54
 Can it compress a short string into a shorter string? chocm 27-Dec-08 1:55
 LZW compression and decompression manekantan 9-Dec-08 21:26
 help me please sam s. samaan 21-Oct-08 18:38
 Re: help me please bill_wang 15-Mar-09 15:48
 CStringArray variable to CString variable ciiiek 9-Jan-08 17:00
 LZW compress file caeBAIK 20-Mar-06 3:19
 Re: LZW compress file Hatem Mostafa 5-Jan-07 10:27
 Small Bug Danoraky 7-Mar-06 8:02
 do it with lzw-compressed pdf streams privet 25-Nov-05 8:49
 Race Condition? rocksoft 16-May-05 5:28
 how to zip a folder nkshiv 17-Mar-05 22:14
 Re: how to zip a folder The Prince of Chaos 17-Mar-05 22:23
 Re: how to zip a folder Anonymous 18-Mar-05 19:12
 Re: how to zip a folder nicole_y 21-Jul-05 1:08
 God, but very slow leandrobecker 28-Dec-04 11:41
 what would you recomend... Kith_Kahnan 19-Nov-04 12:13
 got some questions raars 29-Sep-04 22:19
 Re: got some questions The Prince of Chaos 2-Oct-04 8:56
 How to Compress Image using LZW Compression. pubba 20-Jul-04 0:12
 Re: How to Compress Image using LZW Compression. oseman 23-May-05 8:19
 Re: How to Compress Image using LZW Compression. pubududilena 24-May-05 17:49
 Last Visit: 31-Dec-99 18:00     Last Update: 19-Apr-14 7:09 Refresh 12 Next »