Introduction
BMP file format is among the first ones that were designed for storing digital image bitmap raster (there is also PCX). BMP like PCX can be RLE compressed, which can be useful or not depending on information stored in raster row. If raster contains many horizontal repeats, RLE is useful, in other case, it's just a waste of time & space. See Wiki http://en.wikipedia.org/wiki/BMP_file_format for bitmap file format, or go directly to http://en.wikipedia.org/wiki/Run-length_encoding, if you are interested in RLE compression more.
Background
The code presented in this article is fully C++, and uses standard I/O operations. To reimplement it into full stream buffer based model, one must deal with the filesize to be written after compression into header, because header is before compressed stream, the size of which is not known before compression. The code deals with 4 and 8 bit bitmaps, because it was defined so, that 1,24 and 32 bit bitmaps cannot be compressed.
The Code
You need to include only rle_bmp_compress.cpp file.
class RLE_Win_Compress{
public:
bool RLE_Compress(char *input, char *output);
};
Call only one public
function that is here, pass input and output file names in char
s.
The compression is really simple and you should find everything needed to understand it from comments in the file(s).
How the Stuff Works
RLE in BMPs
- Take input pixel
- Determine run length (run of repeats)
- Write run length (repeats count)
- Write pixel
- If End of Scanline, write 0x0000 if not last scanline, else write 0x0001 to mark EOF
Perhaps, it would be a good idea to describe BMP file format here, as I practically verified.
- Compressed bitmaps contain no padding, but have to contain 0x0000 at the end of each scanline (EOL) and 0x0001 at the end of the file (EOF), where precedence takes EOF marker, so after the last scanline, there is only EOF marker, not EOL.
- Uncompressed bitmaps, instead, do contain padding bytes, to pad scanlines into nearest 4 bytes boundary. They, however, do not contain any markers EOF nor EOL
- Compressed bitmaps can be Absolute or encoded mode. (I use absolute mode such: BYTE count BYTE index.
- 4 bit bitmaps run lengths can be up to 127 and 8 bit up to 255. This is given by the fact, that 4 bit bitmaps are actually 2 indices per byte, but compressed as one byte, thus first 127x and second 127x makes 254 together.
8 Bit BMP Compression Implementation
We need to read headers. There are 2 of them. BITMAPFILEHEADER
and BITMAPINFOHEADER
. Read them in this order. Then check for BITMAPINFOHEADER
's values. They have to be set:
biSize = 0x28
biBitCount = 8 or 4
biCompression = 0
Check BITMAPFILEHEADER
for bfType = 0x4d42
.
Read bitmap palette:
PALETTEENTRY *paletteentries;
paletteentries = new PALETTEENTRY[ 256 ];
fread(paletteentries, sizeof(PALETTEENTRY), 256, f);
Next set:
- Line width
- Input width
- Padding
int LineWidth = bitmapinfoheader.biWidth;
int InputWidth = LineWidth;
int padding = (bitmapinfoheader.biWidth % 4);
But we are not done with padding yet. We need to set up padding again. When we have 7 pixels wide bitmap, 7 % 4 would return 3. 3+7 = 10. Thus we need to do:
if (padding != 0) padding = 4-padding;
where 4-3 = 1, finally gives 1+7 = 8, which is correct boundary alignment, because 8 is a multiple of 4.
Allocate space for input scanline and for 2D array, to store compressed scanlines lengths:
BYTE *Input = new BYTE[LineWidth];
int *LineLengths = new int[bitmapinfoheader.biHeight];
Allocate output buffer. This output buffer is 2D array in fact.
BYTE **Output = (BYTE **) malloc( bitmapinfoheader.biHeight * sizeof(BYTE *));
for (int i = 0; i < bitmapinfoheader.biHeight; i++){
Output[i] = (BYTE *) malloc(((bitmapinfoheader.biWidth*2)+8) * sizeof(BYTE *));
Output[i][bitmapinfoheader.biWidth+1] = 0; Output[i][bitmapinfoheader.biWidth+2] = 0; Output[i][bitmapinfoheader.biWidth+3] = 0; Output[i][bitmapinfoheader.biWidth+4] = 0;
Output[i][bitmapinfoheader.biWidth+5] = 0;
Output[i][bitmapinfoheader.biWidth+6] = 0;
Output[i][bitmapinfoheader.biWidth+7] = 0;
LineLengths[i] = 0;
}
Determine output filesize (for now without compressed stream):
int filesize = (sizeof(PALETTEENTRY)*256) + sizeof(bitmapfileheader) +
sizeof(bitmapinfoheader);
Not too much to say about the compression. Just read each scanline, beware paddings and fill
up our 2D compressed buffer. Write aside line lengths as well:
int headersize = filesize;
register int Y, X;
int j;
int cnt;
bool uncompressed;
BYTE pixel;
for (Y = 0; Y < bitmapinfoheader.biHeight; Y++){
fread(Input, bitmapinfoheader.biWidth, 1, f);
for (X = 0; X < padding; X++)
fgetc(f); uncompressed = true;
X = 0;
j = 0;
LineLengths[Y] = 0;
while(uncompressed) {
cnt = 0; pixel = Input[j]; while ( (j < InputWidth) && (pixel == Input[j]) && (cnt < 255)){
j++; cnt++; }
if (j == InputWidth)
uncompressed = false; Output[Y][X] = cnt; X++; Output[Y][X] = pixel; X++; LineLengths[Y] = LineLengths[Y] +2; filesize = filesize +2; }
Output[Y][X] = 0x00; if (Y == bitmapinfoheader.biHeight-1)
Output[Y][X+1] = 0x01; else
Output[Y][X+1] = 0x00; LineLengths[Y]+=2; filesize+=2; }
bitmapinfoheader.biCompression =1; bitmapfileheader.bfSize = filesize; bitmapinfoheader.biSizeImage = filesize - headersize; fwrite(&bitmapfileheader, sizeof(bitmapfileheader), 1, fout);
fwrite(&bitmapinfoheader, sizeof(bitmapinfoheader), 1, fout);
fwrite(paletteentries, sizeof(PALETTEENTRY), 256, fout);
for (Y = 0; Y < bitmapinfoheader.biHeight; Y++){
fwrite(Output[Y], LineLengths[Y], 1, fout); }
for (int i = 0; i < bitmapinfoheader.biHeight; i++)
free(Output[i]);
free(Output);
delete Input;
delete LineLengths;
delete paletteentries;
Points of Interest
Because this is not a very efficient compression (which RLE naturally is not), you may consider it useless. Over PNG and GIF, RLE compression is simple to implement making RLE bitmaps useful for long repeats, such as schemes, diagrams or text pictures.
Perhaps worth consideration, that although original concept of RLE and RLE in 8 bit bitmaps does not allow for longer than 1 byte sequences, 4 bit bitmaps can be of exception. Their pixel data is stored 2 pixels per byte, thus encoded as one byte in RLE.
Why All This?
Few reasons:
- I knew that I could do it
- It is relatively easy to understand and to implement
- The idea, that I code it, and other program can read it and display it
History
- 29.08.2010 - First publication, comments and class encapsulation