RLE: The Human Friendly Compression
RLE (Run Length Encoding): The Human Friendly Compression
Background
Every time one wants to save space for data storage, compression is the answer. RLE is a very simple compression algorithm.
The xPrompt Download and prg Files
xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.
Usage:
- Copy prg file in the same directory as xPrompt
- Launch xPrompt
- Type "do FileName.prg" to run the code
xHarbour is freeware and can be downloaded at xHarbour.org[^].
xHarbour is part of the xBase family languages: xBase - Wikipedia[^].
Contents
- Introduction
- The RLE Algorithm
- Conway's Game of Life
- Sokoban: A Practical Case
- Points of Interest
- Links
- History
1. Introduction
RLE means 'Run Length Encoding', this compression method focuses on repeats (runs) of same value to achieve compression. This also means that it is efficient only on well suited datasets. As it is really simple, encoding and decoding can be done by hand.
2. The RLE Algorithm
RLE is a lossless data compression algorithm applied to text. The principle is to replace runs of same data by the count and the data value only once. The operation is interesting for runs of 3 or more repeats. Since we use digits to encode runs, when input include digits, a translation must be done to avoid conflict between data and counts.
RLE is interesting every time one gets large bitmaps with long runs of same data:
- Faxes: RLE is in world wide use with faxes, because a fax is a drawing made of dots that are only black or white and with large runs of same color.
- Games levels: Games with 2D map levels like Sokoban are good candidates because at any point in the game, only the current level is needed in uncompressed form.
- Large dataset: Conway's Game of Life is a good example of large datsets with only two values, and dominated by dead cells.
Example
In this example, bitmap is the input, on the right is the RLE encoding for each line. Below is the full RLE. And last is the size of input and the size of RLE. The output is not optimized.
Bitmap Letter A
Bitmap RLE
...................... 22.$
...................... 22.$
.......xxxxxxx........ 7.7x8.$
.....xxxxxxxxxxx...... 5.11x6.$
....xxxxxxxxxxxxx..... 4.13x5.$
....xxxx......xxxx.... 4.4x6.4x4.$
...xxxx........xxx.... 3.4x8.3x4.$
...xxx.........xxx.... 3.3x9.3x4.$
...............xxx.... 15.3x4.$
..............xxxx.... 14.4x4.$
......xxxxxxxxxxxx.... 6.12x4.$
....xxxxxxxxxxxxxx.... 4.14x4.$
...xxxxxxxxx...xxx.... 3.9x3.3x4.$
..xxxxx........xxx.... 2.5x8.3x4.$
..xxx..........xxx.... 2.3x10.3x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxxx......xxxxxx.... 2.4x6.6x4.$
...xxxxxxxxxxx.xxxxx.. 3.11x.5x2.$
....xxxxxxxxx...xxxx.. 4.9x3.4x2.$
.....xxxxxx.....xxxx.. 5.6x5.4x2.$
...................... 22.$
...................... 22.!
22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x4.$15.3x4.$14.4x
4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x4.$2.3x9.4x4.$2.3x9.4x4.$2.4x
6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!
Bitmap size= 506 RLE size= 204
RLE Translation table
Pattern | Encoding | Meaning |
. (dot) | . (dot) | White point |
x | x | Black point |
Digits | Number of repeats | |
$ | End of Line | |
! | End of Pattern |
Encoding
In RLE.zip, program file is RLE_Enc_fax.prg, input pattern is in file Fax.txt, result is in file Fax.log.
* RLE encoding for Simple Bitmap
clear screen
/* Data file structure
pattern Name on first line
Pattern
empty line to indicate the end of pattern
*/
mdl= {}
* Read input file
handle= fopen("Fax.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
tmp= left(tmp,len(tmp)-2) // remove carriage return
aadd(mdl, tmp)
tmp= freadln(handle, ,1024)
enddo
fclose(handle)
* process input file
set printer to Fax.log
set printer on
title= .T.
for scan=1 to len(mdl)
? mdl[scan] // output pattern name
rle= ""
bitmap= 0
scan++
do while len(mdl[scan]) != 0
? mdl[scan] // output pattern line
rle_line= ""
bitmap += len(mdl[scan])
last= mdl[scan,1]
cnt= 1
for scanc=2 to len(mdl[scan])
if last= mdl[scan, scanc]
cnt++
else
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
rle_line += last
last= mdl[scan, scanc]
cnt= 1
endif
next
* if last != "."
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
rle_line += last
* endif
scan++
if len(mdl[scan]) = 0
rle_line += "!"
else
rle_line += "$"
endif
?? " "
?? rle_line // output line encoding
rle += rle_line
enddo
? rle // output full RLE
? "Bitmap size= "+str(bitmap,,, .t.)
?? " RLE size= "+str(len(rle),,, .t.)
?
next
set printer off
set printer to
return
Decoding
In RLE.zip, program file is RLE_Dec_fax.prg.
* RLE decoding for simple Bitmap
clear screen
pattern= "22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x
4.$15.3x4.$14.4x4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x
4.$2.3x9.4x4.$2.3x9.4x4.$2.4x6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!"
* process RLE pattern
cnt=0
for scan=1 to len(pattern)
char= pattern[scan]
if isdigit(char)
cnt= cnt*10+ val(char)
else
if char= "$" .or. char= "!"
? // next line
else
if cnt= 0
cnt=1
endif
for rep= 1 to cnt
? char
next
endif
cnt= 0
endif
next
?
return
3. Conway's Game of Life
GoL patterns fit very well with RLE compression because datasets are large, a cell is either dead or alive, and dead cells dominate datasets.
Run Length Encoded - LifeWiki[^]
Hundred of RLE compressed patterns can be found at Category:Patterns - LifeWiki[^].
Example
2-engine Cordership
...................OO.................... 19b2O$
...................OOOO.................. 19b4O$
...................O.OO.................. 19bOb2O$
......................................... $
....................O.................... 20bO$
...................OO.................... 19b2O$
...................OOO................... 19b3O$
.....................O................... 21bO$
.................................OO...... 33b2O$
.................................OO...... 33b2O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
....................................O.... 36bO$
...................................OO.... 35b2O$
..................................O...O.. 34bO3bO$
...................................OO..O. 35b2O2bO$
........................................O 40bO$
.....................................O.O. 37bObO$
......................................O.. 38bO$
......................................O.. 38bO$
......................................OO. 38b2O$
......................................OO. 38b2O$
......................................... $
......................................... $
.............O..........O................ 13bO10bO$
............OOOOO.....O.OO...........O... 12b5O5bOb2O11bO$
...........O..........O...O.........O.... 11bO10bO3bO9bO$
............OO........OOO.O.........OO... 12b2O8b3ObO9b2O$
.............OO.........OO............O.. 13b2O9b2O12bO$
OO.............O.....................OOO. 2O13bO21b3O$
OO...................................OOO. 2O35b3O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
........OO............................... 8b2O$
........OO...........OO.................. 8b2O11b2O$
...................OO..O................. 19b2O2bO$
........................O...O............ 24bO3bO$
..................O.....O...O............ 18bO5bO3bO$
...................O..OO...O.O........... 19bO2b2O3bObO$
....................OOO.....O............ 20b3O5bO$
............................O............ 28bO!
19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO$35b2O$3
4bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5bOb2O1
1bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$$$$$
$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!
Bitmap size= 2009 RLE size= 292
Result without end of line optimization. Pay attention to the difference in RLE size.
19b2O20b$19b4O18b$19bOb2O18b$41b$20bO20b$19b2O20b$19b3O19b$21bO19b$33b2O6b$
33b2O6b$41b$41b$41b$41b$41b$41b$36bO4b$35b2O4b$34bO3bO2b$35b2O2bOb$40bO$37b
ObOb$38bO2b$38bO2b$38b2Ob$38b2Ob$41b$41b$13bO10bO16b$12b5O5bOb2O11bO3b$11bO
10bO3bO9bO4b$12b2O8b3ObO9b2O3b$13b2O9b2O12bO2b$2O13bO21b3Ob$2O35b3Ob$41b$41
b$41b$41b$41b$41b$8b2O31b$8b2O11b2O18b$19b2O2bO17b$24bO3bO12b$18bO5bO3bO12b
$19bO2b2O3bObO11b$20b3O5bO12b$28bO12b!
Bitmap size= 2009 RLE size= 413
RLE Translation Table
Pattern | Encoding | Meaning |
. (dot) | b | Dead cell |
O | o | Alive cell |
Digits | Number of repeats | |
$ | End of Line | |
! | End of Pattern |
Since playfield defaults to dead cells, dead cells ending a line can be removed.
Encoding
In RLE.zip, program file is RLE_Enc_gol.prg, input pattern is in file Gol.txt, result is in file Gol.log.
* RLE encoding for Conway's Game of Life
clear screen
/* Data file structure
GoL pattern Name on first line
Pattern
empty line to indicate the end of pattern
*/
mdl= {}
* Read input file
handle= fopen("gol.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
tmp= left(tmp,len(tmp)-2) // remove carriage return
aadd(mdl, tmp)
tmp= freadln(handle, ,1024)
enddo
fclose(handle)
* process input file
set printer to GoL.log
set printer on
title= .T.
for scan=1 to len(mdl)
? mdl[scan] // pattern name
rle= ""
bitmap= 0
scan++
do while len(mdl[scan]) != 0
? mdl[scan] // output pattern line
rle_line= ""
bitmap += len(mdl[scan])
last= mdl[scan,1]
cnt= 1
for scanc=2 to len(mdl[scan])
if last= mdl[scan, scanc]
cnt++
else
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
if last= "."
rle_line += "b" // translation
else
rle_line += last
endif
last= mdl[scan, scanc]
cnt= 1
endif
next
if last != "." // EOL optimization
if cnt> 1
rle_line += str(cnt,,, .t.)
endif
if last= "."
rle_line += "b" // translation
else
rle_line += last
endif
endif
scan++
if len(mdl[scan]) = 0
rle_line += "!"
else
rle_line += "$"
endif
?? " "
?? rle_line // output line encoding
rle += rle_line
enddo
? rle // output full RLE
? "Bitmap size= "+str(bitmap,,, .t.)
?? " RLE size= "+str(len(rle),,, .t.)
?
next
set printer off
set printer to
return
Decoding
In RLE.zip, program file is RLE_Dec_Gol.prg and Gol.cpp.
* RLE decoding for simple Bitmap
clear screen
pattern= "19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO
$35b2O$34bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5
bOb2O11bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$
$$$$$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!"
* process RLE pattern
cnt= 0
for scan=1 to len(pattern)
char= pattern[scan]
if isdigit(char)
cnt= cnt*10+ val(char)
else
if char= "$" .or. char= "!"
? // next line
else
if cnt= 0
cnt=1
endif
if char= "b"
char= "."
endif
for rep= 1 to cnt
?? char
next
endif
cnt= 0
endif
next
?
return
' Sample program for HP Prime calculator
RLEMap (px, py, RLE)
' Game of Life RLE decoding
BEGIN
LOCAL Row, Col, Ptr, Cnt, Cod;
Row:= py; Col:= px+2; Cnt:= 0;
FOR Ptr FROM 1 TO DIM(RLE) DO
Cod:= MID(RLE, Ptr, 1);
CASE
IF INSTRING("0123456789", Cod) <> 0 THEN Cnt:= Cnt*10+EXPR(Cod); END;
IF Cod == "$" THEN Row:= Row+MAX(1, Cnt); Col:= px+2; Cnt:= 0; END;
IF INSTRING("b.", Cod) <> 0 THEN Col:= Col+MAX(1, Cnt); Cnt:= 0; END;
IF INSTRING("oA", Cod) <> 0 THEN
FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO
Lignes(IP(Row/60)+1, Col):= BITOR(Lignes(IP(Row/60)+1, Col),
BITSL(#2,(Row MOD 60) ));
END;
Cnt:= 0;
END;
END;
END;
LDisp();
END;
// A RLE decoder for GoL in C.
void RLEMap(int px, int py, string RLE) {
unsigned int Row, Col, Ptr, Cnt;
char Cod;
Row = py + 1;
Col = px + 1 + 2;
Cnt = 0;
for (Ptr = 0; Ptr < RLE.length(); Ptr++) {
Cod = RLE[Ptr];
switch (Cod) {
case '0':
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
Cnt = Cnt * 10 + (Cod - '0');
break;
case '$':
if (Cnt == 0)
Cnt = 1;
Row = Row + Cnt;
Col = px + 1 + 2;
Cnt = 0;
break;
case 'b':
case '.':
if (Cnt == 0)
Cnt = 1;
Col = Col + Cnt;
Cnt = 0;
break;
case 'o':
case 'O':
if (Cnt == 0)
Cnt = 1;
for (; Cnt > 0; Cnt--) {
CellSet(Row, Col);
Col++;
}
Cnt = 0;
}
}
}
4. Sokoban: A Practical Case
Back in 2013-2014, I was involved in 'HP Prime Graphing Calculator' beta testing. A short time after the calculator came out, a member of a HP fans forum published a Sakoban game for the calculator.
In addition to add the possibility to use the keyboard (original version is touch screen only), I compressed all 40 levels using RLE.
The original source code SokobanBeta1.03.prime size was 112k (UTF16). My version SokobanBeta1.5.prime size was 52k (UTF16). Roughly half the size.
Example
Level 1-1 as matrix RLE
[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0] 2b4w4b3w$
,[0,0,1,3,0,1,1,1,0,0,1,4,1,0,0,0,0,0,0,0] 2bwdb3w2bwsw$
,[0,0,1,3,0,0,0,1,1,1,1,0,1,1,1,1,0,0,0,0] 2bwd3b4wb4w$
,[0,0,1,3,0,0,0,0,0,1,1,0,1,0,0,1,0,0,0,0] 2bwd5b2wbw2bw$
,[0,0,1,1,1,0,1,0,0,0,0,0,0,0,2,1,0,0,0,0] 2b3wbw7bcw$
,[0,0,0,0,1,0,0,0,1,0,1,1,1,1,0,1,1,0,0,0] 4bw3bwb4wb2w$
,[0,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0] 3b2wbwbw7bw$
,[0,0,0,1,0,2,1,0,1,0,1,0,2,0,0,0,1,0,0,0] 3bwbcwbwbwbc3bw$
,[0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0] 3bw10b3w$
,[0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0] 3b12w$
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0] $
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]] !
$$2b4w4b3w$2bwdb3w2bwsw$2bwd3b4wb4w$2bwd5b2wbw2bw$2b3wbw7bcw$4bw3bwb4wb2w$
3b2wbwbw7bw$3bwbcwbwbwbc3bw$3bw10b3w$3b12w$$$!
Matrix Size= 20* 15= 300 RLE Size= 120
Matrix Size in source code= 630
RLE Translation Table
Pattern | Encoding | Meaning |
0 | b | Background |
1 | w | Wall |
2 | c | Cube (Parcel) |
3 | d | Destination (Target) |
4 | s | Sokoban |
Digits | Number of repeats | |
$ | End of Line | |
! | End of Pattern |
Encoding
This function encodes a Sokoban level. Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.
RLEEnc (Mt) // RLE encoder
BEGIN
LOCAL Row, Col, Cod, Cnt, Tmp, Sz;
Tmp:= ""; Sz:=SIZE(Mt);
FOR Row FROM 1 TO Sz(1) DO
Cnt:= 1; Cod:= Mt(Row,1);
FOR Col FROM 2 TO Sz(2) DO
IF Cod == Mt(Row,Col) THEN
Cnt:= Cnt+1;
ELSE
IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
Tmp:= Tmp+ MID("bwcds",Cod+1,1);
Cnt:= 1; Cod:= Mt(Row,Col);
END;
END;
IF Cod <> 0 THEN
IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
Tmp:= Tmp+ MID("bwcds",Cod+1,1);
END;
Tmp:= Tmp+ "$";
END;
RETURN Tmp;
END;
As this is calculator programming, there is no easy way to do terminal output for encoding. I needed a complementary function to get around this problem.
Encoding function is internal to the program (it is invisible outside of the progam. So I created an exported function which is callable from user calculator interface.
The interest is that the result in user interface can be copy/pasted outside of the emulator and thus used as an easy way to paste the result in source code.
// Apply RLE compression to all levels
// remove drawscreen() before converting
EXPORT Sok_RLE()
BEGIN
LOCAL Lst:={};
FOR NAJ FROM 10 TO 49 DO
loadsubnivel();
IF TYPE(matriz) == 2 THEN
Lst:= CONCAT(Lst, {{NAJ, matriz}});
ELSE
Lst:= CONCAT(Lst, {{NAJ, RLEEnc (matriz)}});
END;
END;
RETURN Lst;
END;
Decoding
Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.
RLEDec (Rows, Cols, RLE) // RLE decoder
BEGIN
LOCAL Row, Col, Ptr, Cnt, Cod, Tmp;
Row:= 1; Col:= 1; Cnt:= 0;
Tmp:= MAKEMAT(0,Rows,Cols);
FOR Ptr FROM 1 TO DIM(RLE) DO
CASE
IF INSTRING("0123456789", MID(RLE, Ptr, 1)) <> 0
THEN Cnt:= Cnt*10+EXPR(MID(RLE, Ptr, 1)); END;
IF MID(RLE, Ptr, 1) == "$" THEN Row:= Row+MAX(1, Cnt); Col:= 1; Cnt:= 0; END;
DEFAULT
Cod:= INSTRING("bwcds", MID(RLE, Ptr, 1));
FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO Tmp(Row,Col):=Cod- 1; END; Cnt:= 0;
END;
END;
RETURN Tmp;
END;
5. Points of Interest
- An easy to handle data compression algorithm
6. Links
- Run-length encoding - Wikipedia[^]
- LifeWiki[^]
- Category:Patterns - LifeWiki[^]
- Sokoban - Wikipedia[^]
- Sok format - Sokoban Wiki[^]
7. History
- 21st March, 2021: First draft
- 6th September, 2021: First version
- 8th September, 2021: Second version, added Conway's Game of life
- 11th September, 2021: Third version, added Sokoban example
- 2nd January, 2022: Added reference .sok files
- 30th January, 2022: Fixed typo
- 16th January, 2022: Added a C decoder for Game of Life, download updated