Click here to Skip to main content
15,036,145 members
Articles / General Programming / Compression
Article
Posted 7 Sep 2021

Stats

6.6K views
129 downloads
6 bookmarked

RLE: The Human Friendly Compression

Rate me:
Please Sign up or sign in to vote.
4.74/5 (5 votes)
10 Sep 2021CPOL4 min read
RLE (Run Length Encoding): The Human Friendly Compression
In this article, we will take a look at RLE which is a lossless data compression algorithm.

Download

Background

Every time one want to save space for data storage, compression is the answer. RLE is a very simple compression algorithm.

The xPrompt Download and prg files

xPrompt.zip contains the binary of the xHarbour Scripting Engine for Windows. This is the app needed to run the prg code.

Usage :
- Copy prg file in same directory than xPrompt
- Launch xPrompt
- Type "do FileName.prg" to run the code

xHarbour is freeware and can be dowloaded at xHarbour.org[^]

xHarbour is part of the xBase family languages: xBase - Wikipedia[^]

Contents

  1. Introduction
  2. The RLE Algorithm
    1. Example
    2. RLE Translation table
    3. Encoding
    4. Decoding
  3. Conway's Game of Life
    1. Example
    2. RLE Translation table
    3. Encoding
    4. Decoding
  4. Sokoban: a practical case
    1. Example
    2. RLE Translation table
    3. Encoding
    4. Decoding
  5. Points of Interest
  6. Links
  7. History

1. Introduction

RLE means 'Run Lentgh Encoding', this compression method focusses on repeats (runs) of same value to achive compression. This also means that it is efficient only on well suited datasets. As it is really simple, encoding and decoding can be done by hand.

2. The RLE Algorithm

RLE is a lossless data compression algorithm applied to text. The principle is to replace runs of same data by the count and the data value only once. The operation is interesting for runs of 3 or more repeats. Since we use digits to encode runs, when input include digits, a translation must be done to avoid conflict between data and counts.

RLE is interesting everytime one get large bitmaps with long runs of same data:

  • Faxes : RLE is in world wide use with faxes, because a fax is a drawing made of dots that are only black or white and with large runs of same color.
  • Games levels : Games with 2D map levels like Sokoban are good candidats because at any point in game, only current level is needed in uncompressed form.
  • Large dataset : Conway's Game of Life is a good example of large datsets with only 2 values, and dominated by dead cells.

Example

In this example, bitmap is the input, on right is the RLE encoding for each line. Below is the full RLE. And last is the size of input and the size of RLE. The output is not optimized.

Bitmap Letter A
Bitmap                 RLE
...................... 22.$
...................... 22.$
.......xxxxxxx........ 7.7x8.$
.....xxxxxxxxxxx...... 5.11x6.$
....xxxxxxxxxxxxx..... 4.13x5.$
....xxxx......xxxx.... 4.4x6.4x4.$
...xxxx........xxx.... 3.4x8.3x4.$
...xxx.........xxx.... 3.3x9.3x4.$
...............xxx.... 15.3x4.$
..............xxxx.... 14.4x4.$
......xxxxxxxxxxxx.... 6.12x4.$
....xxxxxxxxxxxxxx.... 4.14x4.$
...xxxxxxxxx...xxx.... 3.9x3.3x4.$
..xxxxx........xxx.... 2.5x8.3x4.$
..xxx..........xxx.... 2.3x10.3x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxx.........xxxx.... 2.3x9.4x4.$
..xxxx......xxxxxx.... 2.4x6.6x4.$
...xxxxxxxxxxx.xxxxx.. 3.11x.5x2.$
....xxxxxxxxx...xxxx.. 4.9x3.4x2.$
.....xxxxxx.....xxxx.. 5.6x5.4x2.$
...................... 22.$
...................... 22.!
22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x4.$15.3x4.$14.4x4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x4.$2.3x9.4x4.$2.3x9.4x4.$2.4x6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!
Bitmap size= 506 RLE size= 204

RLE Translation table

Pattern Encoding Meaning
. (dot) . (dot) White point
x x Black point
  Digits Number of repeats
  $ End of Line
  ! End of Pattern

Encoding

In RLE.zip, program file is RLE_Enc_fax.prg, input pattern is in file Fax.txt, result is in file Fax.log.

dBase
* RLE encoding for Simple Bitmap
clear screen

/* Data file structure
pattern Name on first line
Pattern
empty line to indicate the end of pattern
*/

mdl= {}
* Read input file handle= fopen("Fax.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
	tmp= left(tmp,len(tmp)-2) // remove cariage return
	aadd(mdl, tmp)
	tmp= freadln(handle, ,1024)
enddo
fclose(handle)

* process input file set printer to Fax.log
set printer on
title= .T.
for scan=1 to len(mdl)
	? mdl[scan] // output pattern name
	rle= ""
	bitmap= 0

	scan++
	do while len(mdl[scan]) != 0
		? mdl[scan] // output pattern line
		rle_line= ""
		bitmap += len(mdl[scan])
		last= mdl[scan,1]
		cnt= 1
		for scanc=2 to len(mdl[scan])
			if last= mdl[scan, scanc]
				cnt++
			else
				if cnt> 1
					rle_line += str(cnt,,, .t.)
				endif
				rle_line += last

				last= mdl[scan, scanc]
				cnt= 1
			endif
		next
		* if last != "."
			if cnt> 1
				rle_line += str(cnt,,, .t.)
			endif
			rle_line += last
		* endif
		scan++
		if len(mdl[scan]) = 0
			rle_line += "!"
		else
			rle_line += "$"
		endif
		?? " "
		?? rle_line // ouput line encoding
		rle += rle_line

	enddo
	? rle // output full RLE
	? "Bitmap size= "+str(bitmap,,, .t.)
	?? " RLE size= "+str(len(rle),,, .t.)
	?

next
set printer off
set printer to
return

Decoding

In RLE.zip, program file is RLE_Dec_fax.prg.

dBase
* RLE decoding for simple Bitmap
clear screen

pattern= "22.$22.$7.7x8.$5.11x6.$4.13x5.$4.4x6.4x4.$3.4x8.3x4.$3.3x9.3x4.$15.3x4.$14.4x4.$6.12x4.$4.14x4.$3.9x3.3x4.$2.5x8.3x4.$2.3x10.3x4.$2.3x9.4x4.$2.3x9.4x4.$2.4x6.6x4.$3.11x.5x2.$4.9x3.4x2.$5.6x5.4x2.$22.$22.!"

* process RLE pattern cnt=0
for scan=1 to len(pattern)
	char= pattern[scan]
	if isdigit(char)
		cnt= cnt*10+ val(char)
	else
		if char= "$" .or. char= "!"
			? // nex line
		else
			if cnt= 0
				cnt=1
			endif
			for rep= 1 to cnt
				? char
			next
		endif
		cnt= 0
	endif
next
?

return

3. Conway's Game of Life

GoL patterns fit very well with RLE compression because datasets are large, a cell is either dead or alive, and dead cells dominate datasets.

Run Length Encoded - LifeWiki[^]

Hundred of RLE compressed patterns can be found at Category:Patterns - LifeWiki[^].

Example

2-engine Cordership
...................OO.................... 19b2O$
...................OOOO.................. 19b4O$
...................O.OO.................. 19bOb2O$
......................................... $
....................O.................... 20bO$
...................OO.................... 19b2O$
...................OOO................... 19b3O$
.....................O................... 21bO$
.................................OO...... 33b2O$
.................................OO...... 33b2O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
....................................O.... 36bO$
...................................OO.... 35b2O$
..................................O...O.. 34bO3bO$
...................................OO..O. 35b2O2bO$
........................................O 40bO$
.....................................O.O. 37bObO$
......................................O.. 38bO$
......................................O.. 38bO$
......................................OO. 38b2O$
......................................OO. 38b2O$
......................................... $
......................................... $
.............O..........O................ 13bO10bO$
............OOOOO.....O.OO...........O... 12b5O5bOb2O11bO$
...........O..........O...O.........O.... 11bO10bO3bO9bO$
............OO........OOO.O.........OO... 12b2O8b3ObO9b2O$
.............OO.........OO............O.. 13b2O9b2O12bO$
OO.............O.....................OOO. 2O13bO21b3O$
OO...................................OOO. 2O35b3O$
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
......................................... $
........OO............................... 8b2O$
........OO...........OO.................. 8b2O11b2O$
...................OO..O................. 19b2O2bO$
........................O...O............ 24bO3bO$
..................O.....O...O............ 18bO5bO3bO$
...................O..OO...O.O........... 19bO2b2O3bObO$
....................OOO.....O............ 20b3O5bO$
............................O............ 28bO!
19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO$35b2O$34bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5bOb2O11bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$$$$$$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!
Bitmap size= 2009 RLE size= 292

Result without end of line optimization. Pay attention diffrence in RLE size.

19b2O20b$19b4O18b$19bOb2O18b$41b$20bO20b$19b2O20b$19b3O19b$21bO19b$33b2O6b$33b2O6b$41b$41b$41b$41b$41b$41b$36bO4b$35b2O4b$34bO3bO2b$35b2O2bOb$40bO$37bObOb$38bO2b$38bO2b$38b2Ob$38b2Ob$41b$41b$13bO10bO16b$12b5O5bOb2O11bO3b$11bO10bO3bO9bO4b$12b2O8b3ObO9b2O3b$13b2O9b2O12bO2b$2O13bO21b3Ob$2O35b3Ob$41b$41b$41b$41b$41b$41b$8b2O31b$8b2O11b2O18b$19b2O2bO17b$24bO3bO12b$18bO5bO3bO12b$19bO2b2O3bObO11b$20b3O5bO12b$28bO12b!
Bitmap size= 2009 RLE size= 413

RLE Translation table

Pattern Encoding Meaning
. (dot) b Dead cell
O o Alive cell
  Digits Number of repeats
  $ End of Line
  ! End of Pattern

Since playfield defaults to dead cells, dead cells ending a line can be removed.

Encoding

In RLE.zip, program file is RLE_Enc_gol.prg, input pattern is in file Gol.txt, result is in file Gol.log.

dBase
* RLE encoding for Convay's Game of Life
clear screen

/* Data file structure
GoL pattern Name on first line
Pattern
empty line to indicate the end of pattern
*/

mdl= {}
* Read input file handle= fopen("gol.txt")
tmp= freadln(handle, ,1024)
do while len(tmp) > 0
	tmp= left(tmp,len(tmp)-2) // remove cariage return
	aadd(mdl, tmp)
	tmp= freadln(handle, ,1024)
enddo
fclose(handle)

* process input file set printer to GoL.log
set printer on
title= .T.
for scan=1 to len(mdl)
	? mdl[scan] // pattern name
	rle= ""
	bitmap= 0

	scan++
	do while len(mdl[scan]) != 0
		? mdl[scan] // output pattern line
		rle_line= ""
		bitmap += len(mdl[scan])
		last= mdl[scan,1]
		cnt= 1
		for scanc=2 to len(mdl[scan])
			if last= mdl[scan, scanc]
				cnt++
			else
				if cnt> 1
					rle_line += str(cnt,,, .t.)
				endif
				if last= "."
					rle_line += "b" // translation
				else
					rle_line += last
				endif

				last= mdl[scan, scanc]
				cnt= 1
			endif
		next
		if last != "."  // EOL optimization
			if cnt> 1
				rle_line += str(cnt,,, .t.)
			endif
			if last= "."
				rle_line += "b" // translation
			else
				rle_line += last
			endif
		endif
		scan++
		if len(mdl[scan]) = 0
			rle_line += "!"
		else
			rle_line += "$"
		endif
		?? " "
		?? rle_line // ouput line encoding
		rle += rle_line

	enddo
	? rle // output full RLE
	? "Bitmap size= "+str(bitmap,,, .t.)
	?? " RLE size= "+str(len(rle),,, .t.)
	?

next
set printer off
set printer to
return

Decoding

In RLE.zip, program file is RLE_Dec_Gol.prg.

dBase
* RLE decoding for simple Bitmap
clear screen

pattern= "19b2O$19b4O$19bOb2O$$20bO$19b2O$19b3O$21bO$33b2O$33b2O$$$$$$$36bO$35b2O$34bO3bO$35b2O2bO$40bO$37bObO$38bO$38bO$38b2O$38b2O$$$13bO10bO$12b5O5bOb2O11bO$11bO10bO3bO9bO$12b2O8b3ObO9b2O$13b2O9b2O12bO$2O13bO21b3O$2O35b3O$$$$$$$8b2O$8b2O11b2O$19b2O2bO$24bO3bO$18bO5bO3bO$19bO2b2O3bObO$20b3O5bO$28bO!"

* process RLE pattern cnt= 0
for scan=1 to len(pattern)
	char= pattern[scan]
	if isdigit(char)
		cnt= cnt*10+ val(char)
	else
		if char= "$" .or. char= "!"
			? // nex line
		else
			if cnt= 0
				cnt=1
			endif
			if char= "b"
				char= "."
			endif
			for rep= 1 to cnt
				?? char
			next
		endif
		cnt= 0
	endif
next
?

return
BASIC
' Sample program for HP Prime calculator
RLEMap (px, py, RLE)
' Game of Life RLE decoding
BEGIN
  LOCAL Row, Col, Ptr, Cnt, Cod;
  Row:= py; Col:= px+2; Cnt:= 0;
  FOR Ptr FROM 1 TO DIM(RLE) DO
    Cod:= MID(RLE, Ptr, 1);
    CASE
      IF INSTRING("0123456789", Cod) <> 0 THEN Cnt:= Cnt*10+EXPR(Cod); END;
      IF Cod == "$" THEN Row:= Row+MAX(1, Cnt); Col:= px+2; Cnt:= 0; END;
      IF INSTRING("b.", Cod) <> 0 THEN Col:= Col+MAX(1, Cnt); Cnt:= 0; END;
      IF INSTRING("oA", Cod) <> 0 THEN
        FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO
          Lignes(IP(Row/60)+1, Col):= BITOR(Lignes(IP(Row/60)+1, Col), BITSL(#2,(Row MOD 60) ));
        END;
        Cnt:= 0;
      END;
    END;
  END;
  LDisp();
END;

4. Sokoban: a practical case

Back in 2013-2014, I was involved in 'HP Prime Graphing Calculator' beta testing. A short time after the calculator came out, a member of a HP fans forum published a Sakoban game for the calculator.

The Museum of HP Calculators[^]
Sokoban game for HP Prime[^]

In addition to add the possibility to use the keyboard (original version is touch screen only), I compressed all 40 levels using RLE.

The original source code SokobanBeta1.03.prime size was 112k (UTF16). My version SokobanBeta1.5.prime size was 52k (UTF16). Roughly half the size.

Example

Level 1-1 as matrix                          RLE
[[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,1,1,1,1,0,0,0,0,1,1,1,0,0,0,0,0,0,0]   2b4w4b3w$           
,[0,0,1,3,0,1,1,1,0,0,1,4,1,0,0,0,0,0,0,0]   2bwdb3w2bwsw$       
,[0,0,1,3,0,0,0,1,1,1,1,0,1,1,1,1,0,0,0,0]   2bwd3b4wb4w$        
,[0,0,1,3,0,0,0,0,0,1,1,0,1,0,0,1,0,0,0,0]   2bwd5b2wbw2bw$      
,[0,0,1,1,1,0,1,0,0,0,0,0,0,0,2,1,0,0,0,0]   2b3wbw7bcw$         
,[0,0,0,0,1,0,0,0,1,0,1,1,1,1,0,1,1,0,0,0]   4bw3bwb4wb2w$       
,[0,0,0,1,1,0,1,0,1,0,0,0,0,0,0,0,1,0,0,0]   3b2wbwbw7bw$        
,[0,0,0,1,0,2,1,0,1,0,1,0,2,0,0,0,1,0,0,0]   3bwbcwbwbwbc3bw$    
,[0,0,0,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0]   3bw10b3w$           
,[0,0,0,1,1,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0]   3b12w$              
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]   $                   
,[0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0]]  !                   
$$2b4w4b3w$2bwdb3w2bwsw$2bwd3b4wb4w$2bwd5b2wbw2bw$2b3wbw7bcw$4bw3bwb4wb2w$3b2wbwbw7bw$3bwbcwbwbwbc3bw$3bw10b3w$3b12w$$$!
Matrix Size= 20* 15= 300  RLE Size= 120
Matrix Size in source code= 630

RLE Translation table

Pattern Encoding Meaning
0 b Background
1 w Wall
2 c Cube (Parcel)
3 d Destination (Target)
4 s Sokoban
  Digits Number of repeats
  $ End of Line
  ! End of Pattern

Encoding

This function encode a Sokoban level. Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.

BASIC
RLEEnc (Mt) // RLE encoder
BEGIN
  LOCAL Row, Col, Cod, Cnt, Tmp, Sz;
  Tmp:= ""; Sz:=SIZE(Mt);
  FOR Row FROM 1 TO Sz(1) DO
    Cnt:= 1; Cod:= Mt(Row,1);
    FOR Col FROM 2 TO Sz(2) DO
      IF Cod == Mt(Row,Col) THEN
        Cnt:= Cnt+1;
      ELSE
        IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
        Tmp:= Tmp+ MID("bwcds",Cod+1,1);
        Cnt:= 1; Cod:= Mt(Row,Col);
      END;
    END;
    IF Cod <> 0 THEN
      IF Cnt <> 1 THEN Tmp:= Tmp+ STRING(Cnt,2,0); END;
      Tmp:= Tmp+ MID("bwcds",Cod+1,1);
    END;
    Tmp:= Tmp+ "$";
  END;
  RETURN Tmp;
END;

As this is calculator programming, there is no easy way to do terminal output for encoding. I needed a complementary function to get around this problem.

Encoding function is internal to the program (it is invisible outside of progam. So I created an exported function which is calleable from user calculator interface.

The interest is that the result in user interface can be copy/pasted outside of the emulator and thus used as an easy way to paste the result in source code.

BASIC
// Apply RLE compression to all levels
// remove drawscreen() before converting
EXPORT Sok_RLE()
BEGIN
  LOCAL Lst:={};
  FOR NAJ FROM 10 TO 49 DO
    loadsubnivel();
    IF TYPE(matriz) == 2 THEN
      Lst:= CONCAT(Lst, {{NAJ, matriz}});
    ELSE
      Lst:= CONCAT(Lst, {{NAJ, RLEEnc (matriz)}});
    END;
  END;
  RETURN Lst;
END;

Decoding

Language is hpppc: Hewlett-Packard Prime Programming Language, a basic.

BASIC
RLEDec (Rows, Cols, RLE) // RLE decoder
BEGIN
  LOCAL Row, Col, Ptr, Cnt, Cod, Tmp;
  Row:= 1; Col:= 1; Cnt:= 0;
  Tmp:= MAKEMAT(0,Rows,Cols);
  FOR Ptr FROM 1 TO DIM(RLE) DO
    CASE
      IF INSTRING("0123456789", MID(RLE, Ptr, 1)) <> 0 THEN Cnt:= Cnt*10+EXPR(MID(RLE, Ptr, 1)); END;
      IF MID(RLE, Ptr, 1) == "$" THEN Row:= Row+MAX(1, Cnt); Col:= 1; Cnt:= 0; END;
      DEFAULT
      Cod:= INSTRING("bwcds", MID(RLE, Ptr, 1));
      FOR Col FROM Col TO Col+MAX(1, Cnt)-1 DO Tmp(Row,Col):=Cod- 1; END; Cnt:= 0;
    END;
  END;
  RETURN Tmp;
END;

5. Points of Interest

An easy to handle data compression algorithm.

7. History

  • 21st March, 2021: First draft
  • 6th September, 2021: First version
  • 8th September, 2021: Second version, added Conway's Game of life
  • 11th September, 2021: Third version, added Sokoban example

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Share

About the Author

Patrice T
Database Developer
France France
I am a professional programmer.
Problem analyse is certainly what I am best at.
My main programming expertise is in the xBase languages (dBase, Clipper, FoxPro, Harbour, xHarbour), then VBA for Excel and advanced Excel WorkBooks.

I also have knowledge on C/C++, d language, HTML, SVG, XML, XSLT, Javascript, PHP, BASICs, Python, COBOL, Assembly.
My personal interests goes to algorithm optimization and puzzles.

Comments and Discussions

 
QuestionIncomplete article Pin
tbayart13-Sep-21 3:13
professionaltbayart13-Sep-21 3:13 
AnswerRe: Incomplete article Pin
Patrice T13-Sep-21 8:44
mvePatrice T13-Sep-21 8:44 
GeneralRe: Incomplete article Pin
Patrice T14-Sep-21 6:06
mvePatrice T14-Sep-21 6:06 
GeneralRe: Incomplete article Pin
tbayart14-Sep-21 8:38
professionaltbayart14-Sep-21 8:38 
GeneralMy vote of 5 Pin
Adérito Silva7-Sep-21 18:30
MemberAdérito Silva7-Sep-21 18:30 

General General    News News    Suggestion Suggestion    Question Question    Bug Bug    Answer Answer    Joke Joke    Praise Praise    Rant Rant    Admin Admin   

Use Ctrl+Left/Right to switch messages, Ctrl+Up/Down to switch threads, Ctrl+Shift+Left/Right to switch pages.