
Binary Formats in JavaScript: Base64, Deflate, and UTF8

By notmasteryet, 15 Jun 2008
 

Introduction

This article demonstrates the use of binary formats in JavaScript code. JavaScript, by its nature, cannot operate on binary data represented as a fragment of memory, that is, as a byte array. That makes it difficult to reuse community-developed algorithms and encodings; a good example is the DEFLATE compressed format. The problem grows when the JavaScript code has to run in a web browser, because the data has to be delivered over HTTP.

In the proposed implementation, a byte array is emulated by a regular JavaScript array of objects. The implementation also tries to solve the problem of transferring binary data to a client-side script. Let's assume we have DEFLATE-compressed data (produced, for example, by .NET's System.IO.Compression namespace, Java's java.util.zip.*, or PHP's http_deflate) and a way to transfer it to the client in BASE64 format.

Using the Code

The deflate.js file contains the functions and classes that implement the decompression part of the DEFLATE algorithm (RFC 1951). To use this algorithm, its input has to be presented as a stream of bytes.

// create BASE64 byte stream reader
var reader = new Base64Reader(base64string);

The class exposes the readByte() method that returns the next byte, or -1 if it’s the end of the stream.
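
To illustrate the readByte() contract, here is a minimal sketch of a BASE64 byte reader. It is not the code from deflate.js: the name SketchBase64Reader and its internals are assumptions made for this example, and it assumes well-formed input without whitespace.

```javascript
// Hypothetical sketch, not the Base64Reader from deflate.js.
var BASE64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

function SketchBase64Reader(text) {
  this.text = text;
  this.position = 0;   // index of the next BASE64 character
  this.bits = 0;       // bit accumulator
  this.bitCount = 0;   // number of unread bits in the accumulator
}

// Returns the next decoded byte (0..255), or -1 at the end of the stream.
SketchBase64Reader.prototype.readByte = function () {
  while (this.bitCount < 8) {
    if (this.position >= this.text.length) return -1;
    var ch = this.text.charAt(this.position++);
    if (ch === "=") return -1;              // padding marks the end of the data
    var value = BASE64_ALPHABET.indexOf(ch);
    if (value < 0) throw new Error("Invalid BASE64 character: " + ch);
    this.bits = (this.bits << 6) | value;   // append 6 more bits
    this.bitCount += 6;
  }
  this.bitCount -= 8;
  var result = (this.bits >> this.bitCount) & 0xFF;
  this.bits &= (1 << this.bitCount) - 1;    // keep only the unread bits
  return result;
};
```

Each BASE64 character contributes 6 bits, so the sketch accumulates bits until at least 8 are available and then hands out one byte at a time, which is all a pull-style consumer needs.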

// create inflator
var inflator = new Inflator(reader);

Like Base64Reader, the Inflator class exposes the readByte() method, which returns the next byte from the decompressed byte stream. The binary stream can be consumed at that point.

If the compressed data is regular text that needs to be decoded from UTF-8 bytes into characters, use the Utf8Translator class to retrieve characters instead of bytes.
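
As a rough sketch of consuming such a stream, the helper below collects every byte from any object that has a readByte() method. The names readAllBytes and ArrayByteStream are hypothetical, and the sketch assumes the stream signals its end with -1, as Base64Reader does.

```javascript
// Hypothetical helper: collects every byte from any object with readByte().
// Assumes the stream returns -1 once it is exhausted.
function readAllBytes(stream) {
  var bytes = [];
  var b;
  while ((b = stream.readByte()) !== -1) {
    bytes.push(b);
  }
  return bytes;
}

// Stand-in stream used only to make this sketch runnable on its own;
// in practice the argument would be an Inflator instance.
function ArrayByteStream(bytes) {
  var position = 0;
  this.readByte = function () {
    return position < bytes.length ? bytes[position++] : -1;
  };
}

var sample = readAllBytes(new ArrayByteStream([72, 105]));
// sample is [72, 105]
```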

// create translator
var translator = new Utf8Translator(inflator);

The class exposes the readChar() method that returns a one-character string with the next available character, or null to indicate the end of the stream. The deflate.js file also contains UnicodeTranslator and DefaultTranslator.
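
The decoding step can be sketched as follows. This is not the Utf8Translator from deflate.js; SketchUtf8Translator is a hypothetical name, and the sketch skips the checks a strict decoder would make (overlong forms, encoded surrogates). It also returns a two-unit surrogate pair for code points beyond the Basic Multilingual Plane, which may differ from the original class.

```javascript
// Hypothetical sketch, not the Utf8Translator from deflate.js.
function SketchUtf8Translator(reader) {
  this.reader = reader; // any object with readByte() returning -1 at the end
}

// Returns the next character as a string, or null at the end of the stream.
SketchUtf8Translator.prototype.readChar = function () {
  var first = this.reader.readByte();
  if (first < 0) return null;
  if (first < 0x80) return String.fromCharCode(first); // single-byte (ASCII)

  var extra, codePoint;
  if ((first & 0xE0) === 0xC0) { extra = 1; codePoint = first & 0x1F; }
  else if ((first & 0xF0) === 0xE0) { extra = 2; codePoint = first & 0x0F; }
  else if ((first & 0xF8) === 0xF0) { extra = 3; codePoint = first & 0x07; }
  else throw new Error("Invalid UTF-8 lead byte: " + first);

  for (var i = 0; i < extra; i++) {
    var next = this.reader.readByte();
    if ((next & 0xC0) !== 0x80) throw new Error("Invalid UTF-8 continuation byte");
    codePoint = (codePoint << 6) | (next & 0x3F); // append 6 payload bits
  }
  if (codePoint < 0x10000) return String.fromCharCode(codePoint);
  // Code points beyond the BMP become a surrogate pair (two UTF-16 units).
  codePoint -= 0x10000;
  return String.fromCharCode(0xD800 + (codePoint >> 10), 0xDC00 + (codePoint & 0x3FF));
};
```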

For convenience, there is the TextReader class that exposes not only the readChar() method, but also the readToEnd() and readLine() methods.
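
To show how readLine() and readToEnd() can be layered on top of readChar(), here is a minimal sketch. SketchTextReader is a hypothetical name and this is not the TextReader from deflate.js; in particular, treating "\n", "\r", and "\r\n" as line terminators is an assumption of the example.

```javascript
// Hypothetical sketch, not the TextReader from deflate.js.
function SketchTextReader(translator) {
  this.translator = translator; // any object with readChar() returning null at the end
  this.pending = null;          // one character of lookahead, used for "\r\n"
}

SketchTextReader.prototype.readChar = function () {
  if (this.pending !== null) {
    var ch = this.pending;
    this.pending = null;
    return ch;
  }
  return this.translator.readChar();
};

// Returns the next line without its terminator, or null at the end of the stream.
SketchTextReader.prototype.readLine = function () {
  var ch = this.readChar();
  if (ch === null) return null;
  var parts = [];
  while (ch !== null && ch !== "\n" && ch !== "\r") {
    parts.push(ch);
    ch = this.readChar();
  }
  if (ch === "\r") {
    var next = this.readChar();
    if (next !== "\n") this.pending = next; // not part of "\r\n": keep it for the next read
  }
  return parts.join("");
};

// Returns everything that is left in the stream as one string.
SketchTextReader.prototype.readToEnd = function () {
  var parts = [];
  var ch;
  while ((ch = this.readChar()) !== null) parts.push(ch);
  return parts.join("");
};
```

Collecting characters in an array and joining once avoids repeated concatenation onto a growing string, which matters for large inputs.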

These functions and classes can be used not only in a web browser, but also in OS scripting or legacy ASP programming.

SamplePage.htm, included in the package, displays the content of the RFC 1951 memo.

Points of Interest

The deflate.js functions help to perform selective compression of data for AJAX requests. Most of the data transmitted in AJAX operations is text or a textual representation of binary data.

Since not all web browsers can retrieve remote data as an array of bytes (as responseBody does in IE's XMLHttpRequest), the server has to transmit BASE64-encoded data to the client. BASE64 encoding grows the data to about 133% of its original size, but if compression shrinks textual data by 75%, the encoded result is still only about a third of the original (0.25 × 1.33 ≈ 0.33), so the amount of data to be stored or transferred is reduced.

Emulating a byte array as an array of objects in JavaScript reduces the performance of the solution: for example, extracting 50 KB takes 1-2 seconds in a web browser.

RFC 1951, 2279, 2781, and 4648 were used to implement the underlying algorithms; these are well-written memos. Many formats are based on the open DEFLATE compressed format (e.g., GZIP, PNG, SVGZ, SWF), so implementing it in JavaScript gives one more way to access and reuse data.

License

This article, along with any associated source code and files, is licensed under The MIT License.

About the Author

notmasteryet
Software Developer
United States


Comments and Discussions

 
General: bugs | Member 7760651 | 28 Mar '11 - 13:44
i got "string is not a function";
 
turns out you have this code 'throw new "string"' in three places, which is invalid.
 
also when i try to inflate sometimes it works (small files), otherwise i receive, variously;
 
Invalid block type (3)
Cannot read property 'isLeaf' of null
(and also sticking in infinite loop)
 
and the result is of incorrect length, although the right number of bytes is always read.
 
i would think my test data was not valid deflated data, only this data is always correctly inflated by the
JS inflate routine here;   http://www.onicos.com/staff/iz/amuse/javascript/expert/inflate.txt
 
if you have time to fix i can supply test data.
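
A sketch of the fix for the first problem reported above: `new` needs a constructor, so `throw new "string"` fails with a TypeError before the intended message is ever raised. The helper name failWith and the message text are illustrative only, not taken from deflate.js.

```javascript
// Broken pattern from the report: a string is not a constructor, so the
// engine raises a TypeError instead of the intended message.
//   throw new "Invalid block type";

// Working replacement: throw an Error object that carries the message.
function failWith(message) {   // hypothetical helper name
  throw new Error(message);
}

var caughtMessage = null;
try {
  failWith("Invalid block type");
} catch (e) {
  caughtMessage = e.message;   // the original text is preserved
}
```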
General: Re: bugs | Member 7402166 | 2 Aug '11 - 18:35
Furthermore, line 56:
bitsLenght <- Typo
Which may be the origin of your issues.
 
Here's a pastebin that satisfies today's version of JSLint with tolerate bitwise and eqeqeq with only one warning.
http://pastebin.com/K6cZy05x
 
This hasn't been tested.
Question: slight issue with licience, looking for a recommendation | Member 7760651 | 22 Mar '11 - 8:32
i've used this code (Deflate part) in a gzip lib i've written.
 
and that lib is intended to be used in a project along with many other small libs.
 
but, for delivering JS, combining files and shortening variables is really required, but it appears to me that strictly speaking i cant do that, i need separate files and to include the copyright and licence.
 
and, even worse, does this;
 
"to any person obtaining a copy of this software and associated documentation files"
 
mean that;
 
i, or more problematically someone downstream, has to obtain a copy of "associated documentation files", for each lib.
 
in which case, for this project, what are they? would they include the whole article? i guess not, but it isn't clear to me.
 
maybe im being too pedantic and this isn't really specific to this code anyway, but im trying to check this out thoroughly as a test-bed for future work.
Answer: Re: slight issue with licience, looking for a recommendation | Member 7402166 | 2 Aug '11 - 18:40
Search the internet for the rfc numbers with query: "RFC ####" from the stuff in the comments and the search results will lead to the appropriate documents.
Question: how can get char by index? | baihongmei | 23 Mar '09 - 23:50
It's excellent of your work.
 
But I have a question. I need get the specified char by index.
 
If I use textReader.readChar() one by one, and it's very slow when the data is large(my data is about 2m large after inflate).
 
for example, after inflate the data is an text like is "1,0,0.2,1,0...". I want to split it into an array but it's very slow.
 
I tried to read char by char, it's also waste of time when the index is large. What's the use of bufferPosition?
 
can you give me some suggestion of this? how can I get the char as soon as possible?
Answer: Re: how can get char by index? | notmasteryet | 24 Mar '09 - 12:30
I recommend reading character by character, but try to avoid manipulating large strings. You may store your data in an array of strings of equal length (e.g., 1000 characters); it will improve the performance of string concatenation operations. At the end, you may combine them into one large string.
 
If your data contains only numbers separated by commas and you need to parse it into an array of numbers, you can build that array in chunks: read 1000 characters, then keep reading until you find a comma, parse the fragment, append its numbers to the main array, and repeat until the end is reached.
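
The chunked approach described above could look roughly like this. The function name readNumbersInChunks and its chunkSize parameter are hypothetical; the sketch assumes the input has no empty fields or trailing comma, and works with any object exposing readChar() that returns null at the end.

```javascript
// Hypothetical helper: parses "1,0,0.2,..." in chunks that end on a comma.
function readNumbersInChunks(textReader, chunkSize) {
  var numbers = [];
  var chunk = [];   // characters collected for the current chunk
  var ch;

  function flush() {
    if (chunk.length === 0) return;
    var fields = chunk.join("").split(",");
    for (var i = 0; i < fields.length; i++) {
      numbers.push(parseFloat(fields[i]));
    }
    chunk.length = 0; // reuse the buffer for the next chunk
  }

  while ((ch = textReader.readChar()) !== null) {
    if (ch === "," && chunk.length >= chunkSize) {
      flush();        // this comma is the chunk boundary, so it is dropped
    } else {
      chunk.push(ch);
    }
  }
  flush();            // parse whatever remains after the last boundary
  return numbers;
}
```

A call such as readNumbersInChunks(textReader, 1000) keeps every intermediate string close to 1000 characters, so no step has to handle the full 2 MB text at once.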
General: got stuck | arun_srajan | 30 Jan '09 - 2:26
i used base64reader its working but when used inflate got stuck in Invalid block type 0 length
General: Re: got stuck | notmasteryet | 30 Jan '09 - 7:26
Make sure you are using properly encoded DEFLATE and then BASE64 stream as an input (e.g. 'N' -> 01 01 00 FE FF 4E -> 'AQEA/v9O').
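
The example bytes above can be reproduced with a short sketch: the single byte 'N' (0x4E) is wrapped in one stored (uncompressed) DEFLATE block as defined in RFC 1951, section 3.2.4, and then BASE64-encoded. The function names makeStoredBlock and toBase64 are hypothetical and not part of deflate.js.

```javascript
// Wraps up to 65535 bytes in a single stored DEFLATE block (RFC 1951, 3.2.4).
function makeStoredBlock(data) {
  var len = data.length;
  var nlen = ~len & 0xFFFF;      // one's complement of LEN
  var header = [
    0x01,                        // BFINAL=1, BTYPE=00 (stored), padded to a byte
    len & 0xFF, len >> 8,        // LEN, little-endian
    nlen & 0xFF, nlen >> 8       // NLEN, little-endian
  ];
  return header.concat(data);
}

// Plain BASE64 encoder (RFC 4648 alphabet with "=" padding).
function toBase64(bytes) {
  var alphabet =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
  var out = "";
  for (var i = 0; i < bytes.length; i += 3) {
    var has1 = i + 1 < bytes.length;
    var has2 = i + 2 < bytes.length;
    var b0 = bytes[i];
    var b1 = has1 ? bytes[i + 1] : 0;
    var b2 = has2 ? bytes[i + 2] : 0;
    out += alphabet.charAt(b0 >> 2);
    out += alphabet.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    out += has1 ? alphabet.charAt(((b1 & 0x0F) << 2) | (b2 >> 6)) : "=";
    out += has2 ? alphabet.charAt(b2 & 0x3F) : "=";
  }
  return out;
}

var block = makeStoredBlock([0x4E]);  // 01 01 00 FE FF 4E
var encoded = toBase64(block);        // "AQEA/v9O"
```

A stored block like this is a convenient smallest test input, because it exercises the block header parsing without involving any Huffman decoding.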
General: Re: got stuck | arun_srajan | 30 Jan '09 - 18:01
nothing wrong in data, but bug in inflate function in deflate.js can you able to debug the code. Because the same works in java.util.zip inflate and deflate. i tell you the process how i did i deflated the data using java.util.zip library and coded to 64 base and in javascript i used your code from base64 to byte and i compared the result, it was fantastic. but when i pass the same base64 decoded data to jsp page using ajax calls and inflate it is decompressing correctly. I think you have bug in code.
Answer: Re: got stuck | notmasteryet | 31 Jan '09 - 4:24
Make sure you set nowrap to true (e.g. Deflater(Deflater.DEFAULT_COMPRESSION, true)). Otherwise you will have a ZLIB header before the DEFLATE data, which causes the result you described. Could you post the shortest base64 data encoded by java.util.zip your way that causes trouble?
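
When the producer cannot be changed, the ZLIB wrapper can be detected and removed on the client before inflating. The sketch below follows the header layout in RFC 1950; the function names hasZlibHeader and stripZlibWrapper are hypothetical, and the check is only a heuristic, since raw DEFLATE data could by coincidence start with two bytes that pass it.

```javascript
// Heuristic check for a ZLIB (RFC 1950) header in front of DEFLATE data.
function hasZlibHeader(bytes) {
  if (bytes.length < 2) return false;
  var cmf = bytes[0];
  var flg = bytes[1];
  return (cmf & 0x0F) === 8               // compression method 8 = DEFLATE
      && (cmf >> 4) <= 7                  // window size of at most 32K
      && ((cmf << 8) | flg) % 31 === 0;   // FCHECK makes the pair a multiple of 31
}

// Returns the raw DEFLATE payload: drops the 2-byte header (plus the 4-byte
// DICTID when the FDICT flag is set) and the 4-byte Adler-32 trailer.
// Input without a recognizable header is returned unchanged.
function stripZlibWrapper(bytes) {
  if (!hasZlibHeader(bytes)) return bytes;
  var start = (bytes[1] & 0x20) ? 6 : 2;
  return bytes.slice(start, bytes.length - 4);
}
```

For example, a stream beginning with 78 9C passes the check, while the raw stored block 01 01 00 FE FF 4E does not and is passed through untouched.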

Last Updated 16 Jun 2008
Article Copyright 2008 by notmasteryet