
zipstream, bzip2stream: iostream wrappers for the zlib and bzip2 libraries

By Jonathan de Halleux, 2 Oct 2003
 

Introduction

This article presents two compressed STL iostream implementations, one based on the zlib library and one on bzip2 (see the download links above for both). This means that you can manipulate zipped streams like any other STL ostream/istream.

To give you an idea, consider the following snippet that writes "Hello world" to a string buffer:

ostringstream output_buffer;
// writing data
output_buffer<<"Hello world"<<endl;

Now, the same snippet but with zipped output using zlib:

// zip_ostream uses output_buffer as output buffer :)
zip_ostream zipper( output_buffer );

// writing data as usual
zipper<<"Hello world"<<endl;

or using bzip2:

// bzip2_ostream uses output_buffer as output buffer :)
bzip2_ostream bzipper( output_buffer );

// writing data as usual
bzipper<<"Hello world"<<endl;

As you can see, adding zipped buffers to your existing applications is quite straightforward. To summarize, here are some quick facts about zipstream and bzip2stream:

  • STL compliant,
  • any-stream-to-any-stream support,
  • char, wchar_t support,
  • fine tuning of compression properties,
  • support for custom allocators (New!)

Why another wrapper? Why not use gzstream?

Writing wrappers around the zlib library is popular on CodeProject. If you search for 'zip' you will find at least 14 articles on the topic. Moreover, if you search the web, and especially the zlib home page, you can find dozens of other wrappers.

So why another wrapper? Well, none of the wrappers are fully STL compliant. OK, this is not quite true, since gzstream (see download link above) implements fstream-like STL streams. However, gzstream has three drawbacks:

  1. it does not allow buffer-to-buffer compression since it is based on the gzip I/O functions: only file to buffer or buffer to file is supported,
  2. it is licensed under the LGPL, which makes it difficult to use in commercial apps,
  3. it does not support wchar_t.

Last reason to write this wrapper: it is a good exercise to understand and implement iostreams.

Wrapper architecture

The three drawbacks of gzstream pushed me to re-implement an STL wrapper for zlib (and later on, do some cut and paste to get bzip2 working).

This wrapper takes a user-defined istream or ostream to read or write the compressed data. This approach is flexible since the user can supply any stream (istringstream, ifstream, or a custom stream) to store or load the compressed data.

Internally zip_stream acts as a triple buffer: the streambuf object itself, the zlib library and the user-defined stream. For example, during compression the buffers are used as follows:

  • first buffer: the data to compress is buffered into a streambuf object,
  • second buffer: when overflow is called, the first buffer data is sent to zlib which also buffers the data internally. If zlib outputs data, it is sent to the user-defined stream,
  • third buffer: the user-defined stream is buffered.

Some care must be taken when flushing: you must use the zflush method, which first flushes the streambuf, then the zlib buffer, then the user-defined stream. Note that you should avoid flushing more often than necessary, as it degrades compression.
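
To make the three stages concrete, here is a minimal sketch of the same layering. It is not the wrapper's code: rle_encoder is a hypothetical run-length encoder standing in for zlib, and rle_streambuf, full_flush and bytes_in_sink are names invented for this illustration. The point is only that a plain flush drains the streambuf while the encoder still holds data, which is why a dedicated method such as zflush is needed.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Hypothetical stand-in for zlib: a run-length encoder that keeps the current
// run internally until a different byte arrives or finish() is called.
class rle_encoder {
public:
    // Feed one byte; append any completed run (count, byte) to 'out'.
    void put(char c, std::string& out) {
        if (count_ > 0 && c == last_ && count_ < 255) { ++count_; return; }
        emit(out);
        last_ = c;
        count_ = 1;
    }
    // Emit the pending run (the analogue of flushing zlib's own buffer).
    void finish(std::string& out) { emit(out); }
private:
    void emit(std::string& out) {
        if (count_ == 0) return;
        out.push_back(static_cast<char>(count_));
        out.push_back(last_);
        count_ = 0;
    }
    char last_ = 0;
    int count_ = 0;
};

// Stage 1: the put area. Stage 2: the encoder. Stage 3: the user stream.
class rle_streambuf : public std::streambuf {
public:
    explicit rle_streambuf(std::ostream& sink) : sink_(sink) {
        setp(buffer_, buffer_ + sizeof(buffer_));
    }
    // Analogue of zflush: drain all three stages in order.
    void full_flush() {
        drain();
        std::string out;
        encoder_.finish(out);
        sink_.write(out.data(), static_cast<std::streamsize>(out.size()));
        sink_.flush();
    }
protected:
    int_type overflow(int_type ch) override {
        drain();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }
    // A plain flush only reaches stage 1; the encoder keeps its pending run.
    int sync() override { drain(); return 0; }
private:
    void drain() {
        std::string out;
        for (char* p = pbase(); p != pptr(); ++p) encoder_.put(*p, out);
        sink_.write(out.data(), static_cast<std::streamsize>(out.size()));
        setp(buffer_, buffer_ + sizeof(buffer_));
    }
    std::ostream& sink_;
    rle_encoder encoder_;
    char buffer_[8];
};

// Returns what has reached the sink after an ordinary flush, optionally
// followed by the full flush.
std::string bytes_in_sink(bool call_full_flush) {
    std::ostringstream sink;
    rle_streambuf buf(sink);
    std::ostream out(&buf);
    out << "aaab";
    out.flush();
    if (call_full_flush) buf.full_flush();
    return sink.str();
}
```

With bytes_in_sink(false) the run of 'a' has reached the sink but the trailing 'b' is still inside the encoder; bytes_in_sink(true) returns the complete encoding.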

Implementing iostreams

Since I'm not an STL expert, I will very briefly discuss this part. There's room for a tutorial on this topic...

To implement a custom iostream, you need to take the following steps:

  • implement a custom my_streambuf, inherited from streambuf. You need to override the virtual methods sync, underflow and overflow. sync and overflow are used in output streams and underflow is used in input streams,
  • implement a custom my_ostream, inherited from ostream. It will use my_streambuf as stream buffer,
  • implement a custom my_istream, inherited from istream. It will use my_streambuf as stream buffer.
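
The three steps above can be sketched as follows. This is an illustration under stated assumptions, not the code of zipstream: the rot13 transform is a toy stand-in for compression, the buffer sizes are arbitrary, and encode and decode are helper names invented for the example.

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>
#include <streambuf>
#include <string>

// Self-inverse toy transform standing in for real compression (assumption).
inline char rot13(char c) {
    if (c >= 'a' && c <= 'z') return static_cast<char>('a' + (c - 'a' + 13) % 26);
    if (c >= 'A' && c <= 'Z') return static_cast<char>('A' + (c - 'A' + 13) % 26);
    return c;
}

class my_streambuf : public std::streambuf {
public:
    explicit my_streambuf(std::ostream& sink) : sink_(&sink), source_(nullptr) {
        setp(out_, out_ + sizeof(out_));      // put area used by overflow/sync
    }
    explicit my_streambuf(std::istream& source) : sink_(nullptr), source_(&source) {
        setg(in_, in_, in_);                  // empty get area forces underflow
    }
    ~my_streambuf() override { sync(); }      // push out any pending bytes

protected:
    // Output: called when the put area is full.
    int_type overflow(int_type ch) override {
        if (!sink_ || sync() != 0) return traits_type::eof();
        if (!traits_type::eq_int_type(ch, traits_type::eof())) {
            *pptr() = traits_type::to_char_type(ch);
            pbump(1);
        }
        return traits_type::not_eof(ch);
    }
    // Output: transform the put area and forward it to the sink.
    int sync() override {
        if (!sink_) return 0;
        for (char* p = pbase(); p != pptr(); ++p) sink_->put(rot13(*p));
        setp(out_, out_ + sizeof(out_));
        return sink_->good() ? 0 : -1;
    }
    // Input: refill the get area from the source and transform it back.
    int_type underflow() override {
        if (!source_) return traits_type::eof();
        if (gptr() < egptr()) return traits_type::to_int_type(*gptr());
        source_->read(in_, sizeof(in_));
        const std::streamsize n = source_->gcount();
        if (n <= 0) return traits_type::eof();
        for (std::streamsize i = 0; i < n; ++i) in_[i] = rot13(in_[i]);
        setg(in_, in_, in_ + n);
        return traits_type::to_int_type(*gptr());
    }

private:
    std::ostream* sink_;
    std::istream* source_;
    char out_[16];
    char in_[16];
};

class my_ostream : public std::ostream {
public:
    explicit my_ostream(std::ostream& sink) : std::ostream(nullptr), buf_(sink) { rdbuf(&buf_); }
private:
    my_streambuf buf_;
};

class my_istream : public std::istream {
public:
    explicit my_istream(std::istream& source) : std::istream(nullptr), buf_(source) { rdbuf(&buf_); }
private:
    my_streambuf buf_;
};

// Write text through my_ostream and return the bytes that reached the sink.
std::string encode(const std::string& text) {
    std::ostringstream storage;
    { my_ostream out(storage); out << text; }  // destructor syncs the buffer
    return storage.str();
}

// Read one line back through my_istream.
std::string decode(const std::string& bytes) {
    std::istringstream stored(bytes);
    my_istream in(stored);
    std::string line;
    std::getline(in, line);
    return line;
}
```

Both stream classes own their my_streambuf as a member and install it with rdbuf once it is constructed, so the user-supplied stream only has to outlive the wrapper.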

Class quick reference

All the zlib classes are in the zlib_stream namespace and all the bzip2 classes are in the bzip2_stream namespace.

The two main classes of the zlib wrapper are basic_zip_ostream and basic_zip_istream which implement respectively compression and decompression and behave like classic basic_ostream and basic_istream.

The classic typedefs are also provided for these classes:

  • zip_ostream, zip_istream for char streams,
  • zip_wostream, zip_wistream for wchar_t streams.

The bzip2 classes have similar names, just replace zip by bzip2: basic_zip_streambuf becomes basic_bzip2_streambuf.

basic_zip_ostream

This class inherits from basic_ostream:

template<
    typename Elem, 
    typename Tr = char_traits<Elem>,
    typename ElemA = std::allocator<Elem>,
    typename ByteT = unsigned char,
    typename ByteAT = std::allocator<ByteT> 
    >
class basic_zip_ostream : public basic_ostream<Elem, Tr>

where

  • Elem, Tr are the classical basic_ostream template parameters,
  • ElemA is the allocator for an Elem buffer used internally,
  • ByteT is the byte type used internally (you should not change that),
  • ByteAT is a custom allocator for a ByteT buffer used internally.

Constructor

basic_zip_ostream( 
    ostream_reference ostream_, 
    bool is_gzip_ = false,
    size_t level_ = Z_DEFAULT_COMPRESSION,
    EStrategy strategy_ = DefaultStrategy,
    size_t window_size_ = 15,
    size_t memory_level_ = 8,
    size_t buffer_size_ = 4096
);
  • ostream_, a user-defined output stream,
  • is_gzip_, true if you want to add the gzip header and footer,
  • level_, compression level, from 0 (no compression, fastest) to 9 (best compression, slowest),
  • strategy_, compression strategy, see the EStrategy enum,
  • window_size_, memory_level_ are advanced zlib settings, check the zlib manual,
  • buffer_size_, read buffer size.

Note that if you choose the gzip option, a header will be automatically added in the constructor and the gzip footer (CRC + data size) will be added in the destructor.
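
For reference, the gzip footer is eight bytes: the CRC-32 of the uncompressed data followed by the uncompressed size modulo 2^32, both little-endian (RFC 1952). The sketch below builds such a footer by hand. It is an illustration only; crc32_of, append_le32 and gzip_footer are hypothetical helper names, and the wrapper itself presumably relies on zlib for the CRC.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// CRC-32 as used by gzip (reflected polynomial 0xEDB88320), computed bit by
// bit for clarity rather than speed.
std::uint32_t crc32_of(const std::string& data) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc & 1u) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
    }
    return crc ^ 0xFFFFFFFFu;
}

// Append a 32-bit value in little-endian byte order.
void append_le32(std::string& out, std::uint32_t value) {
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<char>((value >> (8 * i)) & 0xFFu));
}

// Build the 8-byte gzip footer: CRC-32, then size modulo 2^32.
std::string gzip_footer(const std::string& uncompressed) {
    std::string footer;
    append_le32(footer, crc32_of(uncompressed));
    append_le32(footer, static_cast<std::uint32_t>(uncompressed.size()));
    return footer;
}
```

Because both values depend on everything written so far, the footer can only be produced once the stream is finished, which is consistent with it being written in the destructor.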

Other methods

  • Flush all buffers (zlib and ostream):
    basic_zip_ostream& zflush()

    This method must be called before using the compressed data! Since zlib does its own buffering and ostream::flush is not virtual, there is no way to avoid this problem.

  • Return the CRC of the uncompressed data:
    long get_crc();
  • Return the uncompressed data size:
    long get_in_size();
  • Return the compressed data size:
    long get_out_size();

Predefined typedefs

typedef basic_zip_ostream<char> zip_ostream;
typedef basic_zip_ostream<wchar_t> zip_wostream;

basic_zip_istream

This class inherits from basic_istream:

template<
    typename Elem, 
    typename Tr = char_traits<Elem>,
    typename ElemA = std::allocator<Elem>,
    typename ByteT = unsigned char,
    typename ByteAT = std::allocator<ByteT> 
    >
class basic_zip_istream : public basic_istream<Elem, Tr>

Constructor

basic_zip_istream( 
    istream_reference istream_, 
    size_t window_size_ = 15,
    size_t read_buffer_size_ = 1024 * 10,
    size_t input_buffer_size_ = 1024 * 5
)
  • istream_, input stream containing the compressed data,
  • window_size_, should be compatible with the compression window size,
  • read_buffer_size_, size of the streambuf buffer,
  • input_buffer_size_, size of the zlib input buffer.

Other methods

  • Tells if it is a gzip file:
    bool is_gzip() const
  • Checks CRC (must be a gzip file):
    bool check_crc() const
  • Return the CRC of the uncompressed data:
    long get_crc() const;
  • Return the uncompressed data size:
    long get_out_size() const;
  • Return the compressed data size:
    long get_in_size() const;

Predefined typedefs

typedef basic_zip_istream<char> zip_istream;
typedef basic_zip_istream<wchar_t> zip_wistream;

How to ...

All the following examples are valid for both zlib and bzip2 wrappers.

Compress to a buffer

ostringstream buffer;
zip_ostream zipper(buffer);

// writing stuff
zipper<<...

//flushing VERY IMPORTANT!
zipper.zflush();

// buffer.str() is ready

Compress to a file

ofstream file("test.zip",ios::out | ios::binary);
{
zip_ostream zipper(file, true /* gzip file*/);

// writing stuff
zipper<<...

} // the stream is flushed, the destructor is called and the gzip footer is appended
// the file is ready

Decompress from a buffer

istringstream buffer;
zip_istream unzipper(buffer);

// reading stuff
unzipper>>...

Decompress from a file

ifstream file("test.zip", ios::in | ios::binary);
zip_istream unzipper(file);

// reading stuff
unzipper>>...

// if the file was gzip, we can check the crc
if (unzipper.is_gzip())
    std::cout<<"crc check: "<<( unzipper.check_crc() ? "ok" : "failed");
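
When reading raw bytes rather than formatted values, a chunked loop driven by gcount is a safe pattern for any std::istream, and so should also apply to zip_istream since it derives from basic_istream. The sketch below is a generic illustration: read_all is a hypothetical helper name, and it is shown against a standard stream rather than the wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Read everything from 'in' in fixed-size chunks. The loop is driven by the
// number of bytes actually read (gcount), not by testing eof() beforehand.
std::string read_all(std::istream& in, std::size_t chunk_size = 512) {
    std::string result;
    std::string chunk(chunk_size, '\0');
    for (;;) {
        in.read(&chunk[0], static_cast<std::streamsize>(chunk.size()));
        const std::streamsize got = in.gcount();
        if (got <= 0) break;                 // nothing more to read
        result.append(chunk.data(), static_cast<std::size_t>(got));
    }
    return result;
}
```

The last chunk is usually shorter than chunk_size; read then sets the stream's failure bits, but gcount still reports the bytes that were delivered, so they are not lost.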

Using it in your project

Zlib wrapper

  • Read the license terms,
  • Copy zipstream.hpp and zipstream.ipp into your include directory,
  • Make sure zlib is available,
  • Add #include "zipstream.hpp" to include the headers.

bzip2 wrapper

  • Read the license terms,
  • Copy bzip2stream.hpp and bzip2stream.ipp into your include directory,
  • Make sure bzip2 is available,
  • Add #include "bzip2stream.hpp" to include the headers.

History

  • 30-09-2003, 1.7, added custom allocators (suggestion of <forgot, send me your e-mail>), fixed warning in ostream constructors
  • 21-09-2003, fixed bugs with CRC and size, writing thanks to Jeroen Dirks and gigimenegolo
  • 08-08-2003, 1.5
    • Fixed gzip footer problem: CRC is read
    • data is put back in the buffer when zip file has finished
  • 07-18-2003, 1.4 Fixed gzip header problem.
  • 07-3-2003, 1.3, add bzip2 wrapper
  • 07-2-2003, 1.2, wchar_t working
  • 07-2-2003, 1.1, fixed bugs in gzip header and zip_to_stream
  • 07-01-2003, 1.0, initial release

License

These zlib and bzip2 wrappers are licensed under the zlib/libpng license.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.


About the Author

Jonathan de Halleux
Engineer
United States
Jonathan de Halleux is a Civil Engineer in Applied Mathematics. He finished his PhD in 2004 in the rainy country of Belgium. After 2 years in the Common Language Runtime (i.e. .net), he is now working at Microsoft Research on Pex (http://research.microsoft.com/pex).


Comments and Discussions

[General] x64 warnings! (member are_all_nicks_taken_or_what, 5 Mar '10 - 10:43)
Compiled with x64 architecture and got the following warnings:
 
zipstream.h(812): C4244: 'initializing' : conversion from 'std::streamsize' to 'int', possible loss of data
 
zipstream.h(859): C4267: 'argument' : conversion from 'size_t' to 'uInt', possible loss of data
 
zipstream.h(765): C4267: '=' : conversion from 'size_t' to 'uInt', possible loss of data
 

 
Please provide a fix!
[Question] zipstream.hpp and zipstream.ipp? (member dbbtvlzfpz, 27 Oct '08 - 11:30)
I want to try out this library but how?
 
You wrote:
Using it in your project
Zlib wrapper
 
* Read the license terms,
* Copy zipstream.hpp and zipstream.ipp in your include directory,
* Make sure zlib is available,
* Add #include "zlibstream.hpp" to include the headers,
 
Where can I download the zipstream.hpp and zipstream.ipp? The zipstream_src.zip only contains a zip_stream_test.cpp and a zip_stream_test.hpp.
With the "Make sure zlib is available" phrase do you mean linking zlib.h and the zdll.lib? Just like this?
#include "zlibdll/include/zlib.h"
#pragma comment (lib, "zlibdll/lib/zdll.lib" )
 
Could you make this clear please?
[Answer] Re: zipstream.hpp and zipstream.ipp? (member Tux`, 1 Apr '09 - 9:45)
Download the "Demo Project." It contains all of the files needed for both zlib and bzip2. It also includes a version of zlib as well.
 
"Make sure zlib is available," that means have proper linking and proper headers in your project.
[General] Fixes for modern STL (gcc 4) (member NoGoodNicks, 6 Oct '07 - 8:58)
Hello,
 
is anybody able to use these templates with GCC 4.x? It seems that a lot of things have changed WRT internal STL declarations.
 
I know boost::iostream but it's a bit too heavy (ie. bloaty) for the simple purpose of bzip2 file reading.

[General] zipstream is not necessary anymore, you can use the boost::iostreams lib instead (member herculon, 27 Jan '06 - 2:30)
zipstream is not necessary anymore, you can use the boost::iostreams lib instead
 
www.boost.org

[General] zipstream with boost::archive (member jhsy, 11 Aug '05 - 19:12)
I was trying to make zipstream works with boost::serialize, but gets a boost stream error every time I try to read the data back in, though there seems to be no problem with writing to file. Can anybody spot my mistake?
 

void save_forest_zip(const std::list &trees, const char * filename){
 
std::ofstream ofs(filename, std::ios::binary);
assert(ofs.good());
zlib_stream::zip_ostream zipos(ofs);
boost::archive::binary_oarchive oa(zipos);
 
oa << BOOST_SERIALIZATION_NVP(trees);
}
 
void restore_forest_zip(std::list &trees, const char * filename)
{
std::ifstream ifs(filename, std::ios::binary);
assert(ifs.good());
zlib_stream::zip_istream zipis(ifs);
boost::archive::binary_iarchive ia(zipis);
 
ia >> BOOST_SERIALIZATION_NVP(trees);
}
 

[General] Re: zipstream with boost::archive (member herculon, 12 Aug '05 - 3:47)
zipstream ifstream reading is broken.
try instead bzip2stream! here writing and reading from file-streams works fine! and bzip2 compresses better and is more portable. you can simply concat all .c files from bzip2 sources and include them into your project. no need to build a library for each platform you are developing.
[General] small comment on namespace (member jhsy, 11 Aug '05 - 18:40)
As I'm not using the std namespace, in zipstream.ipp, I had to change ios and ios_base to std::ios and std::ios_base when using zip_istream to compile. Suggest either insert a using namespace std somewhere or qualify ios and ios_base fullly.
 
regards.
[General] bzip2stream file <-> file works fine, zipstream file2file fails. (member herculon, 25 Jul '05 - 9:07)
just wanted to notice you, that file2file zipping and unzipping with bzip2stream works fine. somehow zipstream file2file is broken. i asked Jonathan de Halleux if he could fix it, but he has no sparetime to do it. maybe someone here on the forums is able to do it?
but untial that i suggest to use bzip2. not only that it compresses better, the bzip2 library is about 10kbyte smaller than the zlib lib, if you are keen on saving programspace.
 
by the way: if you compress a targa - file using bzip2, the resulting file is SMALLER than the same image compressed with libpng :) (i think thats because libpng internally uses zlib?!?)
[Question] working on linux? (member dekim, 19 May '05 - 12:06)
Has anyone got this to work on linux? If yes, what did you do?
 
After looking into it further, it works on linux if pondor's suggested fixes from the 'wrong version?' message are made and if lines 455 and 458 of zipstream.ipp are modified to include std::ios.......
[Answer] Re: working on linux? (suss Anonymous, 8 Jun '05 - 15:13)
It seems to have stopped working with GCC 3.4 though :( Lots of errors.
[General] Re: working on linux? (member irotas, 28 Oct '05 - 17:09)
I'm also trying to make this work on Linux.
 
Has anyone had any luck?
 
Thanks,
Adam
[Question] Reading zipped file?? (member Robert Bielik, 28 Apr '05 - 2:51)
I'm trying to verify that I can write a zip file with zip_ostream (looks good.. I think). But I cannot read the generated file with zip_istream i.e.:
 
std::ofstream os;
os.open("testfile.z", std::ios::binary);
{
zlib_stream::zip_ostream osz(os, std::ios::out, true);
osz << "Writing a lot of stuff!";
}
os.close();
 
std::ifstream is;
is.open("testfile.z", std::ios::binary);
zlib_stream::zip_istream isz(is);
while (!isz.eof())
{
char buf[512];
isz.read(buf, 512); << DOES NOT WORK!
int n = isz.gcount();
}
is.close();
 
What am I doing wrong ??
 
TIA
/Rob
 

[General] Problem --Stream and ZipStream are in same scope (member takuni, 3 Feb '05 - 19:01)
Hi.
 
The following program cause error. [assertion failed at (1) line]
But it successed using zip_ostream scope. [at (2) line]
 
Is this known problem?
 
Environment: VC6 SP5, Win2000 SP4.
 
<sample>
     const int BufferSize = 4080;
     const int LoopCnt = 5;
 
     char CheckBuffer[BufferSize];
     for(int i = 0; i < BufferSize; ++i)
     {
          CheckBuffer[i] = i + 1;
     }
 
     char Zero[BufferSize];
     memset(Zero, 0, BufferSize);
 
     char WorkBuffer[BufferSize];
 
     {
          std::ofstream Out("c:\\test.bin", std::ios::out | std::ios::binary);
//          {     // <- (2)
               zlib_stream::zip_ostream Out_Zip(Out, std::ios::out, true);
 
               for(int i = 0; i < LoopCnt; ++i)
               {
                    memcpy(WorkBuffer, CheckBuffer, BufferSize);
                    Out_Zip.write(WorkBuffer, BufferSize);
               }
               Out_Zip.zflush();
//          }     // <- (2)
          Out.close();
     }
 
     {
          std::ifstream In("c:\\test.bin", std::ios::in | std::ios::binary);
          {
               zlib_stream::zip_istream In_Zip(In);
               for(int i = 0; i < LoopCnt; ++i)
               {
                    char WorkBuffer[BufferSize];
                    memcpy(WorkBuffer, Zero, BufferSize);
                    In_Zip.read(WorkBuffer, BufferSize);
                    assert(In_Zip.good());
                    for(int j = 0; j < BufferSize; ++j)
                    {
                         assert(WorkBuffer[j] == (char)(j + 1));     // <- (1)
                    }
               }
          }
          In.close();
     }
</sample>
 
Thanks.
 

[General] bzip2stream fails with wide file streams (member Denis Dubrov, 7 Jan '05 - 9:05)
That's because wide file streams internally convert all data to multibyte characters. The solution is to use the IMBUE_NULL_CODECVT from this article: Upgrading an STL-based application to use Unicode.
 
You may also need to read the comments below to make that code more portable, especially this one:
 
http://www.codeproject.com/vcpp/stl/upgradingstlappstounicode.asp?msg=947572#xx947572xx

[Question] support for seekg/tellg? (suss greglandrum, 1 Dec '04 - 11:42)
These wrappers look really useful, thanks for providing them!
 
Do you have any plans to add support for the seekg and tellg to zipstream in the future? This would really widen the applicability of the wrappers.
 
Thanks again,
-greg
[General] zipstream cannot unzip the zipped file (suss lost in templates, 20 Oct '04 - 16:01)

after finally get the demo code running... out of the following test functions, i got following results as marked in the right side of each functions.
 
zlib_stream::test_buffer_to_buffer();------------>OK
zlib_stream::test_wbuffer_to_wbuffer();------------>OK
zlib_stream::test_string_string();------------>OK
zlib_stream::test_wstring_wstring();------------>OK
zlib_stream::test_file_file(false);------------>NOT OK...
zlib_stream::test_file_file(true);------------>NOT OK...
 
In test_file_file(), compressing seems to be working with compressed output file generated... but when unzipping the zipped file, the size of the unzipped file is 0, i.e. it is not reading data from the zipped file though i can see that zipped file is not empty... :confused:
plz help...

 
lost in templates
[Question] How to compress Huge Memory Buffer ? (member WanaDev, 11 Feb '04 - 17:15)
Hi,
 
I try to compress a buffer > to the default_buffer_size defined in zipstream.hpp ...
 
If you increase the constant : const size_t n=
in the test : zlib_stream::test_buffer_to_buffer(),
it fails !

Do you have a solution to compress huge memory buffer without risking a 'stack overflaw'
when increasing the default_buffer_size ?
 
I'm particularly interested by compressing huge in-memory buffer !
 
Thank you again to provide this well designed code to the dev' community.
 
Silver
 

[General] Zlib Wrapper and GZip generated buffer on IIS (suss Anonymous, 10 Feb '04 - 16:39)
Hi,
 
I try to put in place an 'in house' solution using IIS 'standard' Compression on server Side and zlib decompression on the client side.
 
I'm not running IE, so I don't use URLMon and I try to uncompress data received from my IIS Server using
Wininet directly connected to the zlib wrapper ...
 
Currently, I can successfully decompress small buffers < 30K, but big buffers causes the gzread routine to fail (returns -1).
 
If you have any Idea, it will be a great help for me.
 
I'm getting frustrated with this !
 
Regards
 
-Silver-
[General] Compiling with VC6 (suss Marlon Gaspar, 26 Jan '04 - 8:43)
Hello All... Sorry for the novice question, but i get an error while compiling the library with VC6:
 
in zipstream.ipp : int_type is not a member of basic_zipstreambuf<...>
 
can anybody help me? Everything worked ok with VC7, but i need to hand the project with VC6 .. :(
 
thanks!
[General] Re: Compiling with VC6 (member Vattila, 13 Mar '04 - 6:04)
I am also interested in a VC6 compatible version. So if there's anyone, please reply if you have adapted the source to compile on VC6.
[General] Re: Compiling with VC6 (member Vattila, 15 Mar '04 - 16:10)
Ok, after doing some changes and improvements, I am now able to compile ZipStream with VC6 SP5. If anyone is interested in my modifications, let me know.
[General] Re: Compiling with VC6 (member Jonathan de Halleux, 15 Mar '04 - 20:36)
Yes, send them to me. I'll update the article.
 
Jonathan de Halleux.

www.dotnetwiki.org

GUnit

[General] Re: Compiling with VC6 (member Patrik Müller, 23 Aug '04 - 23:46)
Hi,
 
will you update the article for compiling it with VC6? Or at least post an answer how to compile it...
 
Regards,
 
Patrik
[General] Re: Compiling with VC6 (suss Anonymous, 24 Aug '04 - 22:13)
Here's the how-to:
 
- Move all the problematic method back inside the class declaration (ipp->hpp),
- there seems to be some problems with std::min, so replace it by your own min template function
 
That's it :)
 
I'll post an update soon.


Last Updated 3 Oct 2003
Article Copyright 2003 by Jonathan de Halleux
Everything else Copyright © CodeProject, 1999-2013