Zip bytes in memory and unzip file into bytes buffer

7 Sep 2004
An article on how to zip bytes in memory

Introduction

I am currently writing a web server that runs under MS Windows. For performance, I need to zip .htm files in memory and return the gzipped content to the client. After googling for a while I couldn't find what I needed, though maybe I missed it. If you don't want to code the Huffman and LZ77 algorithms yourself from scratch, this is a shortcut for you. All I have done is extract the deflate code into a single header file, 'deflate.h', omit everything that could be omitted, and inline all the functions. Thankfully, there were no cyclic-reference problems in the end.

The second piece was done last year and can be used to unzip a file. I likewise extracted all the source related to 'inflate' into one header file, 'inflate.h'. I hope you won't call me a "C++ scripter" after using them :-)

With these headers, you don't need to link your project against a gzip library. gzip is a large library; for ease of reading I omitted the copyright statements, but I hope you will include them in your own copy!

Using deflate.h to zip bytes buffer

Use the compress function as follows:

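Byte*    outbuf = (Byte*)ALLOC(OUTBUFLEN);  // must be large enough for the whole gzip stream
Byte*    inbuf  = (Byte*)ALLOC(INBUFLEN);
int      outlen = 0;
int      len    = 0;

// read the file to compress into memory
FILE* in = fopen("aaa.htm","rb");
if (in != NULL && inbuf != NULL && outbuf != NULL)
{
  len = (int)fread(inbuf,sizeof(Byte),INBUFLEN,in);

  //for(int a=0;a<1000;a++) // for test purpose.
    outlen = compress(inbuf,len, outbuf);   // returns -1 on failure
}
if (in != NULL)
  fclose(in);

// write the gzipped bytes to disk so they can be checked with any gzip tool
if (outlen > 0)
{
  FILE* out = fopen("bbb.gz","wb");
  if (out != NULL)
  {
    fwrite(outbuf,sizeof(Byte),outlen,out);
    fclose(out);
  }
}

TRYFREE(outbuf);
TRYFREE(inbuf);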
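
The compress function writes into outbuf without checking its capacity, so OUTBUFLEN has to leave room for incompressible input plus the gzip framing. A minimal sketch of one way to pick a size follows. The helper name gzip_out_bound is hypothetical, and the formula is an assumption: a conservative estimate modeled on the worst-case bound used by some zlib versions in deflateBound, not something taken from 'deflate.h'.

```cpp
#include <cstddef>

// Conservative size estimate for a gzip stream holding `len` input bytes.
// ASSUMPTION: the deflate overhead term is a heuristic borrowed from the
// conservative bound in some zlib releases; it is not verified against the
// extracted 'deflate.h'.
inline std::size_t gzip_out_bound(std::size_t len)
{
    const std::size_t header  = 10;  // fixed gzip header
    const std::size_t trailer = 8;   // CRC-32 + input size, 4 bytes each
    return len + ((len + 7) >> 3) + ((len + 63) >> 6) + 5 + header + trailer;
}
```

Under that assumption, allocating outbuf with gzip_out_bound(INBUFLEN) bytes rather than a fixed OUTBUFLEN would keep the output from overrunning the buffer.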

As you can see, the input can be any byte buffer, for example the result of some other computation. What I added to the gzip code is the "compress" function at the end of the 'deflate.h' header file, which looks like the following:

int compress(Byte* inbuf,int len, Byte* outbuf)
{
  int err    = Z_OK;
  z_stream  strm;
  uLong crc  = 0;     /* crc32 of uncompressed data */
  int done  = 0;
  int  outlen  = 0;

  Byte*    nextout; /* temporary output buffer */
  strm.zalloc  = (alloc_func)0;
  strm.zfree  = (free_func)0;
  strm.opaque  = (Byte*)0;
  strm.next_in  = Z_NULL;
  strm.next_out = Z_NULL;
  strm.avail_in = strm.avail_out = 0;
  
  err = deflateInit2(&strm, Z_BEST_COMPRESSION, Z_DEFLATED, 
    -MAX_WBITS, DEF_MEM_LEVEL, Z_DEFAULT_STRATEGY);
  /* windowBits is passed < 0 to suppress zlib header */
  if (err != Z_OK)
    return -1;

  nextout = (Byte*)ALLOC(Z_BUFSIZE);
  if (nextout == Z_NULL) {
    deflateEnd(&strm);
    return -1;
  }
  strm.next_out = nextout;
  strm.avail_out = Z_BUFSIZE;

  // write gzip header: magic (2 bytes), method, flags,
  // mtime (4 bytes), extra flags, OS
  outbuf[outlen++] = 0x1f;
  outbuf[outlen++] = 0x8b;
  outbuf[outlen++] = Z_DEFLATED;
  for(;outlen<9;)
    outbuf[outlen++] = 0;
  outbuf[outlen++] = 0xff;  // OS = unknown

  strm.next_in  = (Byte*)inbuf;
  strm.avail_in = len;

  while (strm.avail_in != 0)
  {
    if (strm.avail_out == 0) {
      // temporary buffer is full: move it to outbuf and reuse it
      zmemcpy(outbuf+outlen, nextout, Z_BUFSIZE);
      outlen += Z_BUFSIZE;
      strm.next_out = nextout;
      strm.avail_out = Z_BUFSIZE;
    }
    err = deflate(&strm, Z_NO_FLUSH);
    if (err != Z_OK) break;
  }

  crc = crc32(crc, (const Byte *)inbuf, len);
  if (strm.avail_in != 0) {
    /* deflate failed before consuming all the input */
    deflateEnd(&strm);
    TRYFREE(nextout);
    return -1;
  }
  
  /* flush whatever is pending, then finish the stream */
  for (;;) {
    len = Z_BUFSIZE - strm.avail_out;

    if (len != 0) {
      zmemcpy(outbuf+outlen, nextout, len);
      
      outlen += len;

      strm.next_out = nextout;
      strm.avail_out = Z_BUFSIZE;
    }
    if (done) 
      break;
      
    err = deflate(&strm, Z_FINISH);
    done = (strm.avail_out != 0 || err == Z_STREAM_END);
  }

  // write gzip trailer: crc32 and input length, both little-endian.
  // Shift and mask before the cast; a cast binds tighter than >> and &.
  outbuf[outlen++] = (Byte)(crc & 0xff);
  outbuf[outlen++] = (Byte)((crc >> 8) & 0xff);
  outbuf[outlen++] = (Byte)((crc >> 16) & 0xff);
  outbuf[outlen++] = (Byte)((crc >> 24) & 0xff);
  
  outbuf[outlen++] = (Byte)(strm.total_in & 0xff);
  outbuf[outlen++] = (Byte)((strm.total_in >> 8) & 0xff);
  outbuf[outlen++] = (Byte)((strm.total_in >> 16) & 0xff);
  outbuf[outlen++] = (Byte)((strm.total_in >> 24) & 0xff);

  if (strm.state != Z_NULL) 
    err = deflateEnd(&strm);

  TRYFREE(nextout);
  return outlen;
}
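
The function above frames the raw deflate data as a gzip stream: a 10-byte header (magic bytes 0x1f 0x8b, method 8, then flags, time stamp, extra flags and OS) and an 8-byte trailer holding the CRC-32 and the input length, each stored least significant byte first. The sketch below shows that byte order in isolation. The helper names put_le32, get_le32 and looks_like_gzip are hypothetical and not part of 'deflate.h'. Note that in C and C++ a cast binds tighter than >> and &, so the shift and mask must happen before the value is narrowed to a byte.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append a 32-bit value least significant byte first, as the gzip trailer
// requires. Shift and mask first, then narrow to a byte.
inline void put_le32(std::vector<std::uint8_t>& out, std::uint32_t v)
{
    for (int shift = 0; shift < 32; shift += 8)
        out.push_back(static_cast<std::uint8_t>((v >> shift) & 0xffu));
}

// Read a little-endian 32-bit value starting at p.
inline std::uint32_t get_le32(const std::uint8_t* p)
{
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
}

// Quick sanity check of a gzip buffer: magic bytes, deflate method (8),
// and the input length stored in the last four bytes.
inline bool looks_like_gzip(const std::uint8_t* buf, std::size_t n,
                            std::uint32_t expected_size)
{
    if (n < 18) return false;  // 10-byte header + 8-byte trailer
    if (buf[0] != 0x1f || buf[1] != 0x8b || buf[2] != 8) return false;
    return get_le32(buf + n - 4) == expected_size;
}
```

A check like looks_like_gzip(outbuf, outlen, len) after compressing can catch a malformed trailer before the bytes are sent to a client. It does not verify the CRC or the compressed data itself.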

If you dislike typedefs such as Byte in your project, you can undefine them at the end of the header file.

Using inflate.h to unzip file into bytes buffer

To unzip a file, include 'inflate.h' in your project and use it like the following:

  LUFILE*    zf = NULL;
  DWORD e;    // for debug use
  zf = lufopen(bstrTheme,0,2,&e);
  if (zf==NULL) 
    return FALSE;

  // declare everything before the first goto so no initialization is skipped
  long nlen = 0;
  int err = UNZ_OK;
  bool haderr = false;
  unzFile uf = NULL;
  BYTE* buf = NULL;
  char szCurFileName[UNZ_MAXFILENAMEINZIP+1];

  ATLTRY(buf = new BYTE[MAX_LEN]);
  if (!buf)
    goto LError;

  uf = unzOpenInternal(zf); // there is a gotofirstfile in the end
  if(!uf) 
    goto LError;

  err = unzGoToFirstFile(uf);
  while (err == UNZ_OK)
  {
    unzGetCurrentFileInfo(uf,NULL,szCurFileName,
           sizeof(szCurFileName)-1,NULL,0,NULL,0);

    if (unzOpenCurrentFile(uf) != UNZ_OK)
      break;
    haderr = false;
    nlen = 0;
    // read in chunks, appending until the file ends or buf is full
    while (nlen < MAX_LEN)
    { 
      int res = unzReadCurrentFile(uf,buf+nlen,MAX_LEN-nlen);
      if (res<0) 
      {
        haderr=true; 
        break;
      }
      if (res==0) 
        break;
      nlen += res;
    }
    // buf now holds nlen bytes of szCurFileName, unless haderr is set
    err = unzGoToNextFile(uf);
  }

LError:
  delete [] buf;
  if(zf) 
    lufclose(zf);
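
The listing above reads into a fixed buffer of MAX_LEN bytes. When the unpacked size is not known in advance, the same read loop can feed a growing buffer instead. The following is a minimal sketch of that idea, not part of 'inflate.h': ChunkReader and read_all are hypothetical names, and the reader is assumed to follow the convention the loop above relies on (bytes read, 0 at end of data, negative on error).

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical reader: fills dst with up to cap bytes and returns the number
// of bytes read, 0 at end of data, or a negative value on error.
using ChunkReader = std::function<int(unsigned char* dst, unsigned cap)>;

// Drain the reader into `out`, one chunk at a time.
// Returns false if the reader reported an error.
inline bool read_all(const ChunkReader& read, std::vector<unsigned char>& out,
                     unsigned chunk = 4096)
{
    std::vector<unsigned char> tmp(chunk);
    for (;;) {
        int res = read(tmp.data(), chunk);
        if (res < 0) return false;   // read error
        if (res == 0) return true;   // end of data
        out.insert(out.end(), tmp.begin(), tmp.begin() + res);
    }
}
```

Assuming unzReadCurrentFile keeps its usual (file, buffer, length) form in 'inflate.h', it could be wrapped in a lambda such as [&](unsigned char* d, unsigned n) { return unzReadCurrentFile(uf, d, n); } and passed to read_all after unzOpenCurrentFile succeeds.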

Note: 'inflate.h' was extracted from zlib version 1.1.3, and 'deflate.h' was extracted from gzip 1.2.4.

Points of Interest

If you want a better compression ratio, you can take this as your starting point and change the configuration values, and maybe other things :-P If you succeed, please let me know. Anyway, don't ask me what LZ77 is; I know nothing about it, and I don't know who Huffman is either. Sorry for my ignorance. I imagine he must be a boring man ;-)

Conclusion

Both files have been tested using WinZip 9.0.

Comments are appreciated, thanks for your time.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

Written By
Software Developer (Senior)
China