ZipStorer - A Pure C# Class to Store Files in Zip

By Jaime Olivares, 15 Mar 2010
 
Screenshot - Zip_Storer

Introduction

There are many techniques to produce Zip files in a .NET 2.0 environment, like the following:

  • Using the java.util.zip namespace (from the Visual J# library)
  • Invoking Shell API features
  • Using a third-party .NET library
  • Wrapping and marshalling a non-.NET library
  • Invoking a compression tool at command-line

I have tested most of them, and each has its pros and cons, but sometimes I just needed a tiny library to store files in a Zip with basic compression or plain storing. So I built my own minimalistic class to create Zip files and to store and retrieve files in them, at first with uncompressed storing only and now with the Deflate algorithm as well. No other compression methods are supported.

Moreover, notice that the .NET 3.0 and 3.5 Frameworks come with the ZipPackage class, but it is not available to .NET 2.0 or Compact Framework applications. A further restriction of ZipPackage is that you cannot avoid generating an extra file inside the package, named [Content_Types].xml.

Background

The following diagram depicts the structure of a Zip file. You will notice it is a bit redundant because of its double directory approach (local and central). This is because the format was designed so that archives can be created on sequential-access-only devices.

Screenshot - Zip_Structure.png

The contents of each section can vary depending on the Operating System and hardware platform. The original PKWare specification has been included with this article.
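
As a rough illustration of the layout described above, the sketch below locates the "end of central directory" record that closes every Zip file and reads the entry count from it. It is a minimal sketch and not code taken from ZipStorer: the class and method names are made up for this example, the field offsets follow the PKWare specification, and the code assumes a little-endian host (BitConverter reads in the machine's byte order).

```csharp
using System;

public static class ZipLayoutSketch
{
    // Signature of the end-of-central-directory record: bytes 'P', 'K', 5, 6.
    private const uint EndOfCentralDirSignature = 0x06054b50;

    // Scans backwards for the end record and returns its offset, or -1 if
    // it is not found. The record is at least 22 bytes long.
    public static int FindEndOfCentralDir(byte[] zip)
    {
        for (int i = zip.Length - 22; i >= 0; i--)
        {
            if (BitConverter.ToUInt32(zip, i) == EndOfCentralDirSignature)
                return i;
        }
        return -1;
    }

    // The total number of central directory entries is a 2-byte field
    // located 10 bytes into the end record.
    public static int ReadEntryCount(byte[] zip, int endRecordOffset)
    {
        return BitConverter.ToUInt16(zip, endRecordOffset + 10);
    }

    public static void Main()
    {
        // An empty Zip archive consists of just the 22-byte end record.
        byte[] emptyZip = new byte[22];
        emptyZip[0] = 0x50; emptyZip[1] = 0x4b; emptyZip[2] = 0x05; emptyZip[3] = 0x06;

        int offset = FindEndOfCentralDir(emptyZip);
        Console.WriteLine("End record at offset {0}, entries: {1}",
            offset, ReadEntryCount(emptyZip, offset));
    }
}
```

Scanning from the end matters because the central directory is written last, which is what lets a writer emit the archive in a single sequential pass.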

Using the Code

ZipStorer is the only class needed to create the zip file. It contains a nested structure (ZipFileEntry) that holds each directory entry. The class has been declared inside the System.IO namespace. The following diagram describes all the ZipStorer class members:

Class_Diagram.png

There is no default constructor. There are two ways to construct a new ZipStorer instance, depending on specific needs: use either the Create() or the Open() static method. To create a new Zip file, use the Create() method like this:

ZipStorer zip = ZipStorer.Create(filename, comment);  // file-oriented version
ZipStorer zip = ZipStorer.Create(stream, comment);  // stream-oriented version

You must specify the full path of the new zip file, or pass a valid stream, and you can optionally add a comment. To open an existing zip file for appending, use the Open() method, like the following:

ZipStorer zip = ZipStorer.Open(filename, fileaccess);  // file-oriented version
ZipStorer zip = ZipStorer.Open(stream, fileaccess);  // stream-oriented version

Here, fileaccess is a value of the System.IO.FileAccess enumeration. Also, since ZipStorer now implements the IDisposable interface, the using keyword can be used to ensure proper disposal of the storage resource:

using (ZipStorer zip = ZipStorer.Create(filename, comment))
{
    // some operations with zip object
    //
}   // automatic close operation here

To add files into an opened zip storage, there are two available methods:

public void AddFile(ZipStorer.Compression _method, string _pathname, string _filenameInZip,
    string _comment);
public void AddStream(ZipStorer.Compression _method, string _filenameInZip, Stream _source,
    DateTime _modTime, string _comment);

The first method allows you to add an existing file to the storage. The first argument is the compression method; it can be either the Store or the Deflate enum value. The second argument is the physical path name, the third one sets the path and file name to be stored in the Zip, and the last argument inserts a comment in the storage. Notice that the folder path in the _pathname argument is not saved in the Zip file. Use the _filenameInZip argument instead to specify the folder path and filename. It can be written with either slashes or backslashes.
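
To make the slash handling concrete, here is a minimal sketch of how a Windows-style path could be turned into the forward-slash, relative form that entry names take inside a Zip file. NormalizeNameInZip is a hypothetical helper written for this example. It is not part of the ZipStorer API shown in this article, and the class may do this differently internally.

```csharp
using System;

public static class ZipPathSketch
{
    // Hypothetical helper: converts a Windows-style path into the
    // forward-slash, relative form used for names inside a Zip archive.
    public static string NormalizeNameInZip(string name)
    {
        string result = name.Replace('\\', '/');

        // Drop a drive prefix such as "C:" if present.
        int colon = result.IndexOf(':');
        if (colon >= 0)
            result = result.Substring(colon + 1);

        // Entry names are relative, so strip any leading slashes.
        return result.TrimStart('/');
    }

    public static void Main()
    {
        Console.WriteLine(NormalizeNameInZip(@"docs\images\logo.png"));  // docs/images/logo.png
        Console.WriteLine(NormalizeNameInZip(@"C:\data\sample.jpg"));    // data/sample.jpg
    }
}
```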

The second method allows you to add data from any kind of stream object derived from the System.IO.Stream class. Internally, the first method opens a FileStream and calls the second method.
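
For readers curious about what the Deflate option involves at the stream level, the following sketch compresses and restores a buffer with the framework's own System.IO.Compression.DeflateStream, available since .NET 2.0. That ZipStorer relies on this class is an assumption based on the discussion below the article, not on the listings shown here. The class and method names are illustrative only.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

public static class DeflateSketch
{
    // Compresses a buffer into raw Deflate data (no Zip headers).
    public static byte[] Compress(byte[] data)
    {
        MemoryStream output = new MemoryStream();
        // leaveOpen = true keeps 'output' usable after the DeflateStream is disposed.
        using (DeflateStream deflate = new DeflateStream(output, CompressionMode.Compress, true))
        {
            deflate.Write(data, 0, data.Length);
        }   // disposing flushes the remaining compressed bytes
        return output.ToArray();
    }

    // Restores the original bytes from raw Deflate data.
    public static byte[] Decompress(byte[] compressed)
    {
        MemoryStream output = new MemoryStream();
        using (DeflateStream inflate = new DeflateStream(
            new MemoryStream(compressed), CompressionMode.Decompress))
        {
            byte[] buffer = new byte[4096];
            int read;
            while ((read = inflate.Read(buffer, 0, buffer.Length)) > 0)
                output.Write(buffer, 0, read);
        }
        return output.ToArray();
    }

    public static void Main()
    {
        byte[] original = Encoding.UTF8.GetBytes(new string('a', 1000));
        byte[] packed = Compress(original);
        byte[] restored = Decompress(packed);
        Console.WriteLine("{0} bytes -> {1} bytes -> {2} bytes",
            original.Length, packed.Length, restored.Length);
    }
}
```

With the Store method, by contrast, the bytes are written into the archive unchanged.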

Finally, you have to close the storage with the Close() method. This also saves the central directory information. Alternatively, you can use the Dispose() method.

Sample Application

The provided sample application will ask for files and store the path names in a ListBox, along with the operation type: creating or appending, and compression method. Once the Proceed button is pressed, the following code snippet will be executed:

ZipStorer zip;

if (this.RadioCreate.Checked)
    // Creates a new zip file
    zip = ZipStorer.Create(TextStorage.Text, "Generated by ZipStorer class");
else
    // Opens an existing zip file for appending
    zip = ZipStorer.Open(TextStorage.Text, FileAccess.Write);

// Stores all the files into the zip file
foreach (string path in listBox1.Items)
{
    zip.AddFile(this.checkCompress.Checked ?
        ZipStorer.Compression.Deflate : ZipStorer.Compression.Store,
        path, Path.GetFileName(path), "");
}

// Creates a memory stream with text
MemoryStream readme = new MemoryStream(
    System.Text.Encoding.UTF8.GetBytes(string.Format(
        "{0}\r\nThis file has been {1} using the ZipStorer class, by Jaime Olivares.",
        DateTime.Now, this.RadioCreate.Checked ? "created" : "appended")));

// Stores a new file directly from the stream
zip.AddStream(ZipStorer.Compression.Store, "readme.txt", readme, DateTime.Now, "Please read");
readme.Close();

// Updates and closes the zip file
zip.Close();

This code snippet shows how to add both physical files and a little readme text from a memory stream.

Notice that the sample has been produced with Visual Studio 2008. The solution cannot be loaded directly with Visual Studio 2005, but a new solution can be created and the project file attached to it without problems.

Extracting Stored Files

To extract a file, first read the zip directory with the ReadCentralDir() method and then call the ExtractFile() method, as in the following minimal sample code:

// Open an existing zip file for reading
ZipStorer zip = ZipStorer.Open(@"c:\data\sample.zip", FileAccess.Read);

// Read the central directory collection
List<ZipStorer.ZipFileEntry> dir = zip.ReadCentralDir();

// Look for the desired file
foreach (ZipStorer.ZipFileEntry entry in dir)
{
    if (Path.GetFileName(entry.FilenameInZip) == "sample.jpg")
    {
        // File found, extract it
        zip.ExtractFile(entry, @"c:\data\sample.jpg");
        break;
    }
}
zip.Close();

Removal of Entries

Removing entries from a zip file is a resource-consuming task. The simplest way is to copy all non-removed files into a new zip storage. The RemoveEntries() static method does exactly this and constructs the ZipStorer object again. For the sake of efficiency, RemoveEntries() accepts many entry references in a single call, as in the following example:

List<ZipStorer.ZipFileEntry> removeList = new List<ZipStorer.ZipFileEntry>();

foreach (object sel in listBox4.SelectedItems)
{
    removeList.Add((ZipStorer.ZipFileEntry)sel);
}

ZipStorer.RemoveEntries(ref zip, removeList);

Files or Streams?

The current release of ZipStorer supports both files and streams for creating and opening a zip storage. Several methods are overloaded for this dual support. The advantage of the file-oriented methods is simplicity, since they open or create files internally. The stream-oriented methods, on the other hand, are more flexible because they allow zip storages to be kept in streams other than files. The file-oriented methods internally invoke the equivalent stream-oriented methods. Notice that not every stream will work, because the library requires streams that support random access (CanSeek = true). The RemoveEntries method works only if the zip storage is a file.

// File-oriented methods:
public static ZipStorer Create(string _filename, string _comment)
public static ZipStorer Open(string _filename, FileAccess _access)
public void AddFile(Compression _method,
    string _pathname, string _filenameInZip, string _comment)
public bool ExtractFile(ZipFileEntry _zfe, string _filename)
public static bool RemoveEntries
    (ref ZipStorer _zip, List<ZipFileEntry> _zfes)  // No stream-oriented equivalent

// Stream-oriented methods:
public static ZipStorer Create(Stream _stream, string _comment)
public static ZipStorer Open(Stream _stream, FileAccess _access)
public void AddStream(Compression _method,
    string _filenameInZip, Stream _source, DateTime _modTime, string _comment)
public bool ExtractFile(ZipFileEntry _zfe, Stream _stream)
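
Because of the CanSeek requirement, a forward-only stream (a network or decompression stream, for example) has to be buffered before it can back a zip storage. The sketch below shows one possible way to do that with framework classes only. EnsureSeekable and ForwardOnlyStream are names invented for this example and are not part of ZipStorer. Buffering in memory is only reasonable for data that fits comfortably in RAM.

```csharp
using System;
using System.IO;

public static class SeekableStreamSketch
{
    // Returns the stream itself when it is seekable; otherwise copies its
    // remaining content into a MemoryStream positioned at the start.
    public static Stream EnsureSeekable(Stream source)
    {
        if (source.CanSeek)
            return source;

        MemoryStream buffer = new MemoryStream();
        byte[] chunk = new byte[8192];
        int read;
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            buffer.Write(chunk, 0, read);
        buffer.Position = 0;
        return buffer;
    }

    // Demo-only wrapper that hides the seeking ability of another stream.
    public sealed class ForwardOnlyStream : Stream
    {
        private readonly Stream inner;
        public ForwardOnlyStream(Stream inner) { this.inner = inner; }
        public override bool CanRead { get { return true; } }
        public override bool CanSeek { get { return false; } }
        public override bool CanWrite { get { return false; } }
        public override long Length { get { throw new NotSupportedException(); } }
        public override long Position
        {
            get { throw new NotSupportedException(); }
            set { throw new NotSupportedException(); }
        }
        public override void Flush() { }
        public override int Read(byte[] buffer, int offset, int count)
        {
            return inner.Read(buffer, offset, count);
        }
        public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
        public override void SetLength(long value) { throw new NotSupportedException(); }
        public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    }

    public static void Main()
    {
        Stream forwardOnly = new ForwardOnlyStream(new MemoryStream(new byte[] { 1, 2, 3 }));
        Stream usable = EnsureSeekable(forwardOnly);
        Console.WriteLine("CanSeek: {0}, Length: {1}", usable.CanSeek, usable.Length);
    }
}
```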

Filename Encoding

Traditionally, the ZIP format supported only the DOS encoding system (a.k.a. IBM Code Page 437) for filenames in header records, which is a serious limitation for non-Western and even some Western characters. Since 2007, the ZIP format specification has supported Unicode's UTF-8 encoding system.

The ZipStorer class detects UTF-8 encoding by reading the proper flag in each file's header information. To force filenames to be encoded as UTF-8, set the EncodeUTF8 member of the ZipStorer class to true. All filenames added afterwards will be encoded as UTF-8. Notice that this does not affect stored file contents at all. Also be aware that Windows Explorer's embedded Zip facility does not handle UTF-8 filenames well, whereas WinZip and WinRAR do.
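
The flag in question is bit 11 (0x0800) of the "general purpose bit flag" field defined by the PKWare specification. The following sketch shows how a reader could choose the decoder from that flag. It is an illustration only, with made-up names, and does not reproduce ZipStorer's own code. The legacy encoding is passed in by the caller (on the full .NET Framework that would typically be Encoding.GetEncoding(437)), and ASCII stands in for it in the demo so the sample runs anywhere.

```csharp
using System;
using System.Text;

public static class ZipNameEncodingSketch
{
    // Bit 11 of the general purpose bit flag marks UTF-8 names and comments.
    public const ushort Utf8Flag = 0x0800;

    public static bool IsUtf8(ushort generalPurposeFlags)
    {
        return (generalPurposeFlags & Utf8Flag) != 0;
    }

    // Decodes raw filename bytes with UTF-8 when the flag is set,
    // otherwise with the legacy encoding supplied by the caller.
    public static string DecodeName(byte[] rawName, ushort flags, Encoding legacy)
    {
        Encoding encoding = IsUtf8(flags) ? Encoding.UTF8 : legacy;
        return encoding.GetString(rawName);
    }

    public static void Main()
    {
        byte[] raw = Encoding.UTF8.GetBytes("caf\u00e9.txt");

        // Flag set: the name is decoded as UTF-8 and survives intact.
        Console.WriteLine(DecodeName(raw, Utf8Flag, Encoding.ASCII) == "caf\u00e9.txt");  // True

        // Flag clear: the legacy decoder cannot represent the accented letter.
        Console.WriteLine(DecodeName(raw, 0, Encoding.ASCII) == "caf\u00e9.txt");         // False
    }
}
```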

Compatibility with ePUB & OCF

The ZipStorer library has been adjusted to comply with Open Container Format Specification (OCF), one of the standards required to produce ePUB Digital Books. There are some specific requirements to fulfill the OCF specification:

  • The storage shall have the .epub extension instead of .zip
  • The first file in the storage must be non-compressed and shall be called mimetype, containing the string application/epub+zip
  • Do not use comments in zip file entries or zip storage
  • The filenames shall be encoded in UTF8. Set the storage field EncodeUTF8 to true
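
The second requirement can be verified directly on the bytes of a finished archive. The sketch below inspects the first local file header and checks that the entry is named mimetype and uses method 0 (stored). It is a hedged example with invented names, not part of ZipStorer. The field offsets come from the PKWare specification, and a little-endian host is assumed. BuildLocalHeader exists only to produce demo input.

```csharp
using System;
using System.Text;

public static class EpubCheckSketch
{
    // Local file header: signature at 0, compression method at 8,
    // name length at 26, name starting at 30.
    public static bool FirstEntryIsStoredMimetype(byte[] zip)
    {
        if (zip.Length < 30 || BitConverter.ToUInt32(zip, 0) != 0x04034b50)
            return false;

        ushort method = BitConverter.ToUInt16(zip, 8);
        ushort nameLength = BitConverter.ToUInt16(zip, 26);
        if (zip.Length < 30 + nameLength)
            return false;

        string name = Encoding.ASCII.GetString(zip, 30, nameLength);
        return method == 0 && name == "mimetype";
    }

    // Demo helper: builds a bare local file header (no file data) for a
    // short ASCII name and the given compression method.
    public static byte[] BuildLocalHeader(string name, ushort method)
    {
        byte[] nameBytes = Encoding.ASCII.GetBytes(name);
        byte[] header = new byte[30 + nameBytes.Length];
        header[0] = 0x50; header[1] = 0x4b; header[2] = 0x03; header[3] = 0x04;
        header[8] = (byte)(method & 0xff);
        header[9] = (byte)(method >> 8);
        header[26] = (byte)(nameBytes.Length & 0xff);
        header[27] = (byte)(nameBytes.Length >> 8);
        Array.Copy(nameBytes, 0, header, 30, nameBytes.Length);
        return header;
    }

    public static void Main()
    {
        Console.WriteLine(FirstEntryIsStoredMimetype(BuildLocalHeader("mimetype", 0)));  // True
        Console.WriteLine(FirstEntryIsStoredMimetype(BuildLocalHeader("mimetype", 8)));  // False (deflated)
    }
}
```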

Advantages and Usage

ZipStorer has the following advantages:

  • It is a short and monolithic C# class that can be embedded as source code in any project (1 source file of 33K, 700+ lines)
  • No external libraries, no extra DLLs in application deployments
  • No Interop calls, which increases portability, possibly even to Mono
  • Can also be implemented with .NET Compact Framework
  • Fast storing and extracting, because the code is simple and short
  • UTF8 Encoding support and ePUB compatibility

To implement this class into your own project, just add the ZipStorer.cs class file and start using it without any restriction. More recent updates can be found at my CodePlex page (zipstorer.codeplex.com).

History

  • November 23rd, 2007: First version
  • June 1st, 2008: Added append and extraction features
  • June 20th, 2008: Corrected some bugs in extraction portion
  • August 3rd, 2008: Corrected more bugs in extraction portion
  • October 3rd, 2008: Improved demo application with extraction code sample
  • August 22nd, 2009: Added compression capability
  • October 3rd, 2009: Added removal capability and other minor improvements
  • February 21st, 2010: Improved support to streams, and ePub compatibility
  • March 13th, 2010: Improved UTF-8 support and timestamp handling

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Jaime Olivares
Architect Freelance (jaimeolivares.com)
Peru


Computer Electronics professional, Software Architect and senior Windows C++ and C# developer with experience in many other programming languages, platforms and application areas, including communications, simulation systems, GIS, 3D graphics and mobile platforms.
Also experienced in the development of electronic interfaces, especially for military applications.
Currently working intensively with Visual C# 2010 and TFS.
Can be reached at jaimeolivares.com


Comments and Discussions

[General] Re: Thanks! - Jaime Olivares (member), 7 Jun '10 - 3:41
Hi Karl,
I haven't found a way to modify DeflateStream's behaviour. I know it is tuned for text compression, so other file formats won't get as good a compression ratio as with other compression tools.
Best regards,
Jaime.

[General] Windows, WinZip and others behavior - Kazna4ey (member), 15 Mar '10 - 14:04
"Also be aware that Windows Explorer's embedded Zip format facility does not recognize well the UTF-8 encoding system, like it does WinZip or WinRAR."
 
I did some research on which encoding Windows and the other main archivers use, and I found out that they use the machine's default OEM codepage.
You can get it with the following WinAPI function:
 
[DllImport("kernel32.dll", SetLastError = true)]
public static extern int GetOEMCP();
 
and then
 
int oemCP = NativeMethods.GetOEMCP();
var encoding = Encoding.GetEncoding(oemCP);
 
now we can specify this encoding to be used for filenames.
 
As I already said this behavior is default in Windows, WinZip, 7Zip and other main archivers.
 
If you decide to call the GetOEMCP() WinAPI function, don't forget about platform independence. I don't know if your class can run on Mono, so just in case...
 
If I were you, I would at least document this default OEM CP problem. Well, you may see no problem here, but for some languages it's a big problem. I am from Russia, and unfortunately most open-source zip archivers don't use the machine's default OEM CP, which results in unreadable names when the archive is opened with Windows Explorer.
So, if I were you, I would consider letting the user choose the encoding, maybe like this:
 
public void SetFilenameEncoding(FilenameEncoding filenameEncoding);
...
public enum FilenameEncoding
{
CP437, // default
UTF8,
MachineDefaultOemCP
}
 
Good luck,
With best regards,
Pavel.

[General] Thanks Jaime! - HiAle (member), 12 Mar '10 - 16:34
Hey Jaime,
 
Thanks a lot for your great class!
I made a very small change. I added
uint crc32 = BitConverter.ToUInt32(CentralDirImage, pointer + 16);
[...]
zfe.Crc32 = crc32;
at line 288 because I need the CRC without extracting the file first.
 
Have a nice day!

[General] Re: Thanks Jaime! - Jaime Olivares (member), 13 Mar '10 - 13:44
Hi HiAle,
I will add this to the next version of ZipStorer, to be released soon.
Best regards,
Jaime.

[Question] Timestamps - RBJensen (member), 4 Mar '10 - 2:58
Hi.
 
I am tempted to switch to this class instead of sharpziplib, but there are some lingering issues that must be fixed first.
The first one is support for non-English characters - so far it seems to work, but I'll do more testing on different PCs.
Second: I need the extracted files to be properly "timestamped" - could you consider adding that feature some day?
 
But thanks for a very interesting little class.

[Answer] Re: Timestamps - Jaime Olivares (member), 13 Mar '10 - 15:38
Hi RB,
I have added the timestamp feature.
About non-English characters, European characters should work well at this moment, and I am also improving the UTF8 support.
Both will be released in next version of ZipStorer.
Best regards,
Jaime.

[General] Re: Timestamps - Jaime Olivares (member), 15 Mar '10 - 12:38
Hi RB,
Please check this new release, which includes the timestamping feature, and tell me.
I have improved UTF8 support as well (new section in article). Notice that UTF8 filenames will not be handled well by Windows' native zip support.
Please tell me about your findings in case you test it again.
Thanks in advance,
Best regards,
Jaime.

[News] MODIFICATION REQUIRED FOR UTF8 ENCODING - Jaime Olivares (member), 1 Mar '10 - 12:49
Hello all,
If you are intending to use UTF8 encoding, please come back to this page soon because I will release a patch of the ZipStorer library.
Other features are working OK.
Thanks all.
Best regards,
Jaime.

[General] Re: MODIFICATION REQUIRED FOR UTF8 ENCODING - Jaime Olivares (member), 13 Mar '10 - 18:48
Please have a look at the new release at zipstorer.codeplex.com
Soon to be released here at CodeProject.
Best regards,
Jaime.

[General] ZipStorer - A Pure C# Class to Store Files in Zip - Fregate (member), 1 Mar '10 - 12:25
Well done - I hope you also realize that the .NET Framework has had native compression/decompression support since 2.0?

Last Updated 15 Mar 2010
Article Copyright 2007 by Jaime Olivares
Everything else Copyright © CodeProject, 1999-2013