This article briefly explains some classes that let you modify a zip archive without extracting it completely. My solution is not really beautiful, but it works. I used a mix of the existing C# wrapper code for zlib (together with minizip) and some of my own code. A better way would be to write a wrapper class for libzip, and probably the best way would be to enhance SharpZipLib so that it can modify archives. So feel free to work on that ;-).
Also note that these classes do not support advanced settings such as compression levels or archive comments, because I didn't need them. You could easily add these features, though.
Anyone who needs compression or decompression in C# would most probably use SharpZipLib from IC#Code. So did I, until now. I have been using that library to compress the data my program generates. Fortunately, the output was all of the same type, so I was able to save it in a single file. But now I have different types of data that belong to different parts of the program, so splitting the data into several files inside one archive is a good way to organize it. I wanted to modify this archive in memory, because it is rather ugly to extract it completely every time.
I searched around for a while and only found an old C# wrapper class by Gerry Shaw. I knew that SharpZipLib was no solution for me, so I downloaded the wrapper and tried to get it working. There were some bugs in it that caused weird behaviour, which vanished after I replaced all the unsafe parts with normal C# code (you have to know, Gerry's code is a little bit weird ;). Besides that, I adjusted the code heavily to my needs. I replaced his rather hard-to-handle ZipWriter classes with the ZipArchive classes, which allow a much easier way to read from and write to an archive. It kind of hides the ugliness of the unmanaged zlib functions :P.
Using the code
I have made modifying archives (hopefully) very simple. You just need to instantiate a ZipArchive class, which gives you access to an array of streams:
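ZipArchive Archive = new ZipArchive("test.zip", FileAccess.ReadWrite);
// Read the existing entry "test.txt" through its stream
byte[] Data = new byte[Archive["test.txt"].Length];
Archive["test.txt"].Read(Data, 0, Data.Length);
Console.WriteLine("Contents of 'test.txt'\n" +
    System.Text.Encoding.Default.GetString(Data));
// Write user input to another entry ("new.txt" is only a placeholder name)
Console.WriteLine("Contents of the new file:");
Data = System.Text.Encoding.Default.GetBytes(Console.ReadLine());
Archive["new.txt"].Write(Data, 0, Data.Length);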
Note that if test.zip does not exist yet, you have to open it with FileAccess.Write, and a new archive will then be created automatically. Likewise, when you write to an existing archive and the file is not in the archive yet, it will be created.
To see what is in an archive, you can simply enumerate your instance:
foreach (ZipEntry entry in Archive)
    Console.WriteLine(entry.Name); // assuming ZipEntry exposes a Name property
You can also check whether a file exists, or delete some:
Console.WriteLine("File test.txt is there");
bool success = Archive.DeleteFile("test.txt");
if (success)
    Console.WriteLine("File has been deleted");
else
    Console.WriteLine("File is not in archive");
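To make the shape of this interface a bit more tangible, here is a minimal sketch of a class that can be used in the same way: an indexer that hands out streams and creates missing entries, enumeration, and a delete that reports success. It is only a toy stand-in and not my ZipArchive class: the names ToyArchive and Exists are made up for this illustration, and the entries live in a dictionary of MemoryStreams instead of a real .zip file.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical stand-in, NOT the article's ZipArchive: entries are kept in
// memory so that the example runs without zlib/minizip.
public class ToyArchive : IEnumerable<string>
{
    private readonly Dictionary<string, MemoryStream> entries =
        new Dictionary<string, MemoryStream>();

    // Indexer: returns the stream of an entry and creates the entry on first access.
    public Stream this[string name]
    {
        get
        {
            MemoryStream stream;
            if (!entries.TryGetValue(name, out stream))
            {
                stream = new MemoryStream();
                entries[name] = stream;
            }
            stream.Position = 0; // every access starts at the beginning
            return stream;
        }
    }

    // Hypothetical name for the existence check.
    public bool Exists(string name) { return entries.ContainsKey(name); }

    // Returns false when the entry was not in the archive.
    public bool DeleteFile(string name) { return entries.Remove(name); }

    // Enumerating the archive yields the entry names.
    public IEnumerator<string> GetEnumerator() { return entries.Keys.GetEnumerator(); }
    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class ToyArchiveDemo
{
    public static void Main()
    {
        ToyArchive archive = new ToyArchive();

        byte[] data = Encoding.UTF8.GetBytes("hello");
        archive["test.txt"].Write(data, 0, data.Length); // the entry is created here

        byte[] back = new byte[archive["test.txt"].Length];
        archive["test.txt"].Read(back, 0, back.Length);
        Console.WriteLine(Encoding.UTF8.GetString(back)); // hello

        foreach (string name in archive)
            Console.WriteLine(name);                       // test.txt

        Console.WriteLine(archive.DeleteFile("test.txt")); // True
        Console.WriteLine(archive.DeleteFile("test.txt")); // False
    }
}
```

The toy does not truncate an entry when you overwrite it with shorter data, so it really only mirrors how the calls look from the outside.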
Limitations
- Can only create/modify .zip archives.
- Hard-coded settings (comment/compression level/etc.).
- No encryption.
- Most probably limited to 2 GB archives or even less.
Points of interest
The unlovely part of zlib together with minizip is that you can't open an archive for reading and writing simultaneously. I hid that fact in my classes and exposed only the Read and Write functions. Behind the scenes, my class closes and reopens the archive whenever you switch from a read to a write function or vice versa. Second, zlib/minizip offer no way to modify or delete files. But there is a link on the minizip homepage to a small C++ example by Ivan A. Krestinin that shows how to delete files. I just converted that function to C# and added it to my ZipArchive class. So, whenever you write to an existing entry, the old file gets deleted first!
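One common way to delete a file from a zip archive is to copy every entry except the unwanted one into a new archive. The following is a minimal sketch of that idea and not the converted function from my class: it assumes .NET's built-in System.IO.Compression classes (whose ZipArchive is a different class from mine), works on byte arrays in memory, and the names ZipRewrite, DeleteEntry, BuildSample and ListNames are made up for this example.

```csharp
using System;
using System.IO;
using System.IO.Compression; // .NET's own zip classes, not the article's ZipArchive

public static class ZipRewrite
{
    // Returns a new archive that contains every entry of 'zipBytes' except 'victim'.
    public static byte[] DeleteEntry(byte[] zipBytes, string victim)
    {
        using (MemoryStream output = new MemoryStream())
        {
            using (ZipArchive source = new ZipArchive(new MemoryStream(zipBytes), ZipArchiveMode.Read))
            using (ZipArchive target = new ZipArchive(output, ZipArchiveMode.Create, true))
            {
                foreach (ZipArchiveEntry entry in source.Entries)
                {
                    if (entry.FullName == victim)
                        continue; // skipping the entry is the "delete"

                    // Simplification: the data is decompressed and compressed again.
                    using (Stream from = entry.Open())
                    using (Stream to = target.CreateEntry(entry.FullName).Open())
                        from.CopyTo(to);
                }
            } // disposing 'target' writes the central directory into 'output'
            return output.ToArray();
        }
    }

    // Builds a small archive with two entries for the demo.
    public static byte[] BuildSample()
    {
        using (MemoryStream ms = new MemoryStream())
        {
            using (ZipArchive zip = new ZipArchive(ms, ZipArchiveMode.Create, true))
            {
                foreach (string name in new[] { "a.txt", "b.txt" })
                    using (StreamWriter writer = new StreamWriter(zip.CreateEntry(name).Open()))
                        writer.Write("contents of " + name);
            }
            return ms.ToArray();
        }
    }

    // Lists the entry names of an archive.
    public static string[] ListNames(byte[] zipBytes)
    {
        using (ZipArchive zip = new ZipArchive(new MemoryStream(zipBytes), ZipArchiveMode.Read))
        {
            string[] names = new string[zip.Entries.Count];
            for (int i = 0; i < names.Length; i++)
                names[i] = zip.Entries[i].FullName;
            return names;
        }
    }

    public static void Main()
    {
        byte[] smaller = DeleteEntry(BuildSample(), "a.txt");
        Console.WriteLine(string.Join(", ", ListNames(smaller))); // b.txt
    }
}
```

Replacing an entry then boils down to a delete like this followed by adding the new data under the same name.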
Credits
...go to the following people:
- The zlib and minizip developers.
- Gerry Shaw, for his C# wrapper for zlib, on which my work is based.
- Ivan A. Krestinin, for his example on how to delete files from an archive.
History
- Inserted the missing LGPL header.
- Fixed two severe bugs.