UnZipper: a File Extraction Tool

lost in transition, 1 Aug 2007 CPOL
UnZipper is a file extraction tool for decompressing files of a selected type/file extension out of a Zip or a compressed folder.

Screenshot - UnZipper.jpg

Introduction

This tool was developed as a solution to a repetitive task that I often face. While working in the GIS field, I often download lots of spatial data from the Internet, and when downloading aerials, they are normally in Zip files or compressed folders. The problem I was running into was that I might download 1, 3, 50, or hundreds of these Zip files to get the aerials for a certain location of a project. As you might have guessed, the extraction process for all of the aerials could take some time. In addition, most of the time, I was only interested in one file out of the zip. So, extracting all the files would be a waste of time and space. And, did I mention that this tool uses Java to perform the extraction process?

Background

I searched for a while for a way to extract files from Zip folders, but I didn't find anything that did what I needed. From what I had seen, System.IO.Compression did not give me the functionality I was looking for. After a little more 'Googling', I found the article "Zip and UnZip files in C# using J# libraries" by Mohammed Habeeb, which presents a very easy and effective way to extract files. So, I built on what he had, fixed a bug, and updated some of the code. I did not add the Zip functionality. That will be for another day.

Using the Code

The code itself is straightforward and fairly easy to follow. I tried to write it in a way that makes it extremely functional, giving the developer who might use this class several options on how to implement the code. I will go over the basic parts here, and try to explain what I have done.

UnZipper Class

This class is the main class for the decompression process. It is also where I use the Java code. To clarify, the Java code I am talking about comes from the Visual J# library, not the Java developed by Sun. By adding a reference to vjslib.dll in our project, we get access to VJ#'s libraries. I also add the following using directives:

using java.util;
using java.util.zip;
using java.io;

By using this library, I have a little more control over the decompression process: I can extract a file based on its file extension, which is exactly what I wanted, since for most of the Zips I only need a certain file type. The following method handles the extraction of a file:

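private void UnZipFile(ZipEntry zFile)
{
    // Directory entries carry no data, so skip them.
    if (!zFile.isDirectory())
    {
        // Note: Contains matches "." + Extension anywhere in the entry
        // name, not only at the end.
        if (zFile.getName().Contains("." + opts.Extension))
        {
            InputStream iS = zipFile.getInputStream(zFile);
            FileOutputStream fileIO = new 
                FileOutputStream(opts.DestinationPath + "\\" +
                                 GetFileName(zFile.getName()));
            try
            {
                // Copy the entry to disk in 7 KB chunks.
                int len = 0;
                sbyte[] buffer = new sbyte[7168];
                while ((len = iS.read(buffer)) > 0)
                {
                    fileIO.write(buffer, 0, len);
                }
            }
            finally
            {
                // Close both streams even if the copy fails.
                fileIO.close();
                iS.close();
            }
            SetFileEvent(1);
        }
    }
}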

Take notice of the method GetFileName(string). It returns just the file name and its extension, as in "shell32.dll". This is needed because, when a file sits in a directory inside the Zip file, the entry name is the file's full path within the archive. So, if you add the system32 folder to a Zip file, the code would read the file name of shell32 as "system32/shell32.dll" rather than "shell32.dll", and that would be a bug. This was fairly easy to handle.
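GetFileName itself is not listed in this article, so the following is only a minimal sketch of what such a helper might look like. It assumes that entry names use '/' as the folder separator and tolerates '\' as well; the class name EntryNames is hypothetical and not part of the download.

```csharp
using System;

public static class EntryNames
{
    // Returns the part of a Zip entry name after the last '/' or '\'.
    // A name without any separator is returned unchanged.
    public static string GetFileName(string entryName)
    {
        if (entryName == null) throw new ArgumentNullException("entryName");
        int cut = entryName.LastIndexOfAny(new char[] { '/', '\\' });
        return cut < 0 ? entryName : entryName.Substring(cut + 1);
    }
}
```

With this sketch, "system32/shell32.dll" becomes "shell32.dll", while a top-level entry such as "readme.txt" passes through untouched.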

Now, let me step back for a moment and go to the main method of this class, MainMethod(). Catchy, I know ;). This is basically the executing loop of the whole process.

private void MainMethod()
{
    // Collect the Zip files and tell the form how many there are.
    GetListofZips();
    NumFilesEvent(zipFilesList.Count);
    foreach (string lF in zipFilesList)
    {
        UpdateFileNameEvent(lF);
        zipFile = new ZipFile(lF);
        GetListofZipEntries();
        foreach (ZipEntry zE in zipEntriesList)
        {
            UnZipFile(zE);
        }
        // Release the file handle before moving on to the next Zip.
        zipFile.close();
        FileProcessEvent(0);
    }
    MessageBox.Show("Done");
}

First, it calls GetListofZips(), which will retrieve a list of all of the Zip files in the selected 'containing' folder.

private void GetListofZips()
{
    zipFilesList = new List<string>();
    DirectoryInfo dI = new DirectoryInfo(opts.ContainingPath);
    foreach (System.IO.FileInfo fI in dI.GetFiles("*.zip"))
    {
        zipFilesList.Add(fI.FullName);
    }
}

The main method fires an event that sends the number of Zip files back to the form. This is done so the progress bar on the form can have its maximum units set. After this, the code loops through the Zip files one at a time, calling GetListofZipEntries() to create a list of everything that is in the current Zip file.

private void GetListofZipEntries()
{
    zipEntriesList = new List<ZipEntry>();
    Enumeration zipEnum = zipFile.entries();
    while (zipEnum.hasMoreElements())
    {
        ZipEntry zE = (ZipEntry)zipEnum.nextElement();
        zipEntriesList.Add(zE);
    }
}

Once the list of entries is created, UnZipFile(ZipEntry) is called for each entry, and the matching files are extracted.
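For readers on .NET Framework 4.5 or later, where System.IO.Compression gained the ZipArchive and ZipFile classes, the same loop can be sketched without the J# library. This is not the code from the download, just a rough equivalent under that assumption. The names SelectiveUnzip and ExtractByExtension are hypothetical, matching at the end of the name and ignoring case are my choices, and on .NET Framework the project would need references to System.IO.Compression and System.IO.Compression.FileSystem.

```csharp
using System;
using System.IO;
using System.IO.Compression;

public static class SelectiveUnzip
{
    // Extracts every entry whose file name ends with "." + extension from
    // each *.zip in containingPath into destinationPath.
    // Returns the number of files written.
    public static int ExtractByExtension(string containingPath,
                                         string destinationPath,
                                         string extension)
    {
        int extracted = 0;
        Directory.CreateDirectory(destinationPath);
        foreach (string zipPath in Directory.GetFiles(containingPath, "*.zip"))
        {
            // The using block closes the archive even if extraction throws.
            using (ZipArchive archive = ZipFile.OpenRead(zipPath))
            {
                foreach (ZipArchiveEntry entry in archive.Entries)
                {
                    // Name is empty for directory entries, so skip them.
                    if (entry.Name.Length == 0) continue;
                    if (!entry.Name.EndsWith("." + extension,
                            StringComparison.OrdinalIgnoreCase)) continue;
                    // Name (unlike FullName) has no folder prefix.
                    string target = Path.Combine(destinationPath, entry.Name);
                    entry.ExtractToFile(target, true); // true = overwrite
                    extracted++;
                }
            }
        }
        return extracted;
    }
}
```

As in the J# version, files from different folders or different Zips that share a name end up overwriting each other in the destination folder.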

The extracted files are placed in a selected location. One thing to note: you do not have to specify a file extension. Be warned, though, that if you decide not to supply one, the code will extract every file out of every Zip file in the selected 'containing' path.
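The "no extension means everything" rule can be made explicit in a small helper. The sketch below is an illustration rather than the article's code: ExtensionFilter and Matches are hypothetical names, and comparing at the end of the name and ignoring case are assumptions on my part (UnZipFile uses Contains instead).

```csharp
using System;

public static class ExtensionFilter
{
    // True when fileName should be extracted for the requested extension.
    // A null or empty extension means "no filter": every file matches.
    public static bool Matches(string fileName, string extension)
    {
        if (string.IsNullOrEmpty(extension)) return true;
        // Accept both "tif" and ".tif" as input.
        string wanted = "." + extension.TrimStart('.');
        return fileName.EndsWith(wanted, StringComparison.OrdinalIgnoreCase);
    }
}
```

Comparing at the end of the name keeps a request for "tif" from also picking up a sidecar file such as "aerial.tif.xml".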

Points of Interest

You should also look at the form class, as I have made use of multi-threading, delegates, and events to update the form with the application's progress. These things may be simple to the experienced developer, but not to someone who is starting out and trying to get a handle on how things work; the form class shows an implementation of each. One last thing to add: I used the built-in Settings option of the application to keep track of the different file extensions used. That means that, if you extract a file extension that was not in the list, it will be automatically added to the list.
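For someone starting out, the pattern can be reduced to a few lines. The sketch below is not the form class from the download; ProgressWorker and its event names are hypothetical, and it only illustrates a worker that runs on a background thread and reports back through delegate-based events.

```csharp
using System;
using System.Threading;

// Hypothetical worker: raises events from a background thread, in the
// same spirit as UnZipper reporting its progress to the form.
public class ProgressWorker
{
    public delegate void CountHandler(int count);
    public event CountHandler NumFiles;       // total item count, fired once
    public event CountHandler FileProcessed;  // fired after each item

    private readonly string[] items;

    public ProgressWorker(string[] items) { this.items = items; }

    // Starts the work on a background thread and returns that thread.
    public Thread Start()
    {
        Thread t = new Thread(new ThreadStart(Run));
        t.IsBackground = true;
        t.Start();
        return t;
    }

    private void Run()
    {
        // Copy the delegate first so a late unsubscribe cannot null it.
        CountHandler total = NumFiles;
        if (total != null) total(items.Length);
        for (int i = 0; i < items.Length; i++)
        {
            // The real work on items[i] would go here.
            CountHandler step = FileProcessed;
            if (step != null) step(i + 1);
        }
    }
}
```

The handlers run on the worker thread, so in a Windows Forms application they would need Control.Invoke or Control.BeginInvoke before touching a progress bar or any other control.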

History

  • Code was written July 2007.
  • Article was written and posted August 1, 2007.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

lost in transition
Web Developer
United States
I am a software, database, and GIS developer. I love the challenge of learning new ways to code.

Last Updated 1 Aug 2007
Article Copyright 2007 by lost in transition