Working with Archived files

By Harkos

Some time ago, while working on a mainframe, I came across a different concept of a file: one that holds multiple other files. Since then, I have tried to reproduce this behavior in .NET.
C#, Windows, .NET 2.0, .NET, Visual Studio, VS.NET2003, VS2005, Architect, Dev

Posted: 1 Nov 2004
Updated: 31 Oct 2004

Introduction

Some time ago, I was sent to work with an old mainframe system. There I became familiar with a source repository in which a single file on the system contains several source code files. Although this will seem very familiar to many people (through TAR or ZIP files), I wanted the ability to work with the files inside it without copying them to a temporary location. This article is the result of that quest.

The Archive Class

It would be very impractical to create such a class just to store files. It would work just like the many compression libraries out there, only without any compression. It would also be very limiting to store only files, with no directories to organize them.

I created a simple file system based on what has been described for WinFS (the SQL-based storage system planned for Windows Longhorn). A single file contains a table listing every file and folder inside the Archive. Each record is simple: a name, the number of the parent folder, and either the entry point (for a file) or the index (for a folder). I chose not to record dates or times. I also chose the UNIX-style path separator (/) to avoid escaped backslashes or verbatim strings.
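To make the record table more concrete, here is a minimal sketch of how such entries could be modeled and searched. It is not the Archive class's actual code: the names ArchiveEntry, Parent, Start and Find are hypothetical, and it assumes that Parent holds the table position of the parent folder (-1 for the root) and that folders carry a negative start index, as described under Limitations.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record layout; field names are illustrative, not taken from the Archive source.
struct ArchiveEntry
{
    public string Name;  // entry name without any path
    public int Parent;   // assumption: table position of the parent folder, -1 for the root
    public int Start;    // files: first block (>= 0); folders: a negative folder index

    public ArchiveEntry(string name, int parent, int start)
    {
        Name = name;
        Parent = parent;
        Start = start;
    }

    public bool IsDirectory { get { return Start < 0; } }
}

static class EntryTableSketch
{
    // Walks a UNIX-style path through the table.
    // Returns the table position of the entry, or -1 if any part is missing.
    public static int Find(List<ArchiveEntry> table, string path)
    {
        int current = -1; // start at the root
        string[] parts = path.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
        foreach (string part in parts)
        {
            int next = -1;
            for (int i = 0; i < table.Count; i++)
            {
                if (table[i].Parent == current && table[i].Name == part)
                {
                    next = i;
                    break;
                }
            }
            if (next < 0) return -1; // this path segment does not exist
            current = next;
        }
        return current;
    }

    static void Main()
    {
        List<ArchiveEntry> table = new List<ArchiveEntry>();
        table.Add(new ArchiveEntry("src", -1, -1));       // folder at the root
        table.Add(new ArchiveEntry("Program.cs", 0, 0));  // file inside "src", data starts at block 0

        int index = Find(table, "/src/Program.cs");
        Console.WriteLine("found at {0}, directory: {1}", index, index >= 0 && table[index].IsDirectory);
    }
}
```

A linear scan keeps the sketch short; a real implementation would more likely index entries by parent to avoid rescanning the whole table for every path segment.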

In the end, it's a very simple class offering much of the functionality provided by the File and Directory classes in the System.IO namespace, including methods like OpenRead and OpenText.

Why would I use it?

At first sight, many would consider using an Archive a waste of time, and maybe they're right. Whether a piece of technology is useful depends on what the project needs, and your project may not need this one.

I originally designed the Archive to hold the source code for all of my projects. When it was near completion, I realized that wasn't such a great idea, but I found other uses for it, such as small databases (especially for storing serialized objects) and files I don't want users poking at. You may find other uses I haven't thought of; if so, let us know.

Limitations

As far as I can see, there are very few limits to point out:

  • An Int32 is used to index file blocks, so the archive size limit is 1.5 KB × 2,147,483,647 (≈ 3,221,225,470 KB, or roughly 3 TB, theoretically). This is also the largest size a single file can reach.
  • Directories are distinguished from files by negative start indexes, which limits an Archive to 2,147,483,648 directories. Currently, a stored index supplies the index for each new directory, and it is never reset. (Should I revise it?)
  • Possibly others that I can't remember right now.
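The arithmetic behind the first two limits can be checked with a few lines of C#. This is only a worked calculation, not part of the Archive class; it assumes that "1.5 KB" means a block of 1,536 bytes, and the names ArchiveLimitsSketch, MaxArchiveBytes and MaxDirectories are made up for the example.

```csharp
using System;

static class ArchiveLimitsSketch
{
    // Assumption: a "1.5 KB" block is 1536 bytes.
    public const long BlockSize = 1536;

    // Largest archive (and largest single file) addressable with a non-negative Int32 block index.
    public static long MaxArchiveBytes()
    {
        return BlockSize * int.MaxValue; // long arithmetic, so no Int32 overflow
    }

    // Number of negative Int32 values available as directory indexes (-1 down to int.MinValue).
    public static long MaxDirectories()
    {
        return -(long)int.MinValue;
    }

    static void Main()
    {
        Console.WriteLine(MaxArchiveBytes());         // 3298534881792 bytes, just under 3 TiB
        Console.WriteLine(MaxArchiveBytes() / 1024);  // 3221225470 KB (integer division drops the final half KB)
        Console.WriteLine(MaxDirectories());          // 2147483648
    }
}
```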

Expanded Universe

This is the goal of open source, isn't it? The Archive class was written to be easy to understand, maintain and modify. After a few minutes of reading the code, you should be able to extend the class and tailor it to your own needs. Here are some examples of changes that can be made easily:

  • Replace the "Archive:" header with a more complex version detailing the use or meaning of the files within;
  • Simple scrambling: using a crypto-stream would require a temporary stream for the file, but some simple scrambling (e.g., using XOR) is easy;
  • Additional file information like creation and modification times (which I chose not to include);
  • A new property to define which character is to be used for path separation;
  • etc.
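As an illustration of the XOR idea in the list above, the sketch below scrambles a buffer in place with a repeating key. It is not code from the Archive class: XorScrambleSketch, Scramble and the sample key are hypothetical, and XOR with a repeating key only obscures data from casual inspection; it is not real encryption.

```csharp
using System;
using System.Text;

static class XorScrambleSketch
{
    // Applies a repeating-key XOR in place.
    // Calling it twice with the same key and range restores the original bytes.
    public static void Scramble(byte[] buffer, int offset, int count, byte[] key)
    {
        if (key == null || key.Length == 0)
            throw new ArgumentException("key must not be empty");

        for (int i = 0; i < count; i++)
            buffer[offset + i] ^= key[i % key.Length];
    }

    static void Main()
    {
        byte[] key = Encoding.ASCII.GetBytes("demo-key");           // illustrative key only
        byte[] data = Encoding.UTF8.GetBytes("serialized object");

        Scramble(data, 0, data.Length, key);  // scramble before writing to the archive
        Scramble(data, 0, data.Length, key);  // unscramble after reading it back
        Console.WriteLine(Encoding.UTF8.GetString(data));
    }
}
```

Because the key position here depends on the position within the buffer, code that reads a file in separate blocks would need to derive the key offset from the file position instead, so that each block is unscrambled with the same key bytes that scrambled it.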

Some of these modifications might be useful to other users too, so if you modify the Archive class, I suggest posting a brief description of your changes here.

Points of Interest

The code has been tested extensively, except for the file operations (as can be seen in the Main procedure). The file-handling code was, however, tested in a previous version, so there should be no problems with it. If you find any, let me know.

Also, the Archive class makes wide use of .NET generic collections, which limits it to .NET Framework 2.0 or later. If anyone is interested in a port to earlier versions, I can post the results here.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL).

About the Author

Harkos


I'm that strange type who likes to code (mostly C#, but with a little Delphi nowadays) and hates to use a database (I'd rather code one instead).

Although I don't place any restriction on the use of the code I provide, I would appreciate being credited as its author in any projects that use it.
Occupation: Software Developer
Location: Brazil

Editor: Sean Ewington
Copyright 2004 by Harkos