IFS - an Internet File System implementation based on Web services and peer-to-peer technology
By Stoyan Damov
Internet File System from scratch - making web services and peer-to-peer technology work together to build a virtual file system
C++/CLI, VC7, Windows, .NET 1.0, Dev
You've seen Napster, Gnucleus, Morpheus and similar file-sharing applications. If you haven't, I'm sure you've heard about the exciting peer-to-peer (P2P) technology. I don't doubt you've seen the big hype about web services, but you probably haven't seen anything smarter than an HTML screen-scraping web service, or one that exposes some proprietary technology or software to the web-aware public.
This article demonstrates a virtual file system, based on P2P and web services, which in all modesty I call "The Internet File System" or IFS. Unfortunately, big articles look bad, so I'll show you how to use the library I've written, and if you are smart (and I know you are), you'll get the rest from the source code. Enjoy the read...
The Internet File System is an imaginary file system. A real Internet file system could hardly exist, because of the variety of hardware and software platforms (OSes) scattered around the world. The IFS simulates a file system by having a central repository of folder and file links (shortcuts) published by a multitude of computers, known as peers. Peers share files and folders in the repository, but do not use it to actually store the files' contents. Instead, they communicate with each other to download (copy) the files between themselves.
In this simple implementation, the repository is managed by the peers via a web service (IFSWS), and the peer-to-peer communication is handled by a P2P framework library (P2PLIB). Because using the web service and the P2P framework directly is complex, I've built a stand-alone P2P server (P2PSRV), which runs on the peer computers and handles all the P2P communication automatically. Furthermore, to hide the details of the IFS implementation, I've built an Internet File System library (IFSLIB) on top of IFSWS and P2PLIB that exposes an easy-to-use object model for manipulating the central repository, as well as for easily copying (downloading) files between peers.
As you can see on the diagram above, there are two big parties here: the IFS Web Service, and the peers:
The IFS web service is only responsible for storing and retrieving information - it just serves as a repository. Actually, it registers, unregisters and logs in peers, publishes folders and files, retrieves vital information about the peers, etc., but the coolest part of IFS is done by the library.
I'm not a native English speaker, I'm Bulgarian. That's why I can't explain exactly what a peer is. In my dictionary, the word "peer" has just one meaning as a noun, and it is not "a PC connected to the Internet that can communicate directly with other PCs, using some communication protocol". However, that's the meaning of peer I've used in this article.
There are 3 things that make the dumb PCs peers - these are the P2P library, the P2P server, and the IFS library. The IFSLIB works on top of the web service, and the P2P server on top of the library.
One can use the IFS library to build any kind of P2P application - GUI-based Windows Explorer-like IFS browser, console tools for automatic/scheduled download of files, etc. I started working on an IFS browser, but due to the limited time (I started this article to enter the web contest) could not finish it. However, I included the crippled demo in the source code, so anyone who likes the idea (and can code in C#) could finish (or rewrite) it. (Now, as I am updating the article, two weeks later, the status of the IFSBrowser is unchanged. That's because I thought that fixing bugs and adding features to the library has a bigger priority at least for me.)
Well, there are 3 scenarios when you have to copy a file from one peer to another. To avoid repetition, let's name the interested peer (which wants to download a file) P1, and the owner peer P2.
No firewalls: no problem. P1 pulls the file from P2.
The owner is behind a firewall: Ooops! P1 can't connect to P2 that easily. So what can it do? It can execute a web service method that will assign a task to P2 to push the file to P1. The P2P server on P2's PC is scheduled to retrieve its tasks, using the web service. When it gets a push task, it connects to P1 and pushes the file. Is this easy? Yes, and I have almost implemented this feature.
The third scenario: Yeaah! What shall we do now? Well, if P1 could assign an upload task to P2, the latter could upload the file via the web service to the web service's server PC. When P2 is done, it could assign a download task to P1, and the latter would download the file! Voila! This feature is not implemented at all but is very easy to implement. However, because I'm pressed by a DEADline sooooo badly, I don't promise that I'll implement it by next week...
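The choice between the three scenarios above can be sketched as a small decision function. This is plain standard C++ rather than Managed C++, the names TransferStrategy and ChooseStrategy are hypothetical, and it assumes that the third scenario is the one where neither peer can accept incoming connections (the text does not spell that out).

```cpp
#include <cassert>

// Hypothetical names; the IFS library does not necessarily expose these.
enum class TransferStrategy { DirectPull, PushTask, RelayViaServer };

// Picks how a file gets from the owner (P2) to the interested peer (P1).
// Assumption: the relay case applies when neither peer can accept
// incoming connections.
TransferStrategy ChooseStrategy(bool interestedBehindFirewall,
                                bool ownerBehindFirewall)
{
    if (!ownerBehindFirewall)
        return TransferStrategy::DirectPull;    // P1 connects to P2 and pulls
    if (!interestedBehindFirewall)
        return TransferStrategy::PushTask;      // P2 connects to P1 and pushes
    return TransferStrategy::RelayViaServer;    // upload task, then download task
}
```

Note that the direct pull is tried first even when P1 itself is behind a firewall, because P1 only needs an outgoing connection in that case.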
How works what?:) OK, I'll try to focus and tell you how the library works. I won't discuss how the basic functionality works, i.e. peer registration, log in, folder and file publishing, etc. These are just entries in the database via the IFS web service. I'll show you how a typical file download works in both of the easier scenarios, i.e. the "no firewalls" and "owner is behind firewall" situations. And because I know you're SMART guys (and I'm so lazy:) I haven't drawn any pictures, so read along...
I'll use P1 and P2 again for the interested and the owner peer. P1 has an
IfsFile instance (see library objects below)
and calls the Download method. The IfsFile object has an
OwnerPeer property, which returns an IfsPeer instance. The
latter has an IpEndPoint property, containing the IP address and port of
the remote peer. At this moment, the IFS library sends a "library pull" command to
the P2P server running on P1's PC. This command means that the library wants the
server to pull a file from a remote peer. The P2P server gets the remote peer's end
point from the command, as well as the name of the file P1 wants to pull and the folder in
which to store it, and then sends a "peer pull" command to the
remote peer (P2). P2's P2P server accepts the command, gets the file size and CRC
(CRC is not implemented) and sends them back to P1. P1 now knows exactly how many
bytes to accept, starts to receive the bytes P2 sends and writes them to the destination
folder. After P2 has sent all the bytes and P1 has received them, P1 sends an
"OK" response (or "ERROR" if something goes wrong) to P2 and
closes the connection. That's pretty much what a download is.
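The size-then-bytes exchange described above can be sketched without any sockets. The following is plain standard C++, not the P2PLIB code: the 8-byte big-endian size header and the function names BuildPullReply and ReceivePullReply are assumptions made for illustration, and the CRC is left out, just as in the article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical wire format: 8-byte big-endian file size, then the raw bytes.
// The real P2PLIB format may differ.
std::vector<std::uint8_t> BuildPullReply(const std::string& fileBytes)
{
    std::vector<std::uint8_t> reply;
    const std::uint64_t size = fileBytes.size();
    for (int shift = 56; shift >= 0; shift -= 8)
        reply.push_back(static_cast<std::uint8_t>((size >> shift) & 0xFF));
    reply.insert(reply.end(), fileBytes.begin(), fileBytes.end());
    return reply;
}

// P1's side: read the announced size, accept exactly that many bytes and
// answer "OK" or "ERROR". The received content is stored in `out`.
std::string ReceivePullReply(const std::vector<std::uint8_t>& stream,
                             std::string& out)
{
    const std::size_t headerSize = 8;
    if (stream.size() < headerSize)
        return "ERROR";
    std::uint64_t size = 0;
    for (std::size_t i = 0; i < headerSize; ++i)
        size = (size << 8) | stream[i];
    if (stream.size() - headerSize != size)
        return "ERROR";                 // truncated or oversized transfer
    out.assign(stream.begin() + headerSize, stream.end());
    return "OK";
}
```

Sending the size first is what lets the receiver know when the transfer is complete without closing the connection, so the same connection can carry the final "OK" or "ERROR" response.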
P1 assigns a "push task" to P2 via the IFS web service and "thinks" that the file has been downloaded :) At some point, P2's P2P server reads its tasks via the IFSWS and sees it has a "push task". The P2P server issues a "library push" command to itself (the task reader runs in another thread). When the peer command listener thread receives the command, it gets the file size and CRC (CRC, again, is not implemented) and forms a "peer push" command. P2 connects to P1 (the task contains P1's end point), issues the "peer push" command and starts to stream the file. P1 gets the command, receives the file and stores it in the destination folder indicated by P2. (P2 knows the destination folder, because P1 has sent it in the "push task".) That's all.
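The task hand-off described above boils down to a per-peer queue that the owner's P2P server polls. Here is a minimal in-memory sketch in plain standard C++; PushTask, TaskRepository and their members are hypothetical stand-ins, not the IFSWS schema or API.

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <string>

// Hypothetical task record; field names are illustrative only.
struct PushTask
{
    std::string fileName;           // the file P1 wants
    std::string requesterEndPoint;  // where P2 must connect, e.g. "10.0.0.5:7000"
    std::string destinationFolder;  // chosen by P1, carried inside the task
};

// Stand-in for the web service's task storage: one queue per owner peer.
class TaskRepository
{
public:
    void AssignPushTask(const std::string& ownerLogin, const PushTask& task)
    {
        tasks_[ownerLogin].push(task);
    }

    // Called periodically by the owner's P2P server. Returns false when
    // there is nothing to do.
    bool TakeNextTask(const std::string& ownerLogin, PushTask& task)
    {
        auto it = tasks_.find(ownerLogin);
        if (it == tasks_.end() || it->second.empty())
            return false;
        task = it->second.front();
        it->second.pop();
        return true;
    }

private:
    std::map<std::string, std::queue<PushTask> > tasks_;
};
```

Because the task carries both P1's end point and the destination folder, P2 needs no further round trip to the repository before it starts pushing.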
There's a long way to go until you actually start using the library, but we'll get to that stuff (Compiling and configuring IFS) soon. Now I just want to show you how easy it is to use IFSLIB.
The library consists of only 4 classes that hide everything about the
peer-to-peer and web services stuff: IfsSystem, IfsPeer,
IfsFolder and IfsFile.
In order to use IFS (after it is already set up), you should register in
the repository as a peer. There's nothing easier than that, as you'll see
in a moment, but I'll warn you about something first: unless you have registered or
logged in as a peer with IFS, you won't be able to use IFS's most
important object - the IfsSystem object. It is implemented as a
singleton to avoid having multiple instances of IFS peers on the same peer
computer. If you actually try to use even the simplest property of the
IfsSystem class, you'll get a runtime exception, stating that
you haven't logged in/registered with IFS.
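The "singleton that refuses to work until you log in" idea can be sketched in plain standard C++ as follows. IfsSystemSketch is a hypothetical stand-in, not the real IfsSystem class; the "./" root path and the exception types are assumptions.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Plain C++ stand-in for the IfsSystem singleton. Only the names Instance,
// LoginPeer and RootFolder come from the article; the rest is hypothetical.
class IfsSystemSketch
{
public:
    static IfsSystemSketch& Instance()
    {
        static IfsSystemSketch instance;    // one instance per process
        return instance;
    }

    void LoginPeer(const std::string& login, const std::string& password)
    {
        // A real implementation would call the web service here.
        if (login.empty() || password.empty())
            throw std::invalid_argument("login and password are required");
        loggedIn_ = true;
    }

    std::string RootFolder() const
    {
        RequireLogin();                     // every property checks this first
        return "./";
    }

private:
    IfsSystemSketch() : loggedIn_(false) {}
    IfsSystemSketch(const IfsSystemSketch&) = delete;
    IfsSystemSketch& operator=(const IfsSystemSketch&) = delete;

    void RequireLogin() const
    {
        if (!loggedIn_)
            throw std::logic_error("not logged in/registered with IFS");
    }

    bool loggedIn_;
};
```

Putting the check in one private helper keeps every property consistent: a caller who skips registration or login gets the same exception no matter which member is touched first.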
First, you have to register as a new peer:
// first, you'll have to get a "handle" of the IFS singleton object
IfsSystem __gc* ifs = IfsSystem::Instance;
// now you can register
IfsPeer __gc* peer = ifs->RegisterPeer (
S"Stoyan Damov", // alias
S"Stoyan", // login name
S"Secret", // password
S"BG", // country code (unused in this version)
false); // behind firewall or NAT?
Of course, you may not register more than one time in IFS (unless you're prepared for exceptions), so once you have registered, the next time you should sign in:
IfsPeer __gc* peer = ifs->LoginPeer (S"Stoyan", S"Secret");
// in fact you can throw the peer away, you won't use it for
// anything, except for examining its properties
A peer (IfsPeer instance) has the following properties:
Once you've registered or logged in, you can start using the
IfsSystem object's properties, the most important (and
usable) of which is the RootFolder property, which returns an
IfsFolder object, representing the virtual root folder.
You can get the root folder just like this:
IfsFolder __gc* root = ifs->RootFolder;
Each folder (including the root one) has several properties:
IfsFolder instance
After you get to the root folder, there are many things you can do with it:
Actually you can perform these actions with all folder objects you get a pointer to, and you can very easily get an arbitrary folder like this:
IfsFolder __gc* folder = IfsFolder::GetFolderByPath (S"./Docs/PDF");
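Resolving a virtual path like "./Docs/PDF" amounts to walking the folder tree one segment at a time. The sketch below is plain standard C++ over a hypothetical in-memory FolderNode tree; the real library presumably resolves paths against the repository via the web service, and the case-sensitive matching here is an assumption.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <sstream>
#include <string>

// Hypothetical in-memory folder node, for illustration only.
struct FolderNode
{
    std::string name;
    std::map<std::string, std::shared_ptr<FolderNode> > children;
};

// Walks a path such as "./Docs/PDF" from the root, one segment at a time.
// Returns a null pointer when any segment is missing.
std::shared_ptr<FolderNode> GetFolderByPath(std::shared_ptr<FolderNode> root,
                                            const std::string& path)
{
    std::shared_ptr<FolderNode> current = root;
    std::istringstream segments(path);
    std::string segment;
    while (current && std::getline(segments, segment, '/'))
    {
        if (segment.empty() || segment == ".")
            continue;                       // skip "." and doubled slashes
        auto it = current->children.find(segment);
        current = (it == current->children.end())
                      ? std::shared_ptr<FolderNode>()
                      : it->second;
    }
    return current;
}
```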
Below, I am giving some examples of the aforementioned operations,
and once you've seen them, you can move on to some useful static methods of
the IfsFolder class.
You can publish a folder into an existing one:
IfsFolder __gc* subFolder = root->PublishFolderHere (
S"VirtualFolderName",
S"Description", // may be omitted in an overload
S"c:\\physical\\folder\\path");
// and for more fun:
IfsFolder __gc* subSub = (root->PublishFolderHere (
S"Folder",
S"Description", // may be omitted in an overload
S"c:\\physical\\folder"))->PublishFolderHere (
S"SubFolder",
S"c:\\physical\\folder\\subFolder");
or publish a brand new folder:
// when you know the destination path
IfsFolder __gc* folder = new IfsFolder (
S"VirtualName",
S"Description", // may be omitted in an overload
S"c:\\physical\\folder\\path");
folder->Publish (S"./target/virtual/path");
// when you have the destination folder object
folder->Publish (targetFolder);
The lazy guys (this includes me) can publish folders using the static methods:
IfsFolder __gc* folder = IfsFolder::PublishFolder (
S"VirtualName",
S"Description", // may be omitted in an overload
S"c:\\physical\\folder\\path",
S"./target/virtual/path");
Oooh, I forgot to tell you how you rename a folder:
// I assume you got one already
folder->RenameTo (S"NewFolderName"); // wow! how difficult :)
There may (or may not) exist other methods (either static or instance ones) for publishing a folder, but I think these were enough to show you how easily it is done. Now, it's time to see what you can do with the published folders:
You can find sub-folders:
// the statement below will return all folders named "docs", at any
// level below the "folder" one (it searches recursively)
ArrayList __gc* folders = folder->FindSubFolders (S"docs");
or get all folders:
// this statement will return all folders below the "folder" object
ArrayList __gc* folders = folder->GetFolders ();
or even find folders in the entire IFS:
ArrayList __gc* folders = IfsFolder::FindFolders (S"docs");
// the above is equivalent to:
ArrayList __gc* folders = root->FindSubFolders (S"docs");
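A recursive search like FindSubFolders can be sketched in a few lines of plain standard C++. The Folder struct below is a hypothetical in-memory stand-in; the library's own search may well run in the database behind the web service instead.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical folder node, for illustration only.
struct Folder
{
    std::string name;
    std::vector<Folder> subFolders;
};

// Collects every folder named `name` at any depth below `parent`
// (the parent itself is not a candidate).
void FindSubFolders(const Folder& parent, const std::string& name,
                    std::vector<const Folder*>& found)
{
    for (const Folder& child : parent.subFolders)
    {
        if (child.name == name)
            found.push_back(&child);
        FindSubFolders(child, name, found);   // descend regardless of a match
    }
}
```

Searching the entire file system is then just the same call made on the root folder, which matches the equivalence shown above.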
In the previous version of the article, I forgot to write a lot of
things about the files. I forgot to tell you that a file (IfsFile)
has some useful properties:
IfsPeer object, that owns the file :)
IfsFolder object, where the file resides
IfsFile instance
Again, in the previous article, I mentioned that the IfsFolder has
several instance and static methods to publish a folder and a file. Do you know why
the folder had to publish a file, instead of a file publishing itself? Because I was stupid.
I was not able to use the IfsFolder class in the
IfsFile one, because I would create a cyclic header include. Every C++
programmer knows s/he should not include the header, but rather just declare the class
in the header, like __gc class IfsFolder;. That's what I did then, but it
didn't work and I thought that either I suck, or Visual C++ does. Well, I suck, but
let me tell you why. I forgot that all classes in the IFS library were wrapped in two
namespaces. That's why I should have written the declaration
inside the namespaces of the "IfsFile.h" header, or wrapped it
in the namespaces, like this: namespace InternetFileSystem { namespace Library { public __gc class IfsFolder; }}.
So, that's what I did, and now the IfsFile class has six instance or
static methods for publishing.
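The same pitfall exists in standard C++, so here is a minimal sketch without the managed extensions. The class bodies are hypothetical and much smaller than the real ones; the point is only that the forward declaration must sit inside the same namespaces as the class it names, otherwise it declares a different, global IfsFolder.

```cpp
#include <cassert>
#include <string>

namespace InternetFileSystem { namespace Library {

class IfsFolder;    // forward declaration, inside the same namespaces

class IfsFile
{
public:
    explicit IfsFile(const std::string& name) : name_(name), folder_(nullptr) {}
    // Only a pointer is stored, so the full IfsFolder definition is not needed.
    void SetFolder(IfsFolder* folder) { folder_ = folder; }
    IfsFolder* Folder() const { return folder_; }
    const std::string& Name() const { return name_; }
private:
    std::string name_;
    IfsFolder* folder_;
};

// Normally this lives in its own header, which may include "IfsFile.h" freely.
class IfsFolder
{
public:
    explicit IfsFolder(const std::string& path) : path_(path) {}
    void Adopt(IfsFile& file) { file.SetFolder(this); }
    const std::string& Path() const { return path_; }
private:
    std::string path_;
};

}} // namespace InternetFileSystem::Library
```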
You can create a brand new file like this:
IfsFile __gc* file = new IfsFile (
S"fileName",
S"file description", // may be omitted in an overload
S"x:\\full\\path\\to\\fileName");
And publish it like that:
// a) calling the static Publish method (laziest)
IfsFile __gc* file = IfsFile::Publish (
S"fileName",
S"file description", // may be omitted in an overload
S"x:\\full\\path\\to\\fileName",
S"./ifs/target/path/");
// b) calling another static Publish method (you should have an
// IfsFolder before that) assuming you have the targetFolder,
// which is an instance of the IfsFolder class
IfsFile __gc* file = IfsFile::Publish (
S"fileName",
S"file description", // may be omitted in an overload
S"x:\\full\\path\\to\\fileName",
targetFolder);
// c) You have a brand new file and you want to publish it
file->Publish (S"./ifs/target/path/");
// d) You have a brand new file and an IfsFolder instance
// (targetFolder);
file->Publish (targetFolder);
You can get an IfsFile object in several ways:
// get the file (assuming you have the folder already)
IfsFile __gc* file = folder->GetFile (S"readme.txt");
// get a folder's files
ArrayList __gc* files = folder->GetFiles ();
// or search in the whole IFS for a given file
ArrayList __gc* files = IfsFolder::FindFiles (S"readme.txt");
// now you get a file like this:
IfsFile __gc* file = static_cast<IfsFile __gc*> (files->get_Item (0));
The typical scenario is to download a file from a remote peer:
// this may not happen instantly
file->Download (S"c:\\local\\folder");
Now, a file's folder is just its property Folder.
IfsFolder __gc* folder = file->Folder; // easier, I think :)
And finally, guess how a file is renamed... I'll leave it to
your imagination, but the method should look like RenameTo:)
There are more instance and static methods of the
IfsFolder and the IfsFile classes but
you can see and learn them by browsing the source code.
Well, I tried to write a big example of how you can use the library. It is (will be) a fully fledged Windows Explorer-like IFS browser, and I called it "IFS Browser" :). I ran out of time, so I couldn't finish it, but I've implemented the basic functionality:
You can implement all the other features in a couple of hours, believe me! However, I'll implement them next week, so if you can wait, you'll get everything for free. Below is a screenshot of the IFS Browser in action:

If you own a copy of Visual Studio .NET, you don't have to do much more than compile the solution file. But if you don't own one, please do yourself a BIG favor and buy it, otherwise you'll have to wait a week, until I finish version 2 of this article and explain the manual command-line compilation.
Open the solution file in VS .NET and build it. This step will produce the following binaries in the Bin folder:
Except for changing the ConnectionString setting, explained below, I can't remember anything else you need to do. There's a configuration file that will be created automatically by the P2P library (and the IFS library will add your IP address to it), where you can change some parameters to fit your needs:
I've chosen Microsoft SQL Server for the back-end of the IFS Web Service, because:
The IFS database is so simple, that a simpler database could hardly exist. Just look at the picture below and see why:
I think I shouldn't explain anything here, should I?
Now, to set up the database you just have to run 1 SQL script in your favorite tool (osql, isql, isqlw [Query Analyzer]). The script's name is InstallDatabase.sql and it resides in the Database folder of the zipped source code. It will create a database called "IFS", its tables and stored procedures. NOTE: you'll have to edit the InstallDatabase.sql script and modify the physical location of the database, because I had no time to even write a simple parameterized batch file. That's it. In order for the IFSWS to work with the database, you have to change the ConnectionString setting in the <appSettings> section of the Web Service's web.config file. I guess that's all you should do. To uninstall the database, run the UninstallDatabase.sql script or manually drop the database (which is what the script does).
I've read and learned the "Managed extensions for C++" specification and the migration guide as soon as Microsoft released them to the public. However, I am a full-time developer and I don't have the time to play around with MC++, because of the "mission impossible"-deadlines, because I study a lot more stuff (e.g. preparing for 2 MCSD .NET exams, learning HLA, MASM, ATL/WTL 7, ATL Server, etc.) and last, but not the least, I have to pay attention (or whatever you call it :) to my wife. So I just wanted to practice MC++, and believe me, it was quite unpleasant to switch from C# (daily job) to MC++ (nightly fun) and vice-versa. IFS is my first > 100 LOC MC++ work (actually it is >5000 LOC), and I'm thankful to Chris Maunder for setting up the Web development contest, helping me practice MC++.
In my opinion, MC++ is no more powerful than C# if you only use the
.NET Framework and the managed extensions (w/o IJW and unmanaged code). It is
actually slower to write MC++ code and you will forget the __gc*
quite often in your first 2,000 lines of MC++ code. Furthermore, you will get
sick of Microsoft's perverse syntax for functions returning managed arrays,
like: unsigned char ReturnsManagedCharArray () __gc[];.
However, VC++ .NET is the best choice for writing either managed or unmanaged
applications, because you actually have two languages, and an arsenal of SDKs.
Well, the thoughts I'm about to share are not big tricks for those who already have experience with the .NET framework and MC++, but I know there are some guys who will appreciate them, and I wrote these before I wrote this paragraph :) You may also want to know that I've hidden an MC++ compiler bug in the text, so keep reading...
NetworkStream and a couple of other classes. It was no fun, and I felt damn stupid when I saw them in the docs. Do not be tempted to re-invent the wheel. Don't excuse yourself, saying you have no time to read and learn everything (like I do :). You have to. It will do you only good, believe me.
__gc classes' destructors with the public modifier, or you'll see funny dtor() methods in the C# IDE. And then, learn the usage of two visibility modifiers (e.g. private public) to hide the internal (assembly) methods from the public.
__property keyword in front :). Then, remember that __property wants to stay before the static keyword. Fate.
String __gc* object (I guess only if you have turned string interning ON)
switch [the switch is faster than reflection though]).
gotos when other techniques will kill you. (Imagine you want to check N conditions in a try-catch-__finally block and if the conditions are not met, you want to exit from the try block, but execute some code after the __finally block. What will you do, huh? Have N ifs nested? Invent a break_block keyword? :) I even saw Jeff Richter using goto to exit to the end of a try-catch block.)
static_cast to unbox enums. Do not use dynamic_cast or __try_cast or the compiler will crash. I thought I found this bug first, but once I posted it, I learned that someone else had found it a month earlier :)
I dreamed several months ago (don't laugh) that I had invented a new programming language (as if the current ones are not enough) and in my dream I named it "p". I guess it had C++-like syntax, but it was very strange, because it had no control statements like do/while/for, etc. Instead, it had built-in algorithms (like those in STL) that fit every case in the world :) And one of the MOST cool features was that "p" could throw exceptions and warnings! In fact, I think high-level languages like C#/VB/Java... deserve such a feature. Just imagine you have a method that expects some parameters, examines them but decides to do its job in a more efficient way, ignoring the parameters you've passed to it. It could throw a warning, and you could catch it only if you are interested in it, like this:
void IntelligentMethod (int someHintValue)
{
    int aBetterValue = CalculateBetterValue (someHintValue);
    if (aBetterValue != someHintValue)
        warn (new Warning (S"Ignoring someHintValue"));
    // ...proceed with aBetterValue
}
Yes, I know it could be done very easily with events, but it's just not the
same, just as typing op_Equality is not the same as typing
==. And really, I miss HRESULTs! We don't have a severity,
facility, etc. We don't even have some code. I know I could write my own
ApplicationException-derived exception with code, etc. but the inevitable
switch on the code will suck and will break the idea of catching the right
(and expected exceptions) like:
catch (MyException) { /* handle it */ }
catch (YourException) { /* ditto */ }
In fact, this could be achieved in the following way:
// the base exceptions
public __gc class CriticalSeverityException : public Exception { ... };
public __gc class MediumSeverityException : public Exception { ... };
public __gc class LowSeverityException : public Exception { ... };
// the specific ones
public __gc class OutOfDiskSpaceException : public CriticalSeverityException { ... };
public __gc class AccessDeniedException : public CriticalSeverityException { ... };
public __gc class BusinessLogicException : public MediumSeverityException { ... };
// the one below is something like a warning...
public __gc class NearQuotaLimitException : public LowSeverityException { ... };
and now, we can handle the exceptions in the following manner:
try
{
// do something, throwing exceptions
}
// this will catch both OutOfDiskSpace and AccessDenied exceptions
catch (CriticalSeverityException __gc* e) { /* whatever */ }
catch (BusinessLogicException __gc* e) { /* catch specific one */ }
// catches all LowSeverity exceptions
catch (LowSeverityException __gc* e) { /* ... */ }
__finally { /* ... */ }
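The same catch-by-severity idea works in standard C++, which makes it easy to try out. The sketch below mirrors part of the hierarchy above; WriteBlock, Classify and the 1 KB threshold are hypothetical, invented only to have something that throws.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Severity base classes, mirroring the hierarchy above in standard C++.
struct CriticalSeverityException : std::runtime_error
{
    explicit CriticalSeverityException(const std::string& m) : std::runtime_error(m) {}
};
struct LowSeverityException : std::runtime_error
{
    explicit LowSeverityException(const std::string& m) : std::runtime_error(m) {}
};

// Specific exceptions derive from the severity they belong to.
struct OutOfDiskSpaceException : CriticalSeverityException
{
    explicit OutOfDiskSpaceException(const std::string& m) : CriticalSeverityException(m) {}
};
struct NearQuotaLimitException : LowSeverityException
{
    explicit NearQuotaLimitException(const std::string& m) : LowSeverityException(m) {}
};

// Hypothetical operation that fails according to the remaining space.
void WriteBlock(long freeBytes, long blockSize)
{
    if (blockSize > freeBytes)
        throw OutOfDiskSpaceException("disk full");
    if (freeBytes - blockSize < 1024)
        throw NearQuotaLimitException("less than 1 KB left");
}

// Maps the caught base class to a label; derived types need no handler
// of their own.
std::string Classify(long freeBytes, long blockSize)
{
    try
    {
        WriteBlock(freeBytes, blockSize);
        return "ok";
    }
    catch (const CriticalSeverityException&) { return "critical"; }
    catch (const LowSeverityException&)      { return "low"; }
}
```

The sketch also shows the limitation: throwing NearQuotaLimitException abandons the operation, whereas a true warning would let it continue.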
I'd really like to see one day a construct like this (or not exactly like this):
try
{
// do something which throws exceptions and raises warnings
}
catch_warning (SpecificWarning __gc* w) { /* handle warning */ }
catch (Exception __gc*) { /* ... */ }
But enough. I must have lost my mind :) If you have some thoughts, share them
with me, and please comment on this one. I really want to know if someone else
thinks that ANY language needs warnings.
I don't even want to start this section, but I have to. I want to share with you what I wanted to put in IFS, but as it was developed for the contest, I had no time. I will (eventually) add many more features once I have some free time (which is never), but for those enthusiasts, who want to improve on it, here's the (LONG) list:
Attributes, FileAttributes and FolderAttributes.
I thought that it would be very stupid to expand the Files and
Folders tables for some properties like size, author, last
accessed, blah blah, etc. So I initially designed the tables to support
attributed files and folders, but haven't implemented them in the IFS web service
(though I've implemented the prototypes in the IFS library). So it will be
cool to implement them one day (maybe the day before I retire :))
Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences. [William Strunk Jr.]
Because I'm faaaar from being a vigorous writer, thank you for reading the article! It was my
first article and I found that I write code 10 times faster than plain text :)
Frankly, I envy the famous book writers - they must be really smart guys! Now,
about the article: I hope you saw how easy it is to use the IFS library. If you
examine the source code that comes with the article, you'll see how easy it is
to implement a simple Internet File System. I've written it
in two weeks (now it is updated) in my spare time (which is my sleep time), and I'm
not a typist, so you can do it in even less time. That's .NET - a RAD framework
for today's rushing world. Bugs happen, and they happen even at Microsoft, but you
shouldn't let that stop you from learning and practicing this new exciting
technology, which in my opinion will rule the development world in a year
or so (tell me frankly, have you ever seen a technology producing more than
150 books in less than 6 months? I haven't.)
Yup, they exist. And they bite:) As of this writing, there are no bugs in IFS (at least, I don't know any). However, there's one 100% Microsoft's bug: it is either in the ImageList or in the Resource manager or in the ToolBar class. You put some images in an image list, you set the image list to a toolbar, set the appropriate image indices to the toolbar buttons and you expect them to show up, right? Wrong! Either they won't show up at all, or one of them will show up everywhere! That's why I distribute the icons for the toolbar, and place them on the toolbar with code. You should copy IFSBrowser\Resources\*.ico to the Bin folder, or the IFS Browser will crash. In the previous edition of the article, I said there's a third bug, concerning the exception handling. I kind of fixed that, and I added some meaningful exceptions that the IFS library throws around :)
The lack of documentation is a BIG bug. I promised I'd write some, but unfortunately right now I'm under big pressure at work, so I'll generate the .CHM help in the next version of the article. Sorry!
Send all other bugs (and cheers :) to stoyan_damov@hotmail.com. I'll be more than glad to fix them. However, if you fix a bug, please send it to me (plus the fix, please)! Thanks!
I hope you haven't read the previous version of the article. Here's why:
ArgumentNullException
Folder property to the IfsFile class
MoveTo method(s) to IfsFile
IfsFile
NetHelper class
The software comes "AS IS", with all faults and with no warranties. Please, take the best disclaimer from any open source license, read it and memorize it. FREE software = NO WARRANTY :) However, I grant you the full rights to do ANYTHING with the source code (except sue me for it :), and the only thing I want is for you to thank me in your mind :)
Last Updated: 28 Sep 2002 Editor: Chris Maunder
Copyright 2002 by Stoyan Damov. Everything else Copyright © CodeProject, 1999-2009