
Sending Files in Chunks with MTOM Web Services and .NET 2.0

23 Nov 2007
How to send large files across web services in small chunks using MTOM (WSE 3)

Introduction

To keep up to speed with .NET 2.0, I decided to write a .NET 2.0 version of my Code Project article DIME Buffered Upload, which uses the DIME standard to transfer binary data over web services. The DIME approach is reasonably efficient, but the code is quite complex and I was keen to explore what .NET 2.0 had to offer. In this article, I use version 3.0 of WSE (Web Services Enhancements), which is available for .NET 2.0 as an add-in, to provide a simpler and faster method of sending binary data in small chunks over HTTP web services.

Background

To recap why you might need to send data in small chunks at all: if you want to send a large file across a web service, you need to understand how the web service call passes through IIS and ASP.NET. The file travels as a byte array parameter of the web service call, and the whole call reaches the IIS web server as a single request. This fails if the file is larger than the maxRequestLength configured for your application, or if the request takes long enough to cause an IIS timeout. It also makes it hard to give file transfer feedback in the user interface, because you have no indication of how the transfer is going until it has either completed or failed.

The solution outlined here is to send chunks of the file one-by-one and append them to the file on the server. There is an MD5 file hash done on the client and the server to verify that the file received is identical to the file sent. Also, both upload and download file transfers are included in this article.

Adventures with MTOM

MTOM stands for SOAP "Message Transmission Optimization Mechanism" and it is a W3C standard. To use it and to run this application, you must download and install WSE 3.0, which includes MTOM support for the first time. If you look in the app.config and web.config files in the source code, you will see sections referring to the WSE 3 assembly and a messaging clientMode or serverMode setting. These are necessary to run MTOM in the application.

The problem with DIME is that the binary content of a message is sent outside the SOAP envelope of the XML message. This means that security applied to the SOAP envelope does not necessarily cover the DIME attachment. MTOM fully complies with the other WS-* specifications like WS-Security, so the entire message is secure.

It took me a while to realise that when MTOM is turned on for the client and the server, WSE automatically handles the binary encoding of the data in the web service message. With DIME and WSE 2.0, you had to code your application for DIME by using DimeAttachments. This is no longer necessary; you just send your byte[] as a parameter or a return value of a WebMethod and WSE makes sure that it is sent as raw binary, rather than base64-encoded inside the XML as it would be in the absence of DIME or MTOM.

The User Interface

The client application is fairly straightforward, intended only as a demonstration of how to use the class library and not as a production application. There are options at the top of the form to indicate if you want the file hash check done at the end of the transfer. You can also manually set the chunk size or you can tick the box to AutoSet to let it regulate itself. Any connection error messages are displayed in red below the options. The files are transmitted concurrently, so you can reduce the number of threads available (and thus concurrent transfers) via the NumericUpDown control. This works because I use ThreadPool.QueueUserWorkItem() to manage the multithreading. Note that any changes to the number of threads will not take effect after you have begun the transfer because the threads have already been queued.

To upload one or more files, simply click the Upload button and you're away. A progress bar and status message will appear for each file transfer, disappearing as soon as each file is transferred successfully. If there are errors, such as a file hash difference, the file is left on-screen with the error message. The panel for downloading files is similarly easy to use. It has a list box showing all of the files in the Upload folder on the server; there probably won't be any there by default. Just select the files you want to download (drag a window or Ctrl+click) and click the Download button. You can refresh the files list at any time with the refresh button. To change the save folder, enter a new path in the text box provided.

A Forms-authenticated Web Service?

You can also tick the box for "Login Required" if your website is configured with forms authentication. Normally, you can't use a web service if it is protected by forms authentication. This is because forms authentication is performed via a login ASPX page and an authentication cookie is given to the client browser. These conditions are not web service friendly.

There is a work-around to allow you to protect the web service via forms authentication. The client sends an HttpWebRequest to the login.aspx page and captures the authentication cookie, placing it in the web service objects. Look in the code-behind file login.aspx.cs to see the minor modification that was needed to accept a login via a query string, which is how the HttpWebRequest supplies the credentials. To use forms authentication, just change the authentication section of web.config, which is set to Windows authentication by default. Then you will need to enter a username and password -- admin/admin by default -- in the client application to upload or download a file. The client application will auto-detect if forms authentication is required. If it is, it will tick the box for "Login Required" and focus the username field.
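The cookie capture described above could look roughly like the sketch below. It is only an illustration: the class name, the method names and the "username"/"password" query string parameters are assumptions, not taken from the article's source, so check login.aspx.cs for the names it actually expects.

```csharp
using System;
using System.Net;

// Hypothetical helper; the query string parameter names are assumptions.
public static class FormsLogin
{
    // Builds the login URL with URL-encoded credentials.
    public static string BuildLoginUrl(string loginPageUrl, string user, string password)
    {
        return loginPageUrl
            + "?username=" + Uri.EscapeDataString(user)
            + "&password=" + Uri.EscapeDataString(password);
    }

    // Requests the login page and returns the container holding any cookies
    // the server issued, including the forms authentication cookie.
    public static CookieContainer Login(string loginPageUrl, string user, string password)
    {
        CookieContainer cookies = new CookieContainer();
        HttpWebRequest request =
            (HttpWebRequest)WebRequest.Create(BuildLoginUrl(loginPageUrl, user, password));
        request.CookieContainer = cookies;

        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            // Nothing to read: cookies sent by the server are stored in 'cookies'.
        }
        return cookies;
    }
}
```

The returned container can then be assigned to the CookieContainer property of the web service proxy, so that later calls carry the authentication cookie. Since the credentials travel in the URL, this only makes sense over a connection you trust, such as HTTPS.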

How the Code Works

The web service has two main methods: AppendChunk is for uploading a file to the server and DownloadChunk is for downloading from the server. These methods receive parameters for the file name, the offset of the chunk and the size of the buffer being sent/received.
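As a rough idea of what the server has to do for each call, here is a minimal sketch of the two operations working on plain files. The class name, the signatures and the offset check are assumptions for illustration; the real methods live in MTOM.asmx and may differ.

```csharp
using System;
using System.IO;

// Hypothetical server-side helpers; names and signatures are assumptions.
public static class ChunkStore
{
    // Writes the first 'bytesRead' bytes of 'buffer' to the file at 'offset'.
    public static void AppendChunk(string path, byte[] buffer, long offset, int bytesRead)
    {
        // Offset 0 starts a new file; later chunks open the existing partial file.
        FileMode mode = offset == 0 ? FileMode.Create : FileMode.Open;
        using (FileStream fs = new FileStream(path, mode, FileAccess.Write))
        {
            // Refuse chunks that would leave a gap or overwrite earlier data.
            if (offset != fs.Length)
                throw new IOException("Chunk offset does not match the current file length.");
            fs.Seek(offset, SeekOrigin.Begin);
            fs.Write(buffer, 0, bytesRead);
        }
    }

    // Returns up to 'bufferSize' bytes starting at 'offset'.
    // The array is shorter near the end of the file and empty past it.
    public static byte[] DownloadChunk(string path, long offset, int bufferSize)
    {
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            fs.Seek(offset, SeekOrigin.Begin);
            byte[] buffer = new byte[bufferSize];
            int total = 0;
            int n;
            // Read() may return fewer bytes than requested, so loop until full or EOF.
            while (total < bufferSize && (n = fs.Read(buffer, total, bufferSize - total)) > 0)
                total += n;
            if (total == bufferSize)
                return buffer;
            byte[] trimmed = new byte[total];
            Array.Copy(buffer, trimmed, total);
            return trimmed;
        }
    }
}
```

Opening and closing the file for every chunk keeps the server stateless between calls, which is what lets a dropped transfer be picked up again later.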

The Windows Forms client application can upload a file by sending all of the chunks one after the other using AppendChunk until the file has been completely sent. It does an MD5 hash on the local file and compares it with the hash of the file on the server to ensure that the contents of the files are identical. The download code is very similar, the main difference being that the client must know from the server how big the file is so that it can know when to stop requesting chunks.

A simplified version of the upload code from the Windows Forms client is shown below. Have a look in the code for Form1.cs to see the inline comments and the explanation of the code. Essentially, a file stream is opened on the client for the duration of the transfer. Then the first chunk is read into the Buffer byte array. The while loop keeps running until the FileStream.Read() method returns 0, i.e. the end of the file has been reached. For each iteration, the buffer is sent directly to the web service as a byte[]. The SentBytes variable is used to report progress to the form.

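// Buffer is a byte[] of at least ChunkSize bytes
using(FileStream fs = new FileStream(LocalFilePath, 
    FileMode.Open, FileAccess.Read))
{
    int BytesRead = fs.Read(Buffer, 0, ChunkSize);
    // Read() returns 0 once the end of the file has been reached
    while(BytesRead > 0 && !worker.CancellationPending)
    {
        // pass BytesRead so that only the valid part of the buffer is
        // appended; the last chunk is usually shorter than ChunkSize
        ws.AppendChunk(FileName, Buffer, SentBytes, BytesRead);
        SentBytes += BytesRead;
        BytesRead = fs.Read(Buffer, 0, ChunkSize);
    }
}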
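The download direction mirrors this loop, except that the client stops once it has received the file size reported by the server. The sketch below shows that idea with a delegate standing in for the web service call; ChunkedDownload, ChunkSource and Download are hypothetical names, not part of MTOM_Library.

```csharp
using System;
using System.IO;

// Hypothetical client-side download loop; all names are assumptions.
public static class ChunkedDownload
{
    // Stand-in for the web service call: returns up to 'size' bytes from 'offset'.
    public delegate byte[] ChunkSource(long offset, int size);

    // Requests chunks until 'fileSize' bytes have been written to 'destination'.
    // Returns the number of bytes written.
    public static long Download(ChunkSource source, long fileSize, int chunkSize, Stream destination)
    {
        long received = 0;
        while (received < fileSize)
        {
            byte[] chunk = source(received, chunkSize);
            if (chunk.Length == 0)
                throw new IOException("No data returned before the expected end of the file.");
            destination.Write(chunk, 0, chunk.Length);
            received += chunk.Length;
        }
        return received;
    }
}
```

In a real client, the delegate would wrap the proxy's DownloadChunk call and the destination would be a FileStream in the save folder.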

Setting the Chunk Size

In many Windows Forms applications, regular feedback to the user is very important. Having a responsive and visually communicative application is usually worth a small sacrifice in performance. Feedback for file transfers is typically given via a progress bar and/or a status bar message. The web service call wrapped around each chunk is pure overhead: the client constructs and sends the SOAP message, and then the server receives and parses it before sending the response. If the chunk size is very small, e.g. 2 KB, then there is a lot of messaging going on and not much data transfer.

It should be clear then that we should aim for the largest chunk size that still meets our requirements for quick user interface feedback. I have aimed for each chunk to be completed in 800 milliseconds. You can adjust this setting programmatically before the file transfer; see the PreferredTransferDuration variable in the file transfer object. The client regulates the chunk size automatically, to ensure that each chunk is completed in the desired time. This is done by sampling every 15th chunk (also adjustable, see ChunkSizeSampleInterval) and adjusting the chunk size based on the time it took to transfer that chunk.

The overall result is a self-controlled file transfer that will adapt to changing network conditions during the transfer. One useful feature is that the web service provides the MaxRequestLength setting on the server, which the client retrieves before the transfer in order to stay within acceptable request sizes on the server.
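One simple way to do this kind of regulation is to scale the chunk size by the ratio of the preferred duration to the measured duration, and then clamp it to the server's request limit. The sketch below assumes that proportional rule; the library's actual calculation may differ, and ChunkSizeTuner and NextChunkSize are hypothetical names.

```csharp
using System;

// Hypothetical chunk size regulator; the proportional rule is an assumption.
public static class ChunkSizeTuner
{
    // Scales the chunk size so that a chunk should take roughly 'preferredMs'
    // to transfer, then clamps the result to [minBytes, maxRequestBytes].
    public static int NextChunkSize(int currentBytes, double measuredMs, double preferredMs,
                                    int minBytes, int maxRequestBytes)
    {
        if (measuredMs <= 0)
            return currentBytes; // no usable sample, keep the current size

        double scaled = currentBytes * (preferredMs / measuredMs);
        if (scaled < minBytes) return minBytes;
        if (scaled > maxRequestBytes) return maxRequestBytes;
        return (int)scaled;
    }
}
```

For example, a 16 KB chunk that took 400 ms against an 800 ms target would grow to 32 KB for the following chunks. In practice you would keep the upper bound somewhat below the server's limit, since the SOAP message adds to the size of each request.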

Resume Transfers Supported

The application also supports resuming a failed upload or download. I use this application to copy a system image (20+ GB) from my web server to my home PC. Obviously, there is a good chance of a connection being dropped during such a lengthy transfer, even with a 3 Mbit bidirectional line. Resume support is a must when dealing with such large files and it is very simple to include resume support in this application.

Because data is only written to the file after it has been successfully received, we can be confident in resuming a file transfer based on the size of the partial file. In theory and in practice, this works perfectly. Click File > Resume Upload or File > Resume Download, etc. to locate the partial file you wish to resume. After the file transfer, an MD5 hash is requested from the server and the client to compare the files. The timeout value is increased for the hash check, but it is always possible that it will time out when checking such a large file. So you might have transferred the correct number of bytes, but have no way of verifying that they are the right bytes.
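The resume logic boils down to two steps: take the length of the partial file as the starting offset, and position the source stream at that offset before the chunk loop starts. A minimal sketch under those assumptions follows; ResumeHelper and its methods are hypothetical names, and for an upload the partial file length would have to be obtained from the server rather than from the local disk.

```csharp
using System;
using System.IO;

// Hypothetical resume helpers; names are assumptions.
public static class ResumeHelper
{
    // Returns the offset to resume from: the length of the partial file,
    // or 0 if no partial file exists yet.
    public static long GetResumeOffset(string partialFilePath)
    {
        FileInfo info = new FileInfo(partialFilePath);
        return info.Exists ? info.Length : 0;
    }

    // Opens the source file positioned at 'offset', ready for the chunk loop.
    public static FileStream OpenSourceAt(string sourcePath, long offset)
    {
        FileStream fs = new FileStream(sourcePath, FileMode.Open, FileAccess.Read);
        if (offset > fs.Length)
        {
            fs.Dispose();
            throw new IOException("The partial file is longer than the source file.");
        }
        fs.Seek(offset, SeekOrigin.Begin);
        return fs;
    }
}
```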

I have included a manual MD5 hash check, available in the File menu. If the server is not giving the file hash within the timeout limit, you could run the client application directly on the server and check it locally, thus overcoming the timeout problem.
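For reference, an MD5 hash of a file can be computed with the classes in System.Security.Cryptography, as in the sketch below. FileHasher and ComputeMd5 are hypothetical names, and the lowercase hex output format is an assumption; whatever format you choose, the client and the server must use the same one for the comparison to work.

```csharp
using System;
using System.IO;
using System.Security.Cryptography;

// Hypothetical helper; the name and the hex output format are assumptions.
public static class FileHasher
{
    // Returns the MD5 hash of the file as a lowercase hex string.
    public static string ComputeMd5(string path)
    {
        using (MD5 md5 = MD5.Create())
        using (FileStream fs = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            // ComputeHash reads the stream to the end, so large files
            // are hashed without loading them fully into memory.
            byte[] hash = md5.ComputeHash(fs);
            return BitConverter.ToString(hash).Replace("-", "").ToLowerInvariant();
        }
    }
}
```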

Incorporating into Your Own Application

To use this code in your own client application, simply add a reference to MTOM_Library.dll, which is included in the article download. Then you should use the example client application as a starting point. It shows how to use the classes provided in order to perform the file transfer. Refer to the UploadFile and DownloadFile methods of Form1.cs in particular.

You use the FileTransferUpload class to do an upload and the FileTransferDownload class to do a download. Each class has settings such as ChunkSize, AutoSetChunkSize, LocalSaveFolder, IncludeHashVerification, etc. You can configure these as needed in your code. To change the location of the web service, make sure that the app.config file is deployed to your run directory. Change the MTOM_Library_MtomWebService_MTOM setting to the location of your web service.

Your ASP.NET application will also need to host the MTOM.asmx web service and the Login.aspx code-behind if you use forms authentication. To configure the web application for MTOM, just copy the settings from the web.config file included in this application. You can change the Upload folder on the server in web.config to either a relative path or absolute path. If you set a relative path, the server must be able to MapPath() to the folder. Files are downloaded from and uploaded to this folder.

Points of Interest

The BackgroundWorker Class in .NET 2.0

.NET 2.0 has a great new class called BackgroundWorker to simplify running tasks asynchronously. Although this application sends the file in small chunks, even these small chunks would block the Windows Forms application and make it look crashed or "hung" during the transfer. So, the web service calls still need to be done asynchronously. The BackgroundWorker class works using an event model, where you handle DoWork (the task itself), ProgressChanged (to update your progress/status bar) and RunWorkerCompleted (raised when the task has finished or failed).

You can pass parameters to the DoWork method, which you could not do with the Thread class in .NET 1.1. I know you could do it with delegates, but delegates aren't great for thread control. You can also access the return value of DoWork in the RunWorkerCompleted event handler. So for once, Microsoft has thought of everything and made a very clean threading model. Exceptions thrown in DoWork are caught internally and you can access them in the RunWorkerCompleted handler via the RunWorkerCompletedEventArgs.Error property.
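The event model described above can be tried out in a small console-style sketch like the one below. It simulates a transfer rather than calling the web service, and WorkerDemo and Run are hypothetical names. Blocking on the wait handle is only safe here because there is no UI thread involved; in a Windows Forms application you would let the RunWorkerCompleted handler update the form instead of waiting.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

// Hypothetical console demonstration of the BackgroundWorker events.
public static class WorkerDemo
{
    // Runs a simulated transfer of 'chunkCount' chunks on a BackgroundWorker,
    // blocks until it has completed and returns the outcome message.
    public static string Run(int chunkCount)
    {
        string outcome = null;
        ManualResetEvent done = new ManualResetEvent(false);
        BackgroundWorker worker = new BackgroundWorker();
        worker.WorkerReportsProgress = true;

        worker.DoWork += delegate(object sender, DoWorkEventArgs e)
        {
            int chunks = (int)e.Argument; // the parameter passed to RunWorkerAsync
            for (int i = 1; i <= chunks; i++)
            {
                // A real transfer would send one chunk to the web service here.
                ((BackgroundWorker)sender).ReportProgress(i * 100 / chunks);
            }
            e.Result = chunks + " chunks sent"; // becomes e.Result on completion
        };

        worker.ProgressChanged += delegate(object sender, ProgressChangedEventArgs e)
        {
            Console.WriteLine("Progress: " + e.ProgressPercentage + "%");
        };

        worker.RunWorkerCompleted += delegate(object sender, RunWorkerCompletedEventArgs e)
        {
            // Check Error first: reading Result after a failure throws.
            outcome = e.Error != null ? "Failed: " + e.Error.Message : (string)e.Result;
            done.Set();
        };

        worker.RunWorkerAsync(chunkCount);
        done.WaitOne();
        return outcome;
    }
}
```

Without a Windows Forms message loop, the progress and completion events are raised on thread pool threads, so the progress lines may not print in strict order.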

Under a typical scenario when integrating this code into your app, you would drag a FileTransferUpload component onto your Windows Form and call RunWorkerAsync() on it to begin the transfer. Since I'm using ThreadPool to queue up all the transfers, the code to set up each file transfer is already running on a background thread. So, I want to run it in blocking/synchronous mode. To achieve this, I added a RunWorkerSync() method to the FileTransferUpload and FileTransferDownload classes. This is also useful to use for console applications where you generally want all code to be synchronous.

A Good Example of Thread.Join()

When the upload or download is complete, the client asks for an MD5 hash of the file on the server. It can thus compare it with the local file to make sure that they are identical. I originally did these in sequence, but it can take a few seconds to calculate the result for a large file, i.e. anything over a few hundred MB. So, the application was waiting five seconds for the server to calculate the hash and then five more seconds for the client to calculate its own hash.

This made no sense, so I decided to implement a multi-threaded approach to allow them to run in parallel. While the client is waiting on the server, it should be calculating its own file hash. This is done with the Thread class and use of the Join() method, which blocks execution until the thread is complete. The code below shows how this is accomplished:

// start calculating the local hash (stored in class variable)
this.hashThread = new Thread(new ThreadStart(this.CheckFileHash));
this.hashThread.Start();

// request the server hash
string ServerFileHash = ws.CheckFileHash(FileName);

// wait for the local hash to complete
this.hashThread.Join();

if(this.LocalFileHash == ServerFileHash)
    e.Result = "Hashes match exactly";
else
    e.Result = "Hashes do not match";

There is a good chance that the two operations will finish at approximately the same time, so very little waiting around will actually happen.

Common Problems and Questions

Visual Studio Compile Errors

There have been dozens of questions from people who were unable to compile the solution in Visual Studio. You may get an error like this:

The type or namespace name 'MTOMWse' does not exist 
    in the namespace 'UploadWinClient.MtomWebService'. 

With an MTOM-enabled web service, Visual Studio is supposed to generate two proxy classes: a standard one derived from System.Web.Services.Protocols.SoapHttpClientProtocol and a WSE class derived from Microsoft.Web.Services3.WebServicesClientProtocol, with "Wse" tagged onto the end of the proxy class name.

Sometimes Visual Studio misbehaves and does not generate this class. I don't understand why, but the work-around is to "Show all files" in the Windows Forms project and expand the web service > Reference.map > Reference.cs. Edit this file and change public partial class MTOM : System.Web.Services.Protocols.SoapHttpClientProtocol to public partial class MTOMWse : Microsoft.Web.Services3.WebServicesClientProtocol. Also, make sure to update the constructor to match the new class name. Then it should compile fine.

Can You Make a Web Client? No, No and Double No!

I have also received a ton of requests for a web client instead of a Windows Forms client. This is fundamentally impossible because of the file I/O that this solution requires on the client. For good reason, browsers do not provide this level of access to the file system of the client. A guy called Brettle wrote a progress bar control for ASP.NET file uploading. This may be your best bet, although you must understand that web applications are very limited when it comes to sending large amounts of data to the web server.

Conclusions

I found that MTOM was about 10% faster than DIME in my limited testing. This is probably due to the need to package up each chunk into a DIME attachment, which is no longer necessary with MTOM. Remember, if you want to send chunks larger than 4 MB, you must increase the ASP.NET maximum request size, i.e. the maxRequestLength attribute of the httpRuntime element in your web.config file, which defaults to 4096 KB. Feel free to use this code and modify it as you please. Please post a comment for any bugs, suggestions or improvements. Enjoy!

Please note that the source solution is in Visual Studio 2008 file format. Rick Strahl has a good blog post on the difference in the file format. You could also try Google to find a tool to convert from the 2008 to the 2005 format.

History

  • 29 Dec 2005 - Original version posted
  • 14 Feb 2007 - Updated source download
  • 17 July 2007 - Updated source download; also added app.config entry for MTOM ClientMode=On, otherwise MTOM is not used!
  • 19 November 2007 - Updated client application to support concurrent transfers

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
