Simple C# Downloader






Connect and download any file from the Internet using .NET and C#
Introduction
In this article, I will describe the steps required to download files from a Web server efficiently. I am assuming that you're somewhat familiar with the general structure of C# as well as with HTTP, especially HTTP headers.
Let's Get Started
There are a couple of steps we need to take in order to download a file from a Web site. From an abstract point of view, when you're talking to an HTTP server you're doing one of two things: sending a request or receiving a response.
.NET References
First of all, remember to import the System.Net namespace so that you can use .NET's WebRequest and WebResponse classes.
using System.Net;
The next thing we'll look at is the HttpUserAgent constant. Its value is sent as the User-Agent header and tells the destination server who we are. You usually want to set it if you're crawling a Web site, because some sites look at this value and enable or disable certain features accordingly.
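As a minimal sketch of how the agent string ends up on a request: the class name UserAgentSketch and the agent string below are hypothetical and used only for illustration. Creating the request does not contact the server; that only happens once GetResponse is called.

```csharp
using System;
using System.Net;

static class UserAgentSketch
{
    // Hypothetical agent string, for illustration only.
    public const string AgentString =
        "ExampleDownloader/1.0 (compatible; illustration only)";

    // Creates a request for an http:// or https:// URL and sets the
    // User-Agent header on it. No network traffic happens here.
    public static HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.UserAgent = AgentString;
        return request;
    }
}
```

The cast assumes an HTTP or HTTPS URL; other schemes such as file:// return a different WebRequest subclass and the cast would fail.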
Cookies
We need to look at the CookieContainer object. It stores the cookies that the Web server sets on a response and sends them back with later requests, so that multiple downloads share one session instead of bombarding the site with a fresh one each time. On the first connection we create a new container; on every later connection we reuse the existing one.
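The following sketch shows one way to share a single container across requests. The names CookieSketch and SharedCookies are assumptions made for this illustration, not part of the sample code in this article. A static field that is initialized once plays the same role as a first-connection flag.

```csharp
using System;
using System.Net;

static class CookieSketch
{
    // One container shared by every request, so cookies set by the server
    // on one response are sent back with the next request to that site.
    public static readonly CookieContainer SharedCookies = new CookieContainer();

    // Creates a request that reads from and writes to the shared container.
    public static HttpWebRequest CreateRequest(string url)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.CookieContainer = SharedCookies;
        return request;
    }
}
```

To see what would be sent to a given site, you can call SharedCookies.GetCookieHeader with the site's Uri; it returns the value of the Cookie header for that address.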
There are a number of items that we need to initialize before we establish a connection. The first item is the HttpWebRequest. We initialize this variable while passing it the URL that we're connecting to. This step can be done later as well.
httpRequest = (HttpWebRequest)WebRequest.Create(siteURL);
The next item is the cookie container. We decide whether to create it by checking a static Boolean variable, IsFirstConnection. If it is true, this is the first connection and we create a new container; otherwise we already have one and reuse it.
if (Downloader.IsFirstConnection)
{
    httpCookie = new CookieContainer();
    Downloader.IsFirstConnection = false;
}
Similarly, we set UserAgent and other settings such as AllowAutoRedirect. Once everything is done, we're ready to connect to the Web server. That's done by:
httpResponse = (HttpWebResponse)httpRequest.GetResponse();
Upon connection, we can check the status code returned from the Web server and deal with any errors. Note that GetResponse throws a WebException for most error statuses (4xx and 5xx), which is why the call sits inside a try block. On status code 200 (OK), we can go ahead and read the HTTP headers as well as the body of the response. I have intentionally left the body-handling section blank since you can parse and format the data as it is downloaded.
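When a request fails, the WebException carries either the server's error response or only a transport-level status such as a timeout. Here is a minimal sketch of telling the two apart; the class and method names are hypothetical and chosen for illustration.

```csharp
using System;
using System.Net;

static class ErrorSketch
{
    // Returns the HTTP status code carried by the exception, or null when
    // the failure happened before any HTTP response arrived (DNS failure,
    // timeout, refused connection and so on).
    public static HttpStatusCode? StatusOf(WebException we)
    {
        HttpWebResponse response = we.Response as HttpWebResponse;
        return response == null ? (HttpStatusCode?)null : response.StatusCode;
    }

    // Builds a short message suitable for a log or an error report.
    public static string Describe(WebException we)
    {
        HttpStatusCode? status = StatusOf(we);
        return status.HasValue
            ? "Server answered with HTTP " + (int)status.Value
            : "No HTTP response: " + we.Status;
    }
}
```

A catch block could pass its exception to Describe and log the result before returning.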
Lastly, we need to close the connection. We put this in the finally section of the code so that even if there is an error, we still close the connection gracefully. Below is the sample code with all of the above put together.
using System.IO;
using System.Net;

namespace SimpleDownloader
{
    class Downloader
    {
        public const string HttpUserAgent = "Sean's Agent/1.0 " +
            "(compatible; SA 1.0; Windows NT 6.0; SLCC1;" +
            " .NET CLR 2.0.50727; .NET CLR 3.0.04506; .NET CLR 1.1.4322)";

        // True until the first request has created the cookie container.
        static bool IsFirstConnection = true;

        // Static so that it shares its lifetime with IsFirstConnection.
        static CookieContainer httpCookie;

        public byte[] ConnectAndDownloadURL(string siteURL)
        {
            HttpWebRequest httpRequest = null;
            HttpWebResponse httpResponse = null;
            byte[] httpHeaderData = null;
            byte[] httpData = null;

            httpRequest = (HttpWebRequest)WebRequest.Create(siteURL);

            // On the first connection we create the cookie container;
            // afterwards we reuse the existing one.
            if (Downloader.IsFirstConnection)
            {
                httpCookie = new CookieContainer();
                Downloader.IsFirstConnection = false;
            }

            httpRequest.CookieContainer = httpCookie;
            httpRequest.AllowAutoRedirect = true;
            httpRequest.UserAgent = Downloader.HttpUserAgent;

            try
            {
                httpResponse = (HttpWebResponse)httpRequest.GetResponse();
                if (httpResponse.StatusCode == HttpStatusCode.OK)
                {
                    httpCookie = httpRequest.CookieContainer;
                    httpHeaderData = httpResponse.Headers.ToByteArray();
                    Stream httpContentData = httpResponse.GetResponseStream();
                    using (httpContentData)
                    {
                        // Now you can do whatever you want with the data here,
                        // e.g. convert it, parse it etc. You can write stuff to httpData.
                        // Until you do, httpData stays null.
                    }
                    return httpData;
                }
                else
                {
                    // Report error
                    return null;
                }
            }
            catch (WebException we)
            {
                // Report error (for example, using we.Status)
                return null;
            }
            finally
            {
                // Runs on success and on failure alike.
                if (httpResponse != null)
                {
                    httpResponse.Close();
                }
            }
        }
    }
}
Please note that the above is only meant to give you a general guideline and a starting point for communicating with a Web server. You can then tweak the settings and variables so that they meet the needs of your particular application.
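The body-reading section that the sample leaves blank could be filled in along these lines. This is only a sketch under the assumption that the whole file fits in memory; the class name StreamSketch, the method name and the 8 KB chunk size are choices made for illustration.

```csharp
using System.IO;

static class StreamSketch
{
    // Copies everything from the source stream into a byte array.
    // A network stream may return fewer bytes than requested on each call,
    // so we keep reading until Read reports 0 (end of stream).
    public static byte[] ReadAllBytes(Stream source)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            byte[] chunk = new byte[8192];
            int read;
            while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
            {
                buffer.Write(chunk, 0, read);
            }
            return buffer.ToArray();
        }
    }
}
```

Inside the using block of the sample you could then assign httpData = StreamSketch.ReadAllBytes(httpContentData). For very large files, writing each chunk straight to a FileStream instead of a MemoryStream avoids holding the whole download in memory.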
Happy coding!
History
- 1st May, 2008: Initial post