A single-threaded HTTP server

Alexey A. Popov

Nov 30, 2005

CPOL

An implementation of a simple, single-threaded HTTP server.

Download source files - 31.1 KB

Sample Image - single_threaded_nhttpd.png

Introduction

This article describes a simple HTTP server named 'The .NET HTTP Daemon', or NHttpD for short, that runs in a single thread and can be embedded into any kind of a thick client application. To understand this article and its concepts, you should be familiar with the .NET platform, C# language, and have a minimal background in C language and network programming.

The source code is partially based on the work of Sam Pullara (http://www.sampullara.com/). Some ideas for the Pvax.Net namespace are borrowed from: http://squirl.nightmare.com/medusa/async_sockets.html.

Disclaimer

This application is not a real, full-fledged Web Server for .NET; it's just a toy. It surely contains bugs and is inherently insecure. You must not use it to perform mission-critical tasks like running a nuclear plant or managing a city water supply. To tell the truth, you shouldn't even host your own three-file homepage on it. If you do so, do it at your own risk.

If you need a real thing, use MS IIS or XSP (http://www.mono-project.com/).

The goal of the article

I'm working on a network communication project. The software requires a kind of Web Server that delivers application-specific information to the clients through HTTP (Hypertext Transfer Protocol). Although I had some background in networking, I wanted to get acquainted with HTTP and network server design, so I started to write my own Web Server. I also wanted to avoid multithreading, mainly because of possible synchronization issues.

I was surprised how simple the task turned out.

HTTP protocol basics

A Web server is an application that delivers hypermedia information (from plain text to video and applications) to its clients using HTTP. Web Servers have become so widespread nowadays that you can even find one in your fridge. I'm not sure, but you should check, just in case.

There are three versions of the HTTP protocol available - 0.9, 1.0, and 1.1. Version 0.9 is outdated, and I haven't heard about servers or clients supporting it.

Protocol v. 1.0 uses sessions like:

The client (a web browser) connects to the server, usually using TCP port 80, and the server accepts the connection;
The client requests a file or other resource by sending an HTTP request (a bunch of text) to the server;
The server parses the text, sends an HTTP response (a bunch of text that may be followed by a block of binary data), and closes the connection;
The client receives the response and closes the connection.

This scheme is simple, but has a huge overhead - most files being sent over the connection are small, hundreds of bytes. A typical web page consists of at least one HTML file and a couple of images, and for each piece, the client opens a new connection. This issue is addressed by the HTTP 1.1 protocol that introduces keep-alive connections.

In general, the protocol v. 1.1 keep-alive session looks like:

The client (a web browser) connects to the server, usually using TCP port 80, and the server accepts the connection;
The client requests the data it needs, file by file, by sending HTTP requests;
The server parses the request and sends an HTTP response, but if it can, it does not close the connection;
The client receives the response and, possibly, generates a new request;
When the number of incoming connections crosses some boundary, the server starts to send a special command (the Connection: close header) back to the clients, asking them to close the connections;
If the client receives such a command, it closes the connection. It may also close the connection under other circumstances.

Keep-alive sessions use fewer resources and are obviously more efficient, so most modern browsers use them by default.

A typical HTTP request is a block of text lines: a request line and a bunch of optional request headers. The request line consists of a method name, the resource identifier, and the protocol version. HTTP 1.0 and HTTP 1.1 define a lot of methods, but my NHttpD server implements only one, the GET method.

A request header line consists of the header name and the header value, separated by a colon (':' character). Headers contain a lot of useful information, but again, my server uses only the Keep-Alive value.

The request's end is identified by an empty line.
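The request format described above can be sketched as a small parser. This is only an illustration under stated assumptions, not NHttpD's actual code: the names ParsedRequest and RequestParser.Parse are hypothetical, and the sketch assumes the whole request text (terminated by CRLF line endings) is already available as a string.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical container for a parsed request (not part of NHttpD).
public sealed class ParsedRequest
{
    public string Method = "";
    public string Resource = "";
    public string Version = "";

    // Header names are case-insensitive in HTTP.
    public readonly Dictionary<string, string> Headers =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase);
}

public static class RequestParser
{
    // Parses request text that ends with an empty line.
    // Throws FormatException if the request line is malformed.
    public static ParsedRequest Parse(string text)
    {
        string[] lines = text.Split(new string[] { "\r\n" }, StringSplitOptions.None);

        // Request line: method name, resource identifier, protocol version.
        string[] parts = lines[0].Split(' ');
        if (parts.Length != 3)
            throw new FormatException("Malformed request line: " + lines[0]);

        ParsedRequest request = new ParsedRequest();
        request.Method = parts[0];
        request.Resource = parts[1];
        request.Version = parts[2];

        for (int i = 1; i < lines.Length; i++)
        {
            if (lines[i].Length == 0)
                break;                      // the empty line ends the request

            int colon = lines[i].IndexOf(':');
            if (colon <= 0)
                continue;                   // skip lines that are not "name: value"

            string name = lines[i].Substring(0, colon).Trim();
            string value = lines[i].Substring(colon + 1).Trim();
            request.Headers[name] = value;
        }
        return request;
    }
}
```

For example, RequestParser.Parse("GET /index.html HTTP/1.1\r\nConnection: Keep-Alive\r\n\r\n") yields the method GET, the resource /index.html, and a single header.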

A typical HTTP response begins with a block of text: a status line (for example, HTTP/1.1 200 OK) followed by at least one response header line, arranged the same way as a request header line. The headers are followed by an empty line and, optionally, by a block of binary or text data. It can be an HTML file, an image, even a piece of code. In general, the client must interpret it on its own.

The response always has a header named Content-Length that defines the size of the data block in bytes. If the server doesn't send any data back to the client, it must include the Content-Length header anyway and set it to 0.
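The response layout can be sketched the same way. The helper below is a hypothetical illustration (ResponseBuilder.Build is not taken from the NHttpD sources): it writes a status line, the Content-Length header, the empty line, and then the data block. Real responses usually carry more headers than this.

```csharp
using System;
using System.Text;

public static class ResponseBuilder
{
    // Builds a complete response: status line, Content-Length header,
    // empty line, then the (possibly empty) data block.
    public static byte[] Build(int statusCode, string reason, byte[] body)
    {
        StringBuilder head = new StringBuilder();
        head.Append("HTTP/1.1 ").Append(statusCode).Append(' ').Append(reason).Append("\r\n");
        head.Append("Content-Length: ").Append(body.Length).Append("\r\n");
        head.Append("\r\n");

        // The header block is plain ASCII text.
        byte[] headBytes = Encoding.ASCII.GetBytes(head.ToString());

        byte[] result = new byte[headBytes.Length + body.Length];
        Buffer.BlockCopy(headBytes, 0, result, 0, headBytes.Length);
        Buffer.BlockCopy(body, 0, result, headBytes.Length, body.Length);
        return result;
    }
}
```

Calling Build(404, "Not Found", new byte[0]) still produces a Content-Length header, with the value 0.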

I think this information is more than enough to write an HTTP server. If you are interested in details, take a look at the protocol specifications. If you are not interested in it at all, use the BCL classes System.Net.HttpWebRequest and System.Net.HttpWebResponse that do all the dirty work for you.

How does a network server work?

A network server is a background application, often a service (or a daemon in the Unix world), that listens on a network port for incoming connections and performs various tasks depending on the implemented protocol and the connected client's needs. There are many ways of implementing a server, but the simplest ways are usually the worst.

Classical high level approach

I remember the good old times when Java just came out. Writing network clients in Java was a breeze. You spawn a thread, resolve the server's name, connect to it, and exchange data using well-known stream I/O operations. Everything worked perfectly until I started to implement a network server. At first glance, it was easy - spawn a thread, create a ServerSocket, bind it to the local address, start to listen for incoming connections. When somebody connects to the server, you accept the connection, spawn a worker thread, and return to listening. The worker thread performs the data exchange with the client application, closes the connection when it's not needed anymore, finishes, and finally gets garbage collected.

It worked. It worked well. It worked smoothly. But only in a testing environment. In the real world, this approach failed miserably - such a Java server could bring a decent computer of those times to its knees when about a hundred clients tried to connect simultaneously. A typical communication between a server and a client in the scenario above took about two seconds. This means that the server would spawn hundreds of threads. A thread in Win32 is rather lightweight compared to processes and kernel-mode threads in Unix, but still...

I know that later the situation in Java was greatly improved, but by that time, I had already left the world of Java programming.

.NET approach

Apparently, Microsoft was aware of this fault of early Java, and introduced the so-called thread pool. The idea is that usually you need a worker thread for a very short period of time (my case described above was not usual), and it's not wise to create and destroy kernel objects that frequently. So they introduced a model in which, when you need to perform a short-lived background task (called a work item), the pool creates a new thread; when the work item has finished its business, the thread is not destroyed but gets suspended. The next time you queue another work item, the pool reuses the thread.

More than that, Microsoft also introduced the so-called asynchronous calls. The idea is that you ask a network socket or a file or some other object to start a potentially lengthy operation and specify a callback method that gets called when the operation is finished. In the callback, you ask the object to provide you with the result of the operation. Or you can use a synchronization object to determine if the operation is still being performed.

As far as I can tell from the Rotor code for sockets, Microsoft uses overlapped I/O, which is even more efficient than sending and receiving data in a separate thread.

The synchronization issue

However, there's a small drawback - whether you use threads, the thread pool, or overlapped I/O, callback methods get called in the context of another thread. It means that you must synchronize the access to shared data. You may use classes like Monitor, Mutex, or AutoResetEvent, but they all map to OS resources. It means that you:

waste (or, say, use a lot more of) system resources;
should be very careful with the garbage collector and never forget to call Dispose() for these objects;
may introduce hard-to-debug race conditions.

I'm not as extreme as the anonymous Unix guru who created the web site called 'Threading is Evil', but he definitely has a point. A server application may perform work for hundreds and thousands of clients at once, using thousands and tens of thousands of kernel objects for threads and synchronization primitives. If we manage to make our server single threaded, we literally save all these objects, a lot of virtual memory, a lot of CPU cycles that would be wasted on context switching and synchronization calls, and thus make our server smaller and faster. The question is how?

Enter select() almighty

In 1983, the University of California at Berkeley, the famous creators of Berkeley Unix, introduced a new networking library known as Berkeley sockets as part of the 4.2BSD release. This library defined a simple and reliable API that nowadays is implemented in all popular operating systems.

The API defines the term socket as an active network endpoint. Sockets can be created and destroyed, can be read and written, and look much like file handles. In fact, most Unix-like systems treat sockets as files and vice versa. Unfortunately, this is not true for Windows. It has a networking library named Windows Sockets based on Berkeley sockets, but it differs, especially in earlier versions, because it was developed for Windows 3.x where there were no threads and even multitasking was cooperative.

If you ever looked at the System.Net.Sockets namespace, you should know how this API looks in its .NET incarnation. The core class Socket completely encapsulates the Windows/Berkeley Sockets API. According to Lutz Roeder's Reflector, .NET uses only Berkeley-compatible calls, so from here on, I do not distinguish Windows and Berkeley sockets and use only the latter's terminology.

The Sockets API contains a special call, select(), that allows a developer to check the status of multiple sockets at once. This call was essential for Unix programmers because at the time when sockets were invented, hardly any Unix flavor had multithreading support. It was also essential in Win16 for the same reason.

Here's the key. A server could accept hundreds of connections, but serve, at a time, only those sockets that are ready to be read or written, with only one thread and without any synchronization. This technique is called multiplexing.

Multiplexing

Those of you who have programmed pure Petzold-style Windows applications may feel nostalgia. I do. The application that uses multiplexing becomes a big loop with the select() call in its core. That loop would call the proper 'event handlers' that perform data receiving, parsing, and other tasks for a particular socket. It looks much like a pure C Windows application's 'message loop'.

If you looked at, for instance, a Unix network server's source code, you would notice that the code looks extremely ugly and unclear. I think that is the reason why many network applications for Unix nowadays are written in languages like Python or Ruby; the code looks much clearer and nicer.

In the .NET BCL, the Socket class has a static method named Select() that wraps the plain select() call and makes its use very convenient. You pass four parameters: three of type IList holding the sockets to be checked for readability, for writability, and for the error state, and a wait timeout in microseconds. The call waits for a socket's state to change, then modifies the ILists so they contain only those Socket objects that can be read, can be written, or are in the error state. The calling code just has to run a foreach statement over all three lists. Again, comparison to pure C code shows the advantages of the C# language.
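A minimal sketch of such a call follows. The helper name SelectDemo.WaitReadable is hypothetical and checks only the read list; Socket.Select() itself is the real BCL method. Note that Select() throws if all three lists are null or empty, and that it modifies the lists it is given, so the sketch passes a copy.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Net.Sockets;

public static class SelectDemo
{
    // Returns those sockets from 'candidates' that are ready to be read,
    // waiting at most 'timeoutMicroseconds' for a state change.
    public static List<Socket> WaitReadable(ICollection<Socket> candidates, int timeoutMicroseconds)
    {
        List<Socket> ready = new List<Socket>();
        if (candidates.Count == 0)
            return ready;               // Select() rejects three empty lists

        // Select() removes the sockets that are NOT ready from the list
        // it receives, so hand it a copy instead of the caller's collection.
        ArrayList checkRead = new ArrayList();
        foreach (Socket s in candidates)
            checkRead.Add(s);

        Socket.Select(checkRead, null, null, timeoutMicroseconds);

        foreach (Socket s in checkRead)
            ready.Add(s);
        return ready;
    }
}
```

For a listening socket, "ready to be read" means that a connection is waiting to be accepted.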

Non-blocking sockets

Another useful 'trick' is to put a socket into non-blocking mode. It means that any call to the send() and recv() functions (the Socket.Send() and Socket.Receive() methods, respectively, in .NET) returns immediately. For recv(), if the buffer associated with the socket contains any data, the call returns it; otherwise, it returns the EWOULDBLOCK error code (the SocketError.WouldBlock constant in .NET 2.0). For send(), if the buffer has any space for data, the call accepts it; otherwise, it also returns EWOULDBLOCK.

Non-blocking sockets are good. However, the idea of an error code that actually reports a state, not an error, has a nasty side effect in .NET. This platform uses exceptions to signal errors, not states. Thus, if you turn a Socket to non-blocking mode by setting its Blocking property to false, be ready to catch SocketException exceptions and test their ErrorCode property (SocketErrorCode in .NET 2.0) for the EWOULDBLOCK value. Obviously, exceptions are heavyweight compared to error codes, so do your best to avoid them - call the Socket.Select() static method first and touch only the sockets it reports as ready.
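The exception filtering described above could look like the following sketch. It targets .NET 2.0 or later, where SocketException.SocketErrorCode and the SocketError enumeration exist; the helper name NonBlockingIo.TryReceive and its convention of returning -1 for "no data yet" are assumptions made for this illustration.

```csharp
using System.Net.Sockets;

public static class NonBlockingIo
{
    // Reads from a socket whose Blocking property is false.
    // Returns the number of bytes read (0 means the peer closed the
    // connection), or -1 if no data is available right now.
    public static int TryReceive(Socket socket, byte[] buffer)
    {
        try
        {
            return socket.Receive(buffer, 0, buffer.Length, SocketFlags.None);
        }
        catch (SocketException e)
        {
            // WouldBlock reports a state, not a failure: nothing to read yet.
            if (e.SocketErrorCode == SocketError.WouldBlock)
                return -1;
            throw;  // every other code is a real error
        }
    }
}
```

.NET 2.0 also added Receive() and Send() overloads with an out SocketError parameter, which are another way to see the error code.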

There is another issue with non-blocking sockets - you cannot use the very convenient NetworkStream class for data exchange because it is designed to use only blocking sockets. You have to send and receive blocks of bytes manually. You can find sources of many .NET networking programs on the Internet. They either use high-level classes like TcpClient and NetworkStream, or look like a horrible mess of byte array manipulations and hacks involving the Array, BitConverter, and Buffer classes. However, this can be avoided. I'll show you how a little later.

HTTP server object model

Now we know all we need to write a Web Server. First, we should define its object model and then provide its implementation.

Note that all classes written for the server are not thread safe.

Multiplexer class

This sealed (or static for C# 2.0) class is an application-wide Registry of active sockets. Objects that work with sockets may add themselves to this registry and start receiving notifications about the state of their associated sockets. When the time comes, the sockets should be closed and the objects should remove themselves from the Registry. It's rather funny - you create an object, register it in the Multiplexer, and it becomes "alive", almost like a thread.

Notifications are sent from the static Poll() method. The class does not implement a poll loop of any kind; it's your responsibility, as a developer, to call this method periodically. For example, in a Windows Forms application, it can be done by putting a System.Windows.Forms.Timer object on your main form and registering a Tick event handler. If there is at least one object registered, Poll() returns true; if there are no sockets registered, it returns false - maybe it's time to shut down your application.

The Multiplexer is global. In theory, I should have it implemented as a per-thread singleton, but I believe it's overkill. Multiplexed threads - what a mess!

IMultiplexed interface

The Multiplexer is closely related to the IMultiplexed callback interface. Only those objects that implement this interface can be registered in the Multiplexer and receive notifications.

At the beginning, I used a delegate type for three callback methods, but had to implement all three of them over and over. And the AddSocket() call looked extremely ugly. Eventually, I decided to declare a callback interface with three methods, one for each notification.

According to Microsoft's naming guidelines, this interface should have been named IMultiplexable, but I couldn't pronounce this word.
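The relationship between the Multiplexer and IMultiplexed can be sketched as follows. This is not the code from the archive: the names ReadyRead, ReadyWrite, ErrorState, AddSocket(), and Poll() come from this article, but every signature, the RemoveSocket() method, and the CountingHandler class are assumptions made for the illustration.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Net.Sockets;

// Callback interface; the parameter lists are assumed.
public interface IMultiplexed
{
    void ReadyRead(Socket socket);
    void ReadyWrite(Socket socket);
    void ErrorState(Socket socket);
}

public static class Multiplexer
{
    private static readonly Dictionary<Socket, IMultiplexed> registry =
        new Dictionary<Socket, IMultiplexed>();

    public static void AddSocket(Socket socket, IMultiplexed handler)
    {
        registry[socket] = handler;
    }

    public static void RemoveSocket(Socket socket)
    {
        registry.Remove(socket);
    }

    // One polling pass. Returns false when nothing is registered.
    public static bool Poll(int timeoutMicroseconds)
    {
        if (registry.Count == 0)
            return false;

        // Select() shrinks the lists in place, so each one is a fresh copy.
        // For simplicity every socket is checked for all three states; a real
        // server would check writability only where output is pending.
        ArrayList read = new ArrayList(registry.Keys);
        ArrayList write = new ArrayList(registry.Keys);
        ArrayList error = new ArrayList(registry.Keys);
        Socket.Select(read, write, error, timeoutMicroseconds);

        // A handler may unregister sockets, so look each one up again.
        IMultiplexed handler;
        foreach (Socket s in error)
            if (registry.TryGetValue(s, out handler)) handler.ErrorState(s);
        foreach (Socket s in read)
            if (registry.TryGetValue(s, out handler)) handler.ReadyRead(s);
        foreach (Socket s in write)
            if (registry.TryGetValue(s, out handler)) handler.ReadyWrite(s);
        return true;
    }
}

// Hypothetical handler that only counts notifications.
public sealed class CountingHandler : IMultiplexed
{
    public int Reads, Writes, Errors;
    public void ReadyRead(Socket socket) { Reads++; }
    public void ReadyWrite(Socket socket) { Writes++; }
    public void ErrorState(Socket socket) { Errors++; }
}
```

The main loop then reduces to something like while (Multiplexer.Poll(100000)) { }, all in a single thread.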

NonBlockingSocket and SocketError classes

As I said above, using a SocketException to report the state of the Socket wasn't a very brilliant idea. Even worse, in .NET 1.0 and .NET 1.1, Microsoft defined the SocketException.ErrorCode's type as System.Int32. Thus, in 1.1, you have to declare WinSock error codes as System.Int32 values, not as enums. In .NET 2.0, they fixed the situation by adding a new typed property SocketErrorCode to SocketException and defining an enumeration SocketError with symbolic names for all WinSock error codes including EWOULDBLOCK. Because my primary development platform is still .NET 1.1, I have to define my own version of the SocketError as a pseudo-enumeration.

A NonBlockingSocket class is a thin wrapper around the Socket. Those methods of the Socket that can throw the SocketException with EWOULDBLOCK ErrorCode are wrapped in try/catch statements that filter out the state error codes. To help the developer to remember that the NonBlockingSocket is merely a wrapper, I introduced a static factory method NonBlockingSocket.FromSocket() and made the constructor private. The class also maintains a map of its objects so a NonBlockingSocket object is created for a particular Socket only once. The drawback here is that you must always call the NonBlockingSocket.Close() method.

CircularBuffer class

This low-level class represents a simple buffer that can grow and be accessed as a FIFO queue of bytes.

The PutBytes() method accumulates blocks of raw data at the beginning of the buffer. The GetBytes() method returns a block of data from the end of the buffer. However, because I use non-blocking send operations, I cannot know before the actual call how many bytes will be sent, so I use the GetBuffer() method that gives me direct access to the underlying bytes, and after sending something, I use the DropBytes() method that discards the sent block from the buffer.

An interesting method of this class is FindDelimiter(). It searches the buffer for a byte pattern. I'm not aware of any network protocol that exchanges blocks of data of fixed length, so we deal either with blocks with prefixed length or blocks with delimiters. The FindDelimiter() method performs this task. If it returns a non-negative integer, the buffer contains at least one complete message.

An interesting detail - the HTTP protocol uses both delimited and length-prefixed messages, and it sometimes makes life difficult.

One should be very careful while working with objects of this class; it is very easy to corrupt the data in such a buffer.
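To make the idea of delimiter search concrete, here is a deliberately simplified stand-in for such a buffer. It is not the CircularBuffer from the archive: the class name ByteQueue is hypothetical, the method signatures are guesses, and the storage is a plain List<byte> rather than a circular array, which is slower but easier to follow.

```csharp
using System;
using System.Collections.Generic;

// Simplified FIFO queue of bytes (hypothetical, non-circular).
public sealed class ByteQueue
{
    private readonly List<byte> data = new List<byte>();

    public int Count { get { return data.Count; } }

    // Appends 'count' bytes of 'block', starting at 'offset'.
    public void PutBytes(byte[] block, int offset, int count)
    {
        for (int i = 0; i < count; i++)
            data.Add(block[offset + i]);
    }

    // Removes and returns the oldest 'count' bytes.
    public byte[] GetBytes(int count)
    {
        if (count < 0 || count > data.Count)
            throw new ArgumentOutOfRangeException("count");
        byte[] result = data.GetRange(0, count).ToArray();
        data.RemoveRange(0, count);
        return result;
    }

    // Returns the offset of the first occurrence of 'delimiter',
    // or -1 if the queue does not contain a complete message yet.
    public int FindDelimiter(byte[] delimiter)
    {
        if (delimiter.Length == 0)
            return -1;
        for (int start = 0; start + delimiter.Length <= data.Count; start++)
        {
            int matched = 0;
            while (matched < delimiter.Length && data[start + matched] == delimiter[matched])
                matched++;
            if (matched == delimiter.Length)
                return start;
        }
        return -1;
    }
}
```

For an HTTP request the delimiter is the byte sequence CR LF CR LF, that is, the empty line that ends the headers. Incoming blocks are appended until FindDelimiter() returns a non-negative offset.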

CircularStream class

This class represents a stream on top of a CircularBuffer. Reading is performed at the end of the buffer using the GetBytes() method, and writing is performed at the beginning of the buffer using the PutBytes() method.

This class adapts the CircularBuffer for the TextReader, TextWriter, BinaryReader, and BinaryWriter classes from the System.IO namespace that provide convenient facilities for formatted reading and writing of textual and binary data. However, use them with caution. For instance, TextWriter descendants buffer the output, and if you don't call the Flush() method on them, you will surely lose some outgoing data. And if you haven't received a complete message, a read operation is likely to block forever.
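The buffering caveat is easy to observe with BCL classes alone. The sketch below uses a MemoryStream in place of a CircularStream (an assumption made so the example is self-contained); WriterFlushDemo.Measure is a hypothetical helper name.

```csharp
using System.IO;
using System.Text;

public static class WriterFlushDemo
{
    // Writes 'text' through a StreamWriter and reports how many bytes
    // have reached the underlying stream before and after Flush().
    public static void Measure(string text, out long beforeFlush, out long afterFlush)
    {
        MemoryStream stream = new MemoryStream();
        StreamWriter writer = new StreamWriter(stream, Encoding.ASCII);

        writer.Write(text);
        beforeFlush = stream.Length;    // short text is still in the writer's buffer

        writer.Flush();
        afterFlush = stream.Length;     // now the bytes are in the stream
    }
}
```

With a short string such as a status line, beforeFlush is 0 and afterFlush equals the length of the text. Without the Flush() call those bytes would never be sent.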

Server class

The Server class represents, as you guessed, the server that listens to the specified port for incoming connections. It registers itself in the Multiplexer, and gets notified by the IMultiplexed.ReadyRead call when a connection is coming. The IMultiplexed.ErrorState call closes the listening socket. I wonder if IMultiplexed.ReadyWrite ever gets called for a Server object.

Connection class

The Connection class represents a single client connection. It maintains the state of the connection, performs request parsing, and forms outgoing responses. This class has only one public method, the constructor. The rest of the work is performed mostly in IMultiplexed.ReadyRead and IMultiplexed.ReadyWrite calls that are not exposed.

Options class

The Options singleton keeps all the settings of the server. Note that it compiles differently, depending on the presence of the DEBUG define. In a debug build, NHttpD listens by default on port 8001; in a release build, obviously, on port 80.

Globals class

This class represents a singleton with the server's global data. So far it contains a single property, VersionText, that is a textual representation of the application's name and its version in HTTP-compatible format. If this work is developed further, I plan to put here all the data shared between different Webs hosted by the server.

Entry class

This is the class that defines the application's entry point. Its static Main() method parses the arguments, adjusts the Options singleton's properties, creates a Server instance, and enters the main loop. Note that I set timeouts to 100000 microseconds (0.1 second). It means that NHttpD can serve 10 incoming connections per second. That's enough for testing purposes. If you would like to see the true power of NHttpD, set the timeout to 0 and it will serve an infinite number of connections. Maybe.

Building the sample

The NHttpD is a simple console application, no fancy windows and icons.

As the primary development tool, I use SharpDevelop. You can find its combine and project files in the archive accompanying the article.

I also have a build file for the NAnt build system; you'll find it in the archive. It has four main targets: Debug that generates the debug version of NHttpD, Release that builds the release version, Doc that generates the Help file using NDoc, and Clean that removes all target files.

For those who have neither #D nor NAnt, I have included the build.bat file that compiles the debug version of the server.

I have no Visual Studio, and cannot provide you with a solution file. I think it would be easy to create one, just don't forget to add the proper defines:

_DEBUG, DEBUG, and NET11 for debug build;
NDEBUG and NET11 for release build.

Note that in the code, you will find NET20 defines and conditional compilation. Take into account that I haven't tested the NHttpD under .NET 2.0 yet.

Improvements

The NHttpD server can surely be improved in many ways. Here's a short list of improvements that can be made fairly easily:

Keyboard polling so the administrator is able to terminate NHttpD gracefully;
Better logging with NLog or log4net;
Multiple Server objects using different ports and web roots; this surely requires moving some of the Options class functionality to the Server class and establishing a Client-Server relationship (in the current implementation, objects of both classes are decoupled);
Support for more HTTP methods; HEAD is the easiest to implement;
Better HTTP headers support;
Support for custom error descriptions, probably in the form of HTML templates;
Configuration file or some other kind of configuration store;
A control application, an MMC snap-in, or a built-in web site for the server administration;
In-memory cache for resources; probably use the System.Web.Caching.Cache, or simply use a Hashtable that maps resource identifiers to WeakReferences;
MIME types support and filtering.

However, the most important area is CGI and ASP.NET implementation. That could turn NHttpD into a real web server.

Lessons learnt

It seems to me that the objects that implement the IMultiplexed interface represent a limited form of continuations or coroutines.

There is an issue with sockets in .NET. They are not serializable, and are not derived from MarshalByRefObject, thus they are not able to cross the AppDomain boundaries. It may be connected with the fact that sockets are actually machine-wide entities. I'm not aware of any consequences of such an implementation.

Although neither C# nor .NET provides such flexible operations for byte array manipulation as the Python language does, we can achieve similar results using the System.IO namespace data formatting classes, with a MemoryStream or a CircularStream object as the base stream.
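As an illustration of that last point, here is a sketch that builds and parses a length-prefixed message with BinaryWriter and BinaryReader over a MemoryStream. The message layout (a 4-byte length followed by a UTF-8 payload) and the name MessageCodec are assumptions invented for this example; they are not a format used by NHttpD.

```csharp
using System.IO;
using System.Text;

public static class MessageCodec
{
    // Encodes 'text' as a 4-byte little-endian length followed by
    // the UTF-8 bytes of the text.
    public static byte[] Encode(string text)
    {
        byte[] payload = Encoding.UTF8.GetBytes(text);
        MemoryStream stream = new MemoryStream();
        BinaryWriter writer = new BinaryWriter(stream);
        writer.Write(payload.Length);   // Int32 overload, written little-endian
        writer.Write(payload);          // raw bytes, no extra prefix
        writer.Flush();
        return stream.ToArray();
    }

    // Reverses Encode(): reads the length, then that many payload bytes.
    public static string Decode(byte[] message)
    {
        BinaryReader reader = new BinaryReader(new MemoryStream(message));
        int length = reader.ReadInt32();
        byte[] payload = reader.ReadBytes(length);
        return Encoding.UTF8.GetString(payload);
    }
}
```

No manual index arithmetic and no BitConverter calls are needed: the reader and the writer keep track of the position in the stream.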

Questions left open

There are some questions for which I have no answers so far. The main one is: does multiplexing scale? I mean, would NHttpD or its improved version survive a heavy load? Is it worth using multiplexing to create business and industrial applications, or is it better to stick to asynchronous I/O as Microsoft recommends?

Conclusion

With the advent of Microsoft .NET, creation of multithreaded web servers became a trivial task. Now you know that creation of a single-threaded web server is trivial too. You can embed NHttpD into your application and make it manageable through a browser by anyone in the world because security features aren't implemented!

License

The code in the Pvax.Net namespace is covered by a BSD-like license. You may use it or base your work on it in your applications, either commercial or not, at your own risk without any limitation, but you must state the origins of the original code.

The rest of the application is considered public domain. You may do with it whatever you want at your own risk.

History

2005-Nov-30

Initial version posted.

2005-Dec-18

Keep-alive session expiration added; NHttpD now reports its version and the time on the server, which helps a browser with caching.