There are many articles on IOCP (Input/Output Completion Port), but they are not easy to understand: the technique itself has some arcane aspects, and the available documentation lacks sufficient explanation and code samples. So I decided to write a high-performance IOCP sample (OIOCPNet) and a document that explains how IOCP operates and the key issues around it.
I focused on:
- More than 65,000 concurrent connections (close to 65,535, the largest value of an unsigned short port number in IP version 4).
- The ability to transfer data larger than a thousand bytes over the network.
- Easy method for users of the
Key ideas to achieve the objectives
The first thing is IOCP. Why should we use IOCP? If we use the well-known select function (with FD_ZERO, ...), we cannot avoid looping to detect socket events, that is, to find out which sockets have received or sent data packets. And when we develop a game server or a chat server, a socket serves as the ID of a user action, so to find the user data on the server we use a search loop or a hash table keyed by the socket number. These loops seriously slow the server down once there are more than tens of thousands of users. With IOCP we do not need these loops, because IOCP detects socket events at the kernel level and provides a mechanism to associate a socket with a user data pointer directly (the completion key given when the socket is attached to the completion port). In short, with IOCP we can avoid the loops and reach the user data on the server side faster.
WSAAccept) we get a WSAENOBUFS (10055) error when the number of (almost) concurrent connections exceeds about 30,000 (the exact figure depends on system resources). The reason is that the system cannot prepare the resources for socket structures as fast as the connections arrive. So we should find a way to create socket resources before we use them, and AcceptEx is the answer. The main advantage of AcceptEx is just this: preparing sockets before use. The other features of AcceptEx are awkward and hard to understand. (See MSDN Library.)
Using static (pre-allocated) memory in server-side applications is natural and crucial. When we receive or send packets, we must use static memory rather than allocate on demand. In OIOCPNet, I use my own class (OPreAllocator) to get the pre-allocated memory area.
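To make the idea of pre-allocated memory concrete, here is a minimal sketch of a fixed-size block pool. It is not the actual OPreAllocator interface, which is not shown here: the class name BlockPool and its methods are hypothetical, and the sketch is single-threaded, so a real server would need a lock or a lock-free list around Acquire and Release.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool (NOT the real OPreAllocator API).
// All memory is reserved once in the constructor; Acquire/Release only
// move pointers on a free list and never allocate.
class BlockPool {
public:
    BlockPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(blockSize), storage_(blockSize * blockCount) {
        freeList_.reserve(blockCount);
        // Push in reverse so the first Acquire() returns the lowest address.
        for (std::size_t i = blockCount; i > 0; --i) {
            freeList_.push_back(storage_.data() + (i - 1) * blockSize_);
        }
    }

    // Returns a block, or nullptr when the pool is exhausted.
    unsigned char* Acquire() {
        if (freeList_.empty()) return nullptr;
        unsigned char* block = freeList_.back();
        freeList_.pop_back();
        return block;
    }

    // Gives a block obtained from Acquire() back to the pool.
    void Release(unsigned char* block) {
        assert(block != nullptr);
        freeList_.push_back(block);
    }

    std::size_t FreeCount() const { return freeList_.size(); }
    std::size_t BlockSize() const { return blockSize_; }

private:
    std::size_t blockSize_;
    std::vector<unsigned char> storage_;    // single allocation made up front
    std::vector<unsigned char*> freeList_;  // blocks currently available
};
```

A server built this way would size the pool at startup (for example, one read block per expected connection) and treat a nullptr from Acquire as "server full" instead of falling back to the heap.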
Sliced data chunk
Have you ever had to send a large data packet (more than a thousand bytes) with one function call (like send), only to find that the receiver did not get the packet you sent? If so, you may have run into the limits of network hardware (routers, hubs, and so on) and its buffers: the MTU (Maximum Transmission Unit). The smallest MTU to expect from network hardware is 576 bytes, so it is better to slice a large packet into smaller packets below that size. In OIOCPNet, I have defined the unit data block size as BUFFER_UNIT_SIZE (512 bytes). If you need a bigger one, you can change it.
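The slicing itself can be sketched in a few lines. This is only an illustration under assumptions: the Chunk struct and the SliceIntoUnits helper are hypothetical names, and only the 512-byte BUFFER_UNIT_SIZE comes from the text above; how OIOCPNet actually lays out its blocks is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t BUFFER_UNIT_SIZE = 512;  // unit block size named in the article

// Hypothetical view of one slice; it points into the caller's buffer
// and does not own or copy the bytes.
struct Chunk {
    const unsigned char* data;
    std::size_t size;
};

// Hypothetical helper: splits [data, data + size) into consecutive chunks
// of at most `unit` bytes. Only the last chunk may be shorter.
std::vector<Chunk> SliceIntoUnits(const unsigned char* data, std::size_t size,
                                  std::size_t unit = BUFFER_UNIT_SIZE) {
    assert(unit > 0);
    std::vector<Chunk> chunks;
    chunks.reserve((size + unit - 1) / unit);
    for (std::size_t offset = 0; offset < size; offset += unit) {
        const std::size_t remaining = size - offset;
        const std::size_t n = remaining < unit ? remaining : unit;
        chunks.push_back(Chunk{data + offset, n});
    }
    return chunks;
}
```

With the default unit, a 1,300-byte buffer yields three chunks of 512, 512, and 276 bytes, each of which could then be handed to a separate send operation.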
Don't spawn many threads
If your server logic performs some kind of IO, it may be better to spawn several threads, because threading pays off only when there are IO operations to wait on. But don't forget: the more threads, the more CPU effort goes into thread scheduling. If more than 10,000 threads are running, the operating system and its processes cannot keep running normally, because the CPU spends all its capacity on deciding which thread runs next (scheduling, or context switching). For reference, OIOCPNet uses two threads per CPU (a value found by experiment) and doesn't spawn any more.
OIOCPNet - the Key
OIOCPNet is the class that applies the above ideas. Its operation steps are the following:
- OIOCPNet prepares its resources, such as the pre-allocated memory area, the completion port, other handles, and so on.
- OIOCPNet makes a listening socket.
- OIOCPNet pre-generates sockets (65,000, but I defined it as 30,000 in IOCPNet.h for operating systems other than Windows 2003; change MAX_ACCEPTABLE_SOCKET_NUM depending on your needs) and its own buffered sockets, and then puts them into acceptable mode by using
- When a user tries to connect to the server, OIOCPNet accepts it.
- When a socket reads data packets, OIOCPNet puts them into its pre-allocated reading slots and then posts an event for the server logic to consume.
- When the server logic writes data packets, OIOCPNet puts them into its pre-allocated writing blocks and then calls PostQueuedCompletionStatus so that a worker thread sends the data packets.
- When a user closes the connection, OIOCPNet closes the socket but does not release the memory of the buffered socket; it just re-assigns it.
The following picture shows the entire mechanism of OIOCPNet. It is very simple:
Key points when writing the code
PostQueuedCompletionStatus lack the parameter to present the result of the IO operation. Besides the default parameters of
OIOCPNet needs more parameters to classify the type of IO operation and to carry a little additional information. So I used the LPOVERLAPPED parameter of PostQueuedCompletionStatus as my custom parameter, much like the thread parameter (LPVOID lpParameter, the fourth parameter) of
OVERLAPPEDExt is an extended version of the OVERLAPPED structure that carries more information. See the definition code below:
Life time of a variable used by an asynchronous function
WSARecv operate in an asynchronous way, so take care of the lifetime of the variables passed to asynchronous functions.
pTempWriteData = (OTemporaryWriteData *)
ResSend = WSASend(pTempWriteData->Socket,
In the above code snippet, pTempWriteData is allocated for being used by
WSASend returns immediately, but pTempWriteData must stay alive until the actual send operation of WSASend at the OS level is over. When the send operation completes, release pTempWriteData like this:
if (0 != pOVL)
if ((IO_TYPE_WRITE_LAST ==
|| IO_TYPE_WRITE ==
if (0 != ((OVERLAPPEDExt *)pOVL)->pTempWriteData)
The uniqueness of socket
A SOCKET number is unique while the socket is open. But the OS assigns socket numbers arbitrarily, so the number of the most recently closed socket can be re-assigned to the very next connection. So this could happen:
- A socket is assigned the socket number 3947 (as an example) for a new connection.
- The server logic reads data packets using the socket.
- The socket is suddenly closed because the user disconnects, while the server logic does not know about it.
- A different socket is assigned the same socket number 3947 (the resurrection of that socket number).
- The server logic writes data packets to the socket and meets no problem in doing so. But as a result the data packets might be sent to a different user.
To prevent this troublesome situation, OIOCPNet manages its own socket number SocketUnique, a member of
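One way to picture such a server-managed identifier is a per-slot counter value that changes every time a slot is reused. The sketch below is an assumption-laden illustration, not the way OIOCPNet generates SocketUnique (that code is not shown here); the ConnectionTable class and its methods are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical table of connection slots. Each slot stores the unique id of
// the connection that currently occupies it; 0 means "closed".
class ConnectionTable {
public:
    explicit ConnectionTable(std::size_t slots) : unique_(slots, 0), next_(1) {}

    // Called when a slot is (re)used for a new connection.
    // Returns an id that differs from the ids handed out before it
    // (until the 32-bit counter wraps around).
    std::uint32_t Open(std::size_t slot) {
        assert(slot < unique_.size());
        unique_[slot] = next_++;
        if (next_ == 0) next_ = 1;  // skip 0, which is reserved for "closed"
        return unique_[slot];
    }

    // Called when the connection in this slot is closed.
    void Close(std::size_t slot) {
        assert(slot < unique_.size());
        unique_[slot] = 0;
    }

    // True only if `id` still names the connection living in `slot`.
    bool IsCurrent(std::size_t slot, std::uint32_t id) const {
        return id != 0 && slot < unique_.size() && unique_[slot] == id;
    }

private:
    std::vector<std::uint32_t> unique_;
    std::uint32_t next_;
};
```

Server logic that remembers the id it received with a read event can check IsCurrent before writing; if the slot was closed and reused in the meantime, the check fails and the stale write is dropped instead of reaching a different user.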
How to use OIOCPNet
Using OIOCPNet is simple. See the following code snippet:
int _tmain(int argc, _TCHAR* argv[])
pIOCPNet = new OIOCPNet(&EL);
hThread = CreateThread(0, 0, LogicThread,
pIOCPNet, 0, 0);
InterlockedExchange((long *)&g_dRunning, 0);
DWORD WINAPI LogicThread(void *pParam)
while (1 == InterlockedExchange((long *)&g_dRunning,
iRes = pIOCPNet->GetSocketEventData(WAIT_TIMEOUT_TEST,
&EventType, &SocketUnique, &pReadData,
&ReadSize, &pBuffSock, &pSlot, &pCustData);
else if (RET_SOCKET_CLOSED == iRes)
MainLogic(pIOCPNet, SocketUnique, pBuffSock,
void MainLogic(OIOCPNet *pIOCPNet, DWORD SocketUnique,
OBufferedSocket *pBuffSock, BYTE *pReadData, DWORD ReadSize)
pReadData, ReadSize); }
We can set the IP address and port number with Start, which prepares the necessary resources. In the logic thread we can get data packets with GetSocketEventData and send data packets with WriteData. After using the data, release pSlot, which holds the pointer (pReadData) to the data packet, with ReleaseSocketEvent. Finally, when the main logic ends, call Stop so that OIOCPNet releases its resources. That's all.
Take care of read and write at client side
OIOCPNet slices a large data packet into smaller packets and adds 4 bytes of packet length information to the original data packet. The slicing and reassembly are handled inside OIOCPNet, so we need not care about them on the server. On the client side, however, you should use TCPRead (see TCPFunc.h, TCPFunc.cpp in the NetTestClient project) to communicate with OIOCPNet when you write the client application that connects to the server.
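To show why the client needs a matching read routine, here is a generic sketch of length-prefixed framing. It does not reproduce the OIOCPNet wire format or the TCPFunc code: the text only states that 4 bytes of length information are added, so the byte order (little-endian here) and the choice that the length counts only the payload are assumptions, and EncodeFrame and FrameReader are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// ASSUMPTION: 4-byte little-endian length that counts payload bytes only.
std::vector<unsigned char> EncodeFrame(const std::string& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    std::vector<unsigned char> out;
    out.reserve(4 + payload.size());
    for (int i = 0; i < 4; ++i) {
        out.push_back(static_cast<unsigned char>((n >> (8 * i)) & 0xFF));
    }
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Collects bytes as they arrive (a TCP read may return any part of a frame)
// and hands out payloads only once they are complete.
class FrameReader {
public:
    void Feed(const unsigned char* data, std::size_t size) {
        buffer_.insert(buffer_.end(), data, data + size);
    }

    // Extracts the next complete payload; returns false if more bytes are needed.
    bool Next(std::string& payload) {
        if (buffer_.size() < 4) return false;
        std::uint32_t n = 0;
        for (int i = 0; i < 4; ++i) {
            n |= static_cast<std::uint32_t>(buffer_[i]) << (8 * i);
        }
        if (buffer_.size() - 4 < n) return false;
        const std::ptrdiff_t end = static_cast<std::ptrdiff_t>(4) + n;
        payload.assign(buffer_.begin() + 4, buffer_.begin() + end);
        buffer_.erase(buffer_.begin(), buffer_.begin() + end);
        return true;
    }

private:
    std::vector<unsigned char> buffer_;
};
```

The point of the reader is that one recv call on the client may deliver half a frame or several frames at once, so the client has to buffer until the announced length has arrived.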
OIOCPNet in .NET 1.1 environment (also VC++ 6.0, after commenting out #include "stdafx.h"). I ran the server (IOCPNetTest) on Windows 2003 Enterprise Edition and the test clients (NetTestClient) on several machines. The specification and performance results:
- Test Server - OS: Windows 2003 Enterprise Edition
- Test Server - CPU: Intel 2.8GHz (x 2)
- Test Server - RAM: 2GB
- Test Client: Windows XP (3~5 machines used, changing thread number)
- Result: about 15% ~ 20% CPU Usage (when established TCP connection number is 65,000)
When a client can't open more than 5,000 (~ 2,000) connections to the server, check the registry. The steps are:
- Run regedit
- Open 'HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters'
- Add 'MaxUserPort' as DWORD value and set the value (maximum value is 65534 in decimal number).
If you need to increase the thread number of your test client to more than 2,0xx, reduce the thread stack size of the client application using the linker option '/STACK:BYTE' or the stack size parameter (dwStackSize) of CreateThread. Before you run the test server and test client, set TEST_SERVER_IP to the IP address of your server. To see the connection number, use Performance Monitor or 'netstat -s' in a command prompt.
- August, 2005
- IOCPNet first version.
- Fixed a bug during the ending process.
- Added a new demo and src, using Windows thread pool. (Because there've been some requests for the sample uses