IOCPNet - Ultimate IOCP
By sleepyrea (new)
Easy to use, high performance, large data transfer by using IO Completion Port.
VC6, VC7, VC7.1, Win2K, WinXP, Win2003, MFC, VS.NET2003, Dev
There are many articles on IOCP (Input/Output Completion Port), but they are not easy to understand: the technique itself has some arcane parts, and no standard document explains it with enough detail or sample code. So I decided to write a high-performance IOCP sample (OIOCPNet) and a document that covers how IOCP operates and the key issues around it.
I focused on:
OIOCPNet class.
The first thing is IOCP. Why should we use it? With the well-known select function (and FD_SET, FD_ZERO, ...), we have to loop over the socket sets to detect socket events, that is, to find which sockets have received data or are ready to send. In a game server or a chat server, the socket also serves as the ID of the user, so to find the user's data on the server we run a search loop or a hash table lookup keyed by the socket number. These loops slow the server down badly once the number of users reaches tens of thousands. With IOCP we do not need them: IOCP detects socket events at the kernel level, and it lets us associate a socket with a user data pointer (the completion key) when the socket is bound to the completion port. In short, IOCP lets us avoid the loops and reach the user data on the server side faster.
With accept (or WSAAccept), we get the WSAENOBUFS (10055) error when the number of (almost) concurrent connections exceeds about 30,000 (the exact figure depends on system resources). The reason is that the system cannot prepare the resources for socket structures as fast as the connections arrive. So we need a way to create socket resources before we use them, and AcceptEx is the answer. The main advantage of AcceptEx is just this: preparing sockets before use. Its other features are fiddly and hard to understand. (See the MSDN Library.)
Using static (pre-allocated) memory in server-side applications is natural and crucial. When we receive or send packets, the buffers must come from static memory instead of being allocated for each operation. In OIOCPNet, I use my own class (OPreAllocator) to get the pre-allocated memory area.
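The idea of pre-allocation can be illustrated with a small fixed-size block pool. This is only a sketch under stated assumptions: FixedBlockPool and its methods are hypothetical names, it is not the OPreAllocator class from the article, and it is not thread-safe (a real server would need a lock or a lock-free free list around Allocate and Free).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical fixed-size block pool (NOT the article's OPreAllocator).
// All memory is obtained once, at construction; Allocate/Free only move
// pointers on a free list, so no heap call happens per packet.
class FixedBlockPool {
public:
    FixedBlockPool(std::size_t blockSize, std::size_t blockCount)
        : blockSize_(blockSize), storage_(blockSize * blockCount) {
        assert(blockSize > 0 && blockCount > 0);
        freeList_.reserve(blockCount);
        for (std::size_t i = 0; i < blockCount; ++i)
            freeList_.push_back(storage_.data() + i * blockSize);
    }

    // Returns a block, or nullptr when the pool is exhausted (no heap fallback).
    void* Allocate() {
        if (freeList_.empty()) return nullptr;
        void* p = freeList_.back();
        freeList_.pop_back();
        return p;
    }

    // Returns a block to the pool; p must have come from this pool.
    void Free(void* p) {
        assert(Owns(p));
        freeList_.push_back(static_cast<unsigned char*>(p));
    }

    std::size_t FreeCount() const { return freeList_.size(); }

private:
    // True if p points at the start of one of this pool's blocks.
    bool Owns(const void* p) const {
        const unsigned char* b = static_cast<const unsigned char*>(p);
        return b >= storage_.data() && b < storage_.data() + storage_.size()
            && static_cast<std::size_t>(b - storage_.data()) % blockSize_ == 0;
    }

    std::size_t blockSize_;
    std::vector<unsigned char> storage_;   // the single up-front allocation
    std::vector<unsigned char*> freeList_; // blocks currently available
};
```

When the pool runs dry, Allocate returns nullptr instead of growing, which keeps memory use bounded; whether a real server should refuse the operation or wait at that point is a design decision this sketch leaves open.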
Have you ever had to send a large data packet (more than a thousand bytes) with one function call (like WriteFile, WSASend or send), and then found that the receiver didn't get the data packet you had sent? If so, you may have run into a problem with the network hardware (routers, hubs, and so on) and its buffers: the MTU (Maximum Transmission Unit). The smallest datagram size that every IPv4 host is required to handle is 576 bytes, so it is better to slice a large packet into several smaller packets below that size. In OIOCPNet, I have defined the unit data block size as BUFFER_UNIT_SIZE (512 bytes). If you need a bigger one, you can change it.
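As a rough illustration of the slicing arithmetic, the sketch below computes the offset and length of each unit for a payload of a given size. Slice and SliceIntoUnits are hypothetical names introduced here; only the constant BUFFER_UNIT_SIZE and its value of 512 come from the article.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t BUFFER_UNIT_SIZE = 512; // value taken from the article

// Hypothetical description of one slice of a larger payload.
struct Slice {
    std::size_t offset; // where the slice starts in the original payload
    std::size_t length; // at most `unit` bytes
};

// Hypothetical helper: splits totalSize bytes into consecutive slices of
// at most `unit` bytes each. Only the last slice may be shorter.
std::vector<Slice> SliceIntoUnits(std::size_t totalSize,
                                  std::size_t unit = BUFFER_UNIT_SIZE) {
    assert(unit > 0);
    std::vector<Slice> slices;
    for (std::size_t off = 0; off < totalSize; off += unit)
        slices.push_back({off, std::min(unit, totalSize - off)});
    return slices;
}
```

For example, a 1,300-byte payload yields three slices of 512, 512 and 276 bytes.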
If your server logic performs IO operations, it may be better to spawn several threads, because threading pays off only when threads spend time waiting on IO. But don't forget: the more threads, the more effort the CPU spends on thread scheduling. If more than 10,000 threads are running, the operating system and its processes can't keep running normally, because the CPU spends most of its capacity on deciding which thread runs next, that is, on scheduling and context switching. For reference, OIOCPNet creates two threads per CPU (an experimentally chosen value) and doesn't spawn any more.
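The "two threads per CPU" rule can be written down as a small helper. This is a sketch, not OIOCPNet's code: WorkerThreadCount and DefaultWorkerThreadCount are hypothetical names, and std::thread::hardware_concurrency requires C++11, which is newer than the compilers the article targets (on Win32, GetSystemInfo would give the processor count instead).

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper: a fixed number of worker threads per CPU.
// The default of 2 per CPU is the experimental value quoted in the article.
unsigned WorkerThreadCount(unsigned cpuCount, unsigned threadsPerCpu = 2) {
    assert(threadsPerCpu > 0);
    if (cpuCount == 0) cpuCount = 1; // the CPU count may be reported as unknown (0)
    return cpuCount * threadsPerCpu;
}

// Uses the standard library's hint for the number of hardware threads.
unsigned DefaultWorkerThreadCount() {
    return WorkerThreadCount(std::thread::hardware_concurrency());
}
```

On a 4-CPU machine WorkerThreadCount(4) gives 8 workers, and the pool stays at that size no matter how many clients connect.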
OIOCPNet is the class that applies the above ideas. Its operation steps are the following:
OIOCPNet prepares its resources: the pre-allocated memory area, the completion port, other handles and so on.
OIOCPNet makes a listening socket.
OIOCPNet pre-generates sockets (up to 65,000, but I set it to 30,000 in IOCPNet.h for operating systems other than Windows 2003; change MAX_ACCEPTABLE_SOCKET_NUM depending on your needs) together with its own buffered sockets, and then puts them into an acceptable state by using AcceptEx.
When a client connects, OIOCPNet accepts the connection.
When data packets arrive, OIOCPNet puts them into its pre-allocated reading slots and then signals an event for the server logic.
When the server logic sends data packets, OIOCPNet puts them into its pre-allocated writing blocks and then calls PostQueuedCompletionStatus so that a worker thread sends them.
When a connection ends, OIOCPNet closes the socket but doesn't release the memory of the buffered socket; it just re-assigns it. The following picture shows the entire mechanism of OIOCPNet. It is very simple:

GetQueuedCompletionStatus and PostQueuedCompletionStatus have no parameter that describes the IO operation in enough detail. Besides their default parameters, OIOCPNet needs more parameters to classify the type of IO operation and to carry a little additional information. So I used the LPOVERLAPPED parameter of GetQueuedCompletionStatus and PostQueuedCompletionStatus as a custom parameter, much like the thread parameter (LPVOID lpParameter, the fourth parameter) of CreateThread. OVERLAPPEDExt extends the OVERLAPPED structure with that extra information. See the definition below:
struct OVERLAPPEDExt
{
    OVERLAPPED OL;
    int IOType;
    OBufferedSocket *pBuffSock;
    OTemporaryWriteData *pTempWriteData;
}; // OVERLAPPEDExt
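The cast from LPOVERLAPPED back to the extended structure is valid because the OVERLAPPED member comes first, so both pointers refer to the same address. The sketch below shows the pattern with stand-in types so that it compiles on any platform: FakeOverlapped, OverlappedExtSketch, FromOverlapped and the IO_TYPE_*_SKETCH values are all hypothetical, and a real program would use the Win32 OVERLAPPED type instead.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for the Win32 OVERLAPPED structure (hypothetical, portable).
struct FakeOverlapped {
    unsigned long Internal;
    unsigned long InternalHigh;
    void* hEvent;
};

// Hypothetical operation tags; the article's actual values are not shown.
enum IoTypeSketch { IO_TYPE_READ_SKETCH = 1, IO_TYPE_WRITE_SKETCH = 2 };

// Same layout idea as OVERLAPPEDExt: the base structure is the FIRST member.
struct OverlappedExtSketch {
    FakeOverlapped OL;  // must stay first, or the cast below breaks
    int IOType;         // which kind of IO this completion belongs to
    void* pContext;     // extra per-operation data
};

static_assert(offsetof(OverlappedExtSketch, OL) == 0,
              "OL must be the first member");

// What a worker thread does after dequeuing a completion: the API hands
// back only a pointer to the base structure, and the cast recovers the
// extended one.
inline OverlappedExtSketch* FromOverlapped(FakeOverlapped* pOVL) {
    assert(pOVL != nullptr);
    return reinterpret_cast<OverlappedExtSketch*>(pOVL);
}
```

The static_assert documents the one rule the pattern depends on: if another member were placed before OL, the cast would silently point at the wrong address.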
In OIOCPNet, WSASend and WSARecv operate asynchronously, so take care of the lifetime of the variables passed to them.
// pTempWriteData will be freed when the send IO ends.
pTempWriteData = (OTemporaryWriteData *)
    m_SMMTempWriteData.Allocate(sizeof (OTemporaryWriteData));
...
// The size of pData
// (the second parameter of GetBlockNeedsExternalLock)
// does not exceed BUFFER_UNIT_SIZE.
m_pWriteBlock->GetBlockNeedsExternalLock(&pBuffSockToWrite,
    pTempWriteData->Data, &ReadSizeToWrite, &DoesItHaveMoreSequence);
...
try
{
    ResSend = WSASend(pTempWriteData->Socket,
        &pTempWriteData->DataBuf, 1, &WrittenSizeUseless, Flag,
        (LPOVERLAPPED)&pTempWriteData->OLExt, 0);
}
In the above snippet, pTempWriteData is allocated for use by WSASend. WSASend returns immediately, but pTempWriteData must stay alive until the actual send operation at the OS level is over. When it is over, release pTempWriteData like this:
if (0 != pOVL)
{
    if ((IO_TYPE_WRITE_LAST == ((OVERLAPPEDExt *)pOVL)->IOType
        || IO_TYPE_WRITE == ((OVERLAPPEDExt *)pOVL)->IOType))
    {
        if (0 != ((OVERLAPPEDExt *)pOVL)->pTempWriteData)
        {
            m_SMMTempWriteData.Free(
                ((OVERLAPPEDExt *)pOVL)->pTempWriteData);
        }
        continue;
    }
}
A normal SOCKET number is unique by itself. But the OS assigns socket numbers arbitrarily, so the number of the most recently closed socket can be re-assigned to a new socket that connects right after it. So it could be that:
To prevent this troublesome situation, OIOCPNet manages its own socket number SocketUnique, a member of OBufferedSocket.
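One way to picture the idea is a table that pairs each OS socket handle with an identifier that is never reused, so that a message addressed to an old connection cannot reach a new client that happens to receive the same handle. The sketch below is hypothetical throughout: ConnectionTable, SocketHandle and the 64-bit counter are assumptions made for illustration, and the article's own SocketUnique member may be generated and stored differently.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

typedef std::uint32_t SocketHandle; // stand-in for the OS SOCKET value

// Hypothetical table mapping live socket handles to never-reused IDs.
class ConnectionTable {
public:
    // Registers a newly accepted handle and returns its unique ID.
    std::uint64_t OnAccepted(SocketHandle s) {
        // A handle has to be closed before the OS can hand it out again.
        assert(live_.find(s) == live_.end());
        std::uint64_t id = ++lastUnique_;
        live_[s] = id;
        return id;
    }

    // Forgets the handle; its unique ID is never issued again.
    void OnClosed(SocketHandle s) { live_.erase(s); }

    // True only if handle s currently belongs to the connection that was
    // given `unique`. A stale ID fails even when the handle value matches.
    bool IsCurrent(SocketHandle s, std::uint64_t unique) const {
        std::unordered_map<SocketHandle, std::uint64_t>::const_iterator it =
            live_.find(s);
        return it != live_.end() && it->second == unique;
    }

private:
    std::uint64_t lastUnique_ = 0;
    std::unordered_map<SocketHandle, std::uint64_t> live_;
};
```

If handle 1234 is accepted, closed, and accepted again, the two connections get different IDs, and a write that still carries the first ID is rejected by IsCurrent.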
The usage of OIOCPNet is simple. See the following code snippet:
int _tmain(int argc, _TCHAR* argv[])
{
    ...
    WSAStartup(MAKEWORD(2,2), &WSAData);

    pIOCPNet = new OIOCPNet(&EL);
    pIOCPNet->Start(TEST_IP, TEST_PORT);

    hThread = CreateThread(0, 0, LogicThread, pIOCPNet, 0, 0);
    ...
    InterlockedExchange((long *)&g_dRunning, 0);
    WaitForSingleObject(hThread, INFINITE);
    ...
    pIOCPNet->Stop();
    delete pIOCPNet;
    WSACleanup();
    return 0;
} // _tmain()

DWORD WINAPI LogicThread(void *pParam)
{
    ...
    while (1 == InterlockedExchange((long *)&g_dRunning, g_dRunning))
    {
        iRes = pIOCPNet->GetSocketEventData(WAIT_TIMEOUT_TEST,
            &EventType, &SocketUnique, &pReadData, &ReadSize,
            &pBuffSock, &pSlot, &pCustData);
        if ...
        else if (RET_SOCKET_CLOSED == iRes)
        {
            // release pCustData.
            continue;
        }

        // Process main logic.
        MainLogic(pIOCPNet, SocketUnique, pBuffSock, pReadData, ReadSize);
        pIOCPNet->ReleaseSocketEvent(pSlot);
    }
    return 0;
} // LogicThread()

void MainLogic(OIOCPNet *pIOCPNet, DWORD SocketUnique,
    OBufferedSocket *pBuffSock, BYTE *pReadData, DWORD ReadSize)
{
    pIOCPNet->WriteData(SocketUnique, pBuffSock, pReadData, ReadSize); // echo.
} // MainLogic()
We set the IP address and port number with Start, which prepares the necessary resources. In the logic thread we get data packets with GetSocketEventData and send data packets with WriteData. After using the data, release pSlot, which holds the pointer (pReadData) to the data packet, with ReleaseSocketEvent. Finally, when the main logic ends, call Stop so that OIOCPNet releases its resources. That's all.
OIOCPNet slices a large data packet into smaller packets and adds 4 bytes of packet length information to the original data packet. The slicing and assembling are hidden behind GetSocketEventData and WriteData, so we need not care about them on the server side. But when you write a client application that connects to the server, you should use TCPWrite and TCPRead (see TCPFunc.h and TCPFunc.cpp in the NetTestClient project) to communicate with OIOCPNet.
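Length-prefixed framing in general can be sketched as follows. This is not the wire format of OIOCPNet, TCPWrite or TCPRead, which the article does not spell out; the sketch assumes a 4-byte little-endian header that counts only the payload bytes, and FrameMessage and FrameAssembler are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

typedef std::vector<std::uint8_t> Bytes;

// Prepends a 4-byte little-endian payload length (assumed layout).
Bytes FrameMessage(const Bytes& payload) {
    assert(payload.size() <= 0xFFFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(payload.size());
    Bytes out;
    out.reserve(4 + payload.size());
    for (int i = 0; i < 4; ++i)
        out.push_back(static_cast<std::uint8_t>(n >> (8 * i)));
    out.insert(out.end(), payload.begin(), payload.end());
    return out;
}

// Rebuilds whole messages from a byte stream that arrives in pieces.
class FrameAssembler {
public:
    // Appends bytes exactly as they were received, in any chunk size.
    void Feed(const std::uint8_t* data, std::size_t size) {
        buf_.insert(buf_.end(), data, data + size);
    }

    // Moves the next complete message into `out` and returns true, or
    // returns false when more bytes are still needed.
    bool Next(Bytes& out) {
        if (buf_.size() < 4) return false;            // header incomplete
        std::uint32_t n = 0;
        for (int i = 0; i < 4; ++i)
            n |= static_cast<std::uint32_t>(buf_[i]) << (8 * i);
        if (buf_.size() - 4 < n) return false;        // payload incomplete
        out.assign(buf_.begin() + 4, buf_.begin() + 4 + n);
        buf_.erase(buf_.begin(), buf_.begin() + 4 + n);
        return true;
    }

private:
    Bytes buf_; // bytes received but not yet returned as a message
};
```

A 1,300-byte message becomes 1,304 bytes on the wire; fed to the assembler in 512-byte pieces, Next returns false after the first two pieces and true after the third.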
I compiled OIOCPNet in the .NET 1.1 environment (and also in VC++ 6.0, with #include "stdafx.h" disabled). I ran the server (IOCPNetTest) on Windows 2003 Enterprise Edition and the test clients (NetTestClient) on several machines. The specification and performance result:
When a client can't generate more than 5,000 (~ 2,000) connections to the server, check the registry. The checking steps include:
If you need to increase the thread number of your test client to more than 2,0xx, adjust the thread stack size of the client application using the linker option '/STACK:BYTE' or the stack size parameter of CreateThread. Before you run the test server and test client, set TEST_IP and TEST_SERVER_IP to the IP address of your server. To see the connection number, use the performance monitor or 'netstat -s' in a command prompt.
BindIoCompletionCallback.)
Last Updated: 17 Jan 2006. Editor: Smitha Vijayan
Copyright 2005 by sleepyrea (new). Everything else Copyright © CodeProject, 1999-2009.