Simple lightweight SSL IOCP Sockets

17 Sep 2013
Easy to use (and reuse!), lightweight SSL IOCP Sockets implementation

 Introduction 

There are several implementations of IOCP sockets out there, and several implementations of SSL sockets as well. But I needed both: IOCP with SSL. I also wanted it to be as simple as possible, without pulling in huge libraries like OpenSSL. Fortunately, Windows provides all the mechanisms needed, so this combination can be implemented fairly easily. (Note: this article assumes that the reader has at least some knowledge of sockets and IOCP.)

Background

The presented socket classes are lockless. This is achieved by following two conventions:

1. One IOCP is serviced by one and only one IOCP thread.

2. One socket is serviced by one and only one working thread (if any).

Why are these conventions needed, and what do they mean in practice? IOCP reads and writes socket data sequentially; in other words, the order of packets sent and received is preserved. But once a packet leaves the IOCP (i.e. we start processing it), it is in our hands. For any meaningful processing we need to preserve that ordering. If we read from or write to an IO completion port using more than one thread, we need some synchronization mechanism to keep reads and writes in the correct order. This introduces unnecessary complexity without any gain in performance: tests have shown that one reading/writing thread per IOCP is more than enough (at least on the Windows platform). To be specific, according to Microsoft, even on a local TCP loopback you can get at most about 37,000 round trips per second (6-core AMD, 3.2 GHz), which is easily handled by a single thread, provided that all it does is send and receive data. This is where the first convention comes from. The second convention has the same rationale: if data from one socket is processed by only one working thread, we don't need any synchronization mechanism.

In practice this means that we create one IOCP for several sockets, and then a subset of these sockets (or even all of them) is processed by one working thread. Of course, there can be many IOCPs as well as many working threads; all that is needed is to follow the above conventions.

Now, how does this look in code?
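
To illustrate the second convention, here is a minimal sketch of one way to pin every socket to a single working thread. The helper workerIndexFor and its modulo scheme are assumptions made for illustration; they are not part of SimpleNet.h, and any stable socket-to-thread mapping would serve the same purpose.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of SimpleNet.h): maps a socket to exactly one
// worker, so that all data of that socket is processed by the same thread.
std::size_t workerIndexFor(std::uintptr_t socketId, std::size_t workerCount)
{
    assert(workerCount > 0);
    // The same socketId always yields the same index, so no two workers
    // ever touch the same socket and no per-socket lock is needed.
    return static_cast<std::size_t>(socketId % workerCount);
}
```

With four workers, for instance, every packet of socket 42 would be handed to worker workerIndexFor(42, 4).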

There are 4 main classes:

1. CSimpleIOCP. This is the main class; it implements an IO completion port and one IOCP thread.

2. CSimpleSocket. This is the actual socket implementation. It can work as a standalone (i.e. blocking) socket, or it can take an existing CSimpleIOCP as a parameter; in the latter case it becomes a non-blocking socket and should be used only via the CSimpleIOCP callback functions (more on that later).

3. CSimpleIOCPPool. This is just a collection (pool) of CSimpleIOCP instances.

4. CSimpleServerSocket. This is an extension of the socket that acts as a server. All it does is listen on a specified address/port. Once it gets a connection, it calls a callback function, where you initialize a CSimpleSocket with the given SOCKET and then continue with your program logic.

There are also a few helper classes:

5. CSimpleNetManager. This singleton class is used internally by the previous 4 classes for network initialization, shutdown, adapter enumeration etc.

6. CSimpleSSL. This class is used internally by CSimpleSocket for SSL processing (if any).

This implementation uses fast delegates (http://www.codeproject.com/Articles/7150/Member-Function-Pointers-and-the-Fastest-Possible) for callback functionality. 

Using the code 

Let's start putting the pieces together.

Implementing the server (you can skip this part if you need only a client).

The following code is a bare-bones implementation of an application that runs a set of simple echo servers:

class CSimpleServer: public CSimpleServerSocket
{
   public:
      CSimpleServer() : CSimpleServerSocket(), m_useSSL(false) {};
      void setUseSSL(bool useSSL) {m_useSSL = useSSL;};
      bool isUseSSL() {return m_useSSL;}
   private:
      bool  m_useSSL;
};
class CSimpleClientContext
{
public:
   CSimpleSocket& getSocket() {return m_socket;}
private:
   CSimpleSocket  m_socket;
};
// Implementing simple echo server - send back everything it receives
class CEchoServer
{
public:
   CEchoServer()        
   { 
      m_iocpPool += DELEGATE(onIOCPEvent); 
      m_iocpPool += DELEGATE(onSocketEvent);
   }
   ~CEchoServer()                                                           
   { 
      stop(); 
   }
   bool startServer(const char *pAddr, short uPort, bool useSSL = false)
   {
      CSimpleServer* server = new CSimpleServer();
      if (!server)
         return false;
      server->setUseSSL(useSSL);
      *server += DELEGATE(onServerAccept); 
      *server += DELEGATE(onConnectionVerify); 
      if (server->startServerOn(pAddr, uPort))
      {
         m_servers.push_back(server);
         return true;
      }
      delete server;
      return false;
   }
   bool onServerAccept(SOCKET acceptedSocket, CSimpleServerSocket& server)    
   { 
      return m_iocpPool.addSocket(acceptedSocket, &server); 
   }
   bool onConnectionVerify(CSimpleServerSocket& server, SocketAcceptVerifyStruct& verifyData)
   {
      // see docs for parameters at http://msdn.microsoft.com/en-us/library/windows/desktop/ms741513(v=vs.85).aspx
      // for now, as an example, let's just check on IP address of the connecting party, and refuse connection from "127.0.0.15":
      std::string remoteIP = inet_ntoa(((struct sockaddr_in *)verifyData.lpCallerId->buf)->sin_addr);
      return remoteIP.compare("127.0.0.15") != 0;
   }
   void stop()
   {
      for (auto it = m_servers.begin(); it != m_servers.end(); ++it)
         delete (*it); // this will also close the server
      m_servers.clear();
      m_iocpPool.close();
      for (auto it = m_clients.begin(); it != m_clients.end(); ++it)
         delete (*it);
      m_clients.clear();
   }
   bool setIOCPInitialCount(DWORD iocpInitialCount)
   {
      return m_iocpPool.setIOCPCount(iocpInitialCount);
   }
   void  onIOCPEvent(IOCPEVENT* eventData, CSimpleIOCP* iocp, DWORD eventID)
   {
      switch (eventID)
      {
      case SCD_SOCKET_ADD:
         {
            CSimpleClientContext* context = new CSimpleClientContext();
            CSimpleSocket& sock = context->getSocket();
            sock.setUserData(context);
            m_clients.push_back(context);
            if (sock.initSocket(iocp, eventData->data.sockData.sock))
            {
               sock.setNoDelayOption(true);
               if (((CSimpleServer*)eventData->data.sockData.requestor)->isUseSSL() ? sock.initSSL(false) : sock.receive(sock.getDefaultBufferSize()))               
                  return; // success
               else
                  sock.close(); // failed to init SSL or failed to receive data - in this app we don't care; just close connection.
            }
            else
               ::closesocket(eventData->data.sockData.sock); // initSocket failed, so the socket object never took ownership; we don't care what exactly happened in this app, so we just close the raw SOCKET
         }
         break;
      case SCD_SOCKET_IOCP_STOP_REQUESTED:
         for (auto it = m_clients.begin(); it != m_clients.end(); ++it)
            (*it)->getSocket().close();
         break;
      }
   }
   void  onSocketEvent(CSimpleSocket* sock, SIMPLEWSAOVERLAPPED* overlapped, DWORD socketEvent)
   {
      switch (socketEvent)
      {
      case SCD_SOCKET_READ:
         {
            if (doEcho(sock, overlapped))
               return;
            std::cout << "failure in server SCD_SOCKET_READ\n";
            sock->close();            
         }
         break;
      case SCD_SOCKET_WRITE:
         break;
      case SCD_SOCKET_SSL_INIT_COMPLETED:
            if (sock->receiveSSL(sock->getDefaultBufferSize(), &overlapped->ssl_leftover))
               return;
            std::cout << "failure in server SCD_SOCKET_SSL_INIT_COMPLETED\n";
            sock->close();
         break;
      case SCD_SOCKET_CLOSE_COMPLETED:
            std::cout << "calling delete from server SCD_SOCKET_CLOSE_COMPLETED\n";
            deleteClient(sock);
         break;
      }
   }
   virtual bool doEcho(CSimpleSocket* sock, SIMPLEWSAOVERLAPPED* overlapped)
   {
      // Post the next receive first, then echo back whatever has just arrived.
      if (sock->isSSL() ? sock->receiveSSL(sock->getDefaultBufferSize(), &overlapped->ssl_leftover) : sock->receive(sock->getDefaultBufferSize()))
         // The parentheses matter: ?: binds weaker than ||, so without them an empty read would still call sendSSL().
         if (!overlapped->bytesPassed || (sock->isSSL() ? sock->sendSSL(overlapped->bytesPassed, &overlapped->buffer[0]) : sock->send(overlapped->bytesPassed, &overlapped->buffer[0])))
            return true;
      return false;
   }
   void deleteClient(CSimpleSocket* sock)
   {
      auto found = std::find_if(m_clients.begin(), m_clients.end(), [&](CSimpleClientContext* cmp) { return &cmp->getSocket() == sock;} );
      if (found != m_clients.end())
      {
            delete *found;
            m_clients.erase(found);
      }
   }
private:
   CSimpleIOCPPool                     m_iocpPool;
   std::list<CSimpleClientContext*>    m_clients;
   std::list<CSimpleServer*>           m_servers;
}; 

Now let's go through the code and explain what each part is for. First, you see the declaration of CSimpleServer, which is derived from CSimpleServerSocket. In most cases you'll never need this, as you'll have only one server in your application, or all servers will behave the same way. In this example, however, we need to distinguish servers by their function: some echo servers use plain communication and some use SSL. Of course, in the accept callback we could still distinguish these servers by checking their bound address and port, but inheritance is the C++ way of doing things here (and simpler, by the way). So we just added a flag member indicating whether sockets connected to this server should use SSL.

Next, we declared CSimpleClientContext. As you can see, this is just a stub with a single member, CSimpleSocket. For the purposes of the echo server it is not strictly needed, and we could have used CSimpleSocket itself, but I wanted to illustrate one way to create a client context and then use it.

And here we finally come to CEchoServer. Let's look at its constructor body:

m_iocpPool += DELEGATE(onIOCPEvent);  
m_iocpPool += DELEGATE(onSocketEvent);  

These two lines establish the callback functions for m_iocpPool. CSimpleIOCP (as well as a pool of them) has two callback types: one for IOCP events and one for socket events. You should always implement at least the socket events callback. The IOCP events callback is not required, but it is often very helpful. There is also a trick you can use (although I don't see a reason to): each individual CSimpleIOCP instance inside a CSimpleIOCPPool can have a different callback. You can set that callback at any time after the member instance is created, for example:

m_iocpPool.getAt(1) += DELEGATE(onIOCPEvent1); 
m_iocpPool.getAt(2) += DELEGATE(onIOCPEvent2); 
...
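
The registration pattern can be mimicked in standard C++ as follows. This is only a sketch: MiniIocp and the two handler aliases are hypothetical stand-ins, and std::function is used here in place of the FastDelegate types of the real library. It shows how overloading operator+= on the callback signature lets one operator fill two different slots.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Stand-in callback types; the real library passes IOCPEVENT and
// SIMPLEWSAOVERLAPPED pointers and uses FastDelegate instead of std::function.
using IocpEventHandler   = std::function<void(unsigned long eventId)>;
using SocketEventHandler = std::function<void(int socketId, unsigned long eventId)>;

class MiniIocp // hypothetical, for illustration only
{
public:
    // The overload is chosen by the signature of the handler being registered.
    MiniIocp& operator+=(IocpEventHandler handler)   { m_onIocpEvent = std::move(handler); return *this; }
    MiniIocp& operator+=(SocketEventHandler handler) { m_onSocketEvent = std::move(handler); return *this; }

    void fireIocpEvent(unsigned long eventId)
    {
        if (m_onIocpEvent) // the IOCP events callback is optional
            m_onIocpEvent(eventId);
    }
    void fireSocketEvent(int socketId, unsigned long eventId)
    {
        assert(m_onSocketEvent != nullptr); // the socket events callback is mandatory
        m_onSocketEvent(socketId, eventId);
    }
private:
    IocpEventHandler   m_onIocpEvent;
    SocketEventHandler m_onSocketEvent;
};
```

A caller would register handlers with `iocp += IocpEventHandler(...)` and `iocp += SocketEventHandler(...)`, in the same spirit as the DELEGATE lines above.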

Now let's skip the destructor (obviously it calls the cleanup code) and look at the startServer function. The following two lines set up the server behavior:

*server += DELEGATE(onServerAccept); 
*server += DELEGATE(onConnectionVerify);   

The first one sets up a callback function that is called when the listening server accepts a connection. If you look at the implementation of onServerAccept, you'll see that all it does is call

m_iocpPool.addSocket(acceptedSocket, &server);  

and return the result of that operation. Most of the time onServerAccept will be an exact copy of this code. Of course, you can add additional processing in this function, but that is better done elsewhere, because onServerAccept is called from the server thread. The addSocket function of the pool (as well as of a single CSimpleIOCP) will in turn call the onIOCPEvent delegate with the eventID parameter set to SCD_SOCKET_ADD, and that is the most convenient place for additional processing, because it is called from the IOCP thread rather than the server thread.

Most of the time onServerAccept will be the only delegate you need to implement for the server. Sometimes, however, you need to filter incoming connections: for example, you may want to blacklist some IPs, limit server load (i.e. not accept more than X connections), or refuse new connections when QoS is low. In that case you provide a delegate that returns true or false depending on whether you want to accept the connection. onConnectionVerify is an example of such a delegate. In the example code we refuse all connections from IP "127.0.0.15".

The next line is pretty obvious: we start the server on the specified IP/port:
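
As a sketch of how such filtering rules could be combined, the following standalone policy object checks a blacklist and a connection limit. ConnectionPolicy and its members are hypothetical names introduced here; in the article's code this logic would sit inside onConnectionVerify, with the remote IP taken from verifyData.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <string>

// Hypothetical policy object (not part of SimpleNet.h).
struct ConnectionPolicy
{
    std::set<std::string> blacklist;          // remote IPs that are always refused
    std::size_t           maxConnections = 0; // upper bound on simultaneous clients

    bool shouldAccept(const std::string& remoteIp, std::size_t currentConnections) const
    {
        assert(!remoteIp.empty());
        if (blacklist.count(remoteIp) != 0)
            return false;                           // blacklisted peer
        return currentConnections < maxConnections; // refuse when the server is full
    }
};
```

The return value maps directly onto the true/false result expected from the verification delegate.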

server->startServerOn(pAddr, uPort) 

Now to the cleanup code (the stop() function). It is important to follow the specified cleanup order. First we bring down all servers, so that no new incoming connections cause calls into our IOCPs (deleting a server object also closes it). Then we clean up the pool: the close() function properly destroys all internal "overlapped" buffers. Note that calling close() on a CSimpleIOCPPool (or a single CSimpleIOCP) will in turn call the onIOCPEvent delegate (for each IOCP member instance) with the eventID parameter set to SCD_SOCKET_IOCP_STOP_REQUESTED, and that is where you have to call close() on all attached sockets. The IOCP itself does not keep a list of attached sockets (that would have required some synchronization, which we are trying to avoid), but it provides a callback informing you that it is time to close them. Finally, once the pool is closed, we delete any remaining client contexts.

Let's proceed to the next function, setIOCPInitialCount. All it does is set the initial number of CSimpleIOCP instances in the CSimpleIOCPPool. The pool is created with zero IOCPs, so you need to set the count to at least one to allow IOCP processing. This number can be as big as you want. In practice, it is recommended to have one IOCP per 1,000-5,000 sockets. So, if you anticipate a load of 1 million simultaneously connected sockets, you would set this number to at least 200, i.e. 1,000,000 / 5,000 (although this would mean that you need at least a 4-CPU system).

And now we come to the two functions (delegates) that do the actual work. Let's start with onIOCPEvent. In this delegate the eventID parameter can have the following values:
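
The sizing rule is a simple rounded-up division. The helper below is a hypothetical illustration (iocpCountFor is not a library function); its result is the kind of value one would pass to setIOCPInitialCount.

```cpp
#include <cassert>
#include <cstddef>

// Rounds up, so that no IOCP has to serve more than socketsPerIocp sockets.
std::size_t iocpCountFor(std::size_t expectedSockets, std::size_t socketsPerIocp)
{
    assert(socketsPerIocp > 0);
    if (expectedSockets == 0)
        return 1; // the pool needs at least one IOCP to process anything
    return (expectedSockets + socketsPerIocp - 1) / socketsPerIocp;
}
```

For 1,000,000 sockets this gives 200 IOCPs at 5,000 sockets per IOCP and 1,000 IOCPs at 1,000 sockets per IOCP.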

SCD_SOCKET_ADD
SCD_SOCKET_IOCP_STOP_REQUESTED
SCD_SOCKET_USER_EVENT  

As described previously, the SCD_SOCKET_ADD event results from the call to addSocket, and this is where we associate the (already connected) SOCKET sent by the server with our IOCP:

sock.initSocket(iocp, eventData->data.sockData.sock) 

The first parameter is a pointer to the IOCP member that the pool decided to use (more on that later). If initSocket is called with this parameter only, the socket object creates a new, unconnected SOCKET and attaches it to the provided IOCP. If initSocket's second parameter is a valid SOCKET, the socket object takes ownership of it and attaches it to the provided IOCP. In our case the SOCKET was already created by the server and is connected, so we just attach it to the IOCP. Keep in mind that if the call to initSocket fails, the SOCKET is not attached to the IOCP and the close event will never be raised for it, so the only option we have is to destroy the SOCKET using the Windows API:

if (sock.initSocket(iocp, eventData->data.sockData.sock))
{
 ...
}
else
 ::closesocket(eventData->data.sockData.sock);  
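
The same "close it yourself unless ownership was handed over" rule can be expressed with a small scope guard. This is a sketch under assumptions: HandleGuard is a hypothetical class, the handle is a plain int, and the closing function is injected so that the example stays portable; with Winsock the closer would call ::closesocket.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Closes a raw handle on scope exit unless ownership was handed over.
class HandleGuard // hypothetical, not part of SimpleNet.h
{
public:
    HandleGuard(int handle, std::function<void(int)> closer)
        : m_handle(handle), m_closer(std::move(closer)), m_owned(true)
    {
        assert(m_closer != nullptr);
    }
    ~HandleGuard()
    {
        if (m_owned)
            m_closer(m_handle); // nobody took the handle over: destroy it here
    }
    HandleGuard(const HandleGuard&) = delete;
    HandleGuard& operator=(const HandleGuard&) = delete;

    // Call after a successful hand-over (the initSocket() success branch above).
    int release()
    {
        m_owned = false;
        return m_handle;
    }
private:
    int                      m_handle;
    std::function<void(int)> m_closer;
    bool                     m_owned;
};
```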

The rest of the code for SCD_SOCKET_ADD is pretty simple: we either call initSSL() on our new socket object (if it should support SSL), or start receiving data on the socket via a call to receive(). The receive() function has two parameters: the maximum size of the data chunk it can receive, and a pointer to the buffer that receives the data. As you can see, in this example the first parameter is set to the socket's getDefaultBufferSize(). The second parameter is the interesting one. If the socket is blocking, this parameter cannot be NULL: it is where the data is placed by a blocking receive. For a non-blocking (IOCP) call this parameter can be omitted, as in this example. If it is not provided, data is received into an internal buffer, and you access it during the SCD_SOCKET_READ event in the onSocketEvent delegate via the "overlapped->buffer" variable. This is the easiest way of receiving and sending data, but not the most efficient one: you have to copy the data from "overlapped->buffer" into your client context or somewhere else for further processing. So, for non-SSL connections, you have the option to pass your own buffer as the second parameter of receive(); data is then placed directly there instead of in "overlapped->buffer", which saves the CPU cost of copying.

The initSSL() function initiates the SSL handshake. The first parameter (bool) specifies whether this should be the server-side sequence ("false") or the client-side one ("true"). The second (optional) parameter specifies a certificate context, if you have one. If it is not provided (NULL is passed), the system generates a self-signed certificate and uses it. The last two parameters ("leftoverData" and "blockingResult") are used only for blocking calls. They are ignored for an IOCP socket, but let's cover them anyway. Once the handshake completes, there can still be some data past the handshake data. For example, the other party could have sent some encoded data together with its last handshake part, and that additional data arrives together with the handshake data. The socket attempts to decode that additional data. The decoded result (if any) is placed into "blockingResult", and anything past that (still encoded) is placed into "leftoverData". On subsequent calls to receiveSSL() you should always supply any non-decoded leftover data as the second parameter; otherwise the result will be corrupted (part of the data to decode will be missing).

As mentioned before, the SCD_SOCKET_IOCP_STOP_REQUESTED event is sent once for each IOCP when close() is called on that IOCP (or when requestStop() is called on it). It is the user's responsibility to react to this event by closing all sockets related to that IOCP; otherwise close() might wait forever for the sockets to be closed.

The last possible event for this handler is SCD_SOCKET_USER_EVENT. This example does not use it, but it can be quite helpful. You submit it by calling postUserEvent() on the IOCP. When you call postUserEvent() on the pool, the event is broadcast to each IOCP in the pool. The reason to use it is that you can execute something in the context of the IOCP thread instead of your (working/main) thread.

How does the IOCP pool decide which IOCP member to use during an addSocket() call? It chooses the IOCP with the minimum value of its m_load member. You can set this value with setLoad() for an individual IOCP. The number can be the number of sockets, or the bytes processed by this IOCP in the last X minutes, or something else. Keep in mind that if you never call setLoad(), m_load is always zero, and the pool will always choose the same member even if you have more than one. So it is important to indicate the load for each IOCP somehow. The simplest way is to make this number equal to the number of attached sockets, i.e. increase it on the SCD_SOCKET_ADD event and decrease it on the SCD_SOCKET_CLOSE_COMPLETED event (in the onSocketEvent delegate).

Now let's move on to the onSocketEvent delegate. In this delegate the socketEvent parameter can have the following values:
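
The general idea of running work on a thread that owns some state, rather than on the calling thread, can be sketched in portable C++. TaskQueue below is a hypothetical stand-in: a mutex-protected queue takes the place of whatever mechanism the library uses to reach the IOCP thread, post() plays the role of postUserEvent(), and runPending() is what the owning thread would call from its loop.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <queue>
#include <utility>

class TaskQueue // hypothetical, for illustration only
{
public:
    // May be called from any thread: hands a task over to the owning thread.
    void post(std::function<void()> task)
    {
        assert(task != nullptr);
        std::lock_guard<std::mutex> lock(m_mutex);
        m_tasks.push(std::move(task));
    }

    // Called only by the owning thread: runs everything queued so far, in
    // posting order, and returns the number of tasks executed.
    std::size_t runPending()
    {
        std::queue<std::function<void()>> batch;
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            std::swap(batch, m_tasks); // tasks run outside the lock
        }
        const std::size_t count = batch.size();
        while (!batch.empty())
        {
            batch.front()();
            batch.pop();
        }
        return count;
    }
private:
    std::mutex                        m_mutex;
    std::queue<std::function<void()>> m_tasks;
};
```

Because the tasks execute on the owning thread, they can touch that thread's sockets without any additional synchronization.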
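
The selection rule and the simplest load bookkeeping can be sketched like this. IocpSlot and pickLeastLoaded are hypothetical names for illustration; the sketch resolves ties by taking the first slot, which is one way to end up with "always the same member" when every load is zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

struct IocpSlot // hypothetical stand-in for one pool member
{
    long load = 0; // here: number of attached sockets
};

// Returns the index of the slot with the smallest load (first one on ties).
std::size_t pickLeastLoaded(const std::vector<IocpSlot>& pool)
{
    assert(!pool.empty());
    auto it = std::min_element(pool.begin(), pool.end(),
        [](const IocpSlot& a, const IocpSlot& b) { return a.load < b.load; });
    return static_cast<std::size_t>(it - pool.begin());
}
```

Incrementing load when a socket is added and decrementing it when the socket's close completes makes consecutive picks spread new sockets across the slots.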

SCD_SOCKET_CONNECTED 
SCD_SOCKET_READ
SCD_SOCKET_WRITE
SCD_SOCKET_SSL_INIT_COMPLETED
SCD_SOCKET_CLOSE_COMPLETED 

The SCD_SOCKET_CONNECTED event is not fired on the server side, as the SOCKET is already connected, so our example echo server does not track it. In client code, however, you want to handle this event so that you can start your program logic (see the use cases for an example). Now let's start with the simple one: the SCD_SOCKET_WRITE event. In practice you rarely have a reason to react to this event: it just informs you that writing to the socket has completed and the IOCP has passed the data to the TCP layer. The data that was written is available in "overlapped->buffer", and the number of bytes written is recorded in "overlapped->bytesPassed".

SCD_SOCKET_CLOSE_COMPLETED was mentioned in previous paragraphs. Here let's just note that it is absolutely safe to delete the socket once this event has been received: at this point the socket is detached from the IOCP and completely closed. In the example code the client context that owns the socket is simply deleted on this event.

The SCD_SOCKET_SSL_INIT_COMPLETED event is received when the SSL handshake has completed. If you read the description of initSSL() above, you should remember the two parameters for blocking calls, "leftoverData" and "blockingResult". For a non-blocking (IOCP) call you always get the decoded data in "overlapped->buffer" and the leftover data (if any) in "overlapped->ssl_leftover". In this example we ignore the initial decoded data (as there shouldn't be any) and just start reading the incoming stream.

And now to the most interesting event: SCD_SOCKET_READ. This is where we get the received data. For a non-SSL connection this is easy: you get the data in "overlapped->buffer" (or in the buffer that you supplied in the call to receive()) and its length in "overlapped->bytesPassed", and then you process the data. For an SSL read this is a little more complicated. The socket tries to receive enough data to decode it. If it receives garbage, or is unable to decode the data at all, it closes the connection and never raises SCD_SOCKET_READ. But if it is able to decode the received data, it places the decoded data into "overlapped->buffer" and its length into "overlapped->bytesPassed" (just like a regular read), and anything past that is placed into "overlapped->ssl_leftover". That leftover should be supplied to the next call to receiveSSL().

As you can see, in the example I moved the actual echoing code into the doEcho() virtual function, so you can play with it via inheritance. For example, you can try to echo reversed data (i.e. reverse every 10 received bytes and send them back that way).

Now that we are done with the server, most of the functionality has been explained, and you should have no problem understanding the client code. Here is an example of a plain non-SSL IOCP client:
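
Why the leftover has to be carried into the next read is easiest to see with a toy framing. In the sketch below a "record" is one length byte followed by that many payload bytes; this is an assumption chosen for illustration only and has nothing to do with the real SSL record format or with Windows decryption. decodeComplete is a hypothetical function name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Toy stand-in for SSL decoding: returns the payload of every complete record
// and stores the bytes of a trailing incomplete record back into 'leftover'.
std::string decodeComplete(std::string& leftover, const std::string& received)
{
    const std::string pending = leftover + received; // leftover always goes first
    std::string decoded;
    std::size_t pos = 0;
    while (pos < pending.size())
    {
        const std::size_t len = static_cast<unsigned char>(pending[pos]);
        if (pos + 1 + len > pending.size())
            break; // incomplete record: wait for more data
        decoded.append(pending, pos + 1, len);
        pos += 1 + len;
    }
    assert(pos <= pending.size());
    leftover = pending.substr(pos); // must be fed into the next call
    return decoded;
}
```

If the caller dropped the leftover between two calls, the second call would start in the middle of a record and misinterpret everything that follows.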
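
The byte manipulation for that experiment could look like the following sketch. reverseChunks is a hypothetical helper; an overriding doEcho() would apply it to the received bytes before sending them back.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Reverses each consecutive chunk of chunkSize bytes; a shorter final chunk
// is reversed as well.
std::string reverseChunks(std::string data, std::size_t chunkSize = 10)
{
    assert(chunkSize > 0);
    for (std::size_t pos = 0; pos < data.size(); pos += chunkSize)
    {
        const std::size_t end = std::min(pos + chunkSize, data.size());
        std::reverse(data.begin() + pos, data.begin() + end);
    }
    return data;
}
```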

void useCase3()
{
   class CMyActivity
   {
   public:
      void  onSocketEvent(CSimpleSocket* sock, SIMPLEWSAOVERLAPPED* overlapped, DWORD socketEvent)
      {
         switch (socketEvent)
         {
            case SCD_SOCKET_CONNECTED:
               m_count = 0;
               if (sock->send("test0") && sock->receive(sock->getDefaultBufferSize()))
                  return;
               sock->close(); // in case of failure - exit
               break;
            case SCD_SOCKET_READ:               
               std::cout << "client received:" << overlapped->buffer.c_str() << "\n";
               m_count++;
               if (m_count < 10)
               {
                  std::string msg = "test" + std::to_string((long long)m_count);
                  if (sock->send(msg) && sock->receive(sock->getDefaultBufferSize()))
                     return;
                  std::cout << "client failure on SCD_SOCKET_READ, iteration" << m_count << "\n";
               }
               sock->close(); // done sending/receiving
               break;
            case SCD_SOCKET_WRITE:
               std::cout << "client sent:" << overlapped->buffer.c_str() << "\n";
               break;
            case SCD_SOCKET_CLOSE_COMPLETED:
               m_event.signalEvent(); // socket is fully closed - wake up the thread waiting in useCase3()
               break;
         }
      }      
      CSimpleAutoEvent  m_event; // signaled once the socket has been completely closed
      int m_count;
   };
   CSimpleIOCP    iocp;
   CSimpleSocket  sockClient;
   CMyActivity    activity;
   iocp += DELEGATE_ANY(&activity, CMyActivity::onSocketEvent);
   iocp.init();
   if (sockClient.connect("127.0.0.1", 27015, &iocp))
      activity.m_event.waitForEvent();
   iocp.close();   
} 

You can find other examples in the download. 

One function that was not covered in this article is getAdapterInfoVector(). It returns a vector of the active network adapters with some (limited) information about them; right now this is only the IPv4 address and MTU size. The vector is populated once during network startup and is then returned every time you call this function. If you want to repopulate the list (say, an adapter was hot-plugged), call this function with the "repopulate" parameter set to "true".

Also, you can bind a client to a specific adapter. For example, you can change the connect call in the client example above to

sockClient.connect("127.0.0.1", 27015, &iocp, "127.0.0.15")

 

Files included in download: 

1. SimpleNet.h – the actual IOCP sockets implementation

2. FastDelegate.h – delegates callback functionality (Don Clugston code)

3. SimpleStorage.h – helper file for buffered interlocked list implementation (used as a basis for overlapped structure management)

4. SimpleCriticalSection.h – helper file for lightweight critical section implementation (used in CSimpleNetManager for net initialization etc.)

5. SimpleThreadDefines.h – helper file for compiler intrinsics 

So, for your project you'll need only these 5 files. The download has 2 more files for your convenience:

6. SimpleNetTest.cpp – usage example and few test cases

7. SimpleNetTest.vcxproj – project file for VC10.

Current Issues 

1. No documentation. The only documentation in existence is this article, and many functions are not covered. Hopefully, though, they are self-explanatory, and usage examples are sufficient.

2. Only IPv4 is currently supported, although adding IPv6 should be very easy.

3. Adapter info can and should be extended. Right now it provides only IPV4 address and MTU size (which becomes default send/receive packet size for the socket).

4. More service functions are needed, such as socket options setting/resetting.

5. Only the TCP streaming protocol is supported. I'm not sure if I will ever implement UDP.

6. SSL data is verified only for header correctness and maximum length. If bogus data is sent with a correct header and a length below the maximum, this situation is not detected, as the Windows SSL decoding function does not detect it.

On a side note, I'm not sure if I'll ever continue to develop this class. In light of recent revelations, I believe that the SSL protocol itself, as well as its implementation in Windows, has backdoors for the NSA, so I see no sense in using it unless you don't care about the security of your data. For example, you can use this class for a game server or client, but I would strongly suggest never using Windows or SSL for any type of sensitive information, as I consider it transparent to the US government. Personally, I hope to move to the Linux world some time in the future.

History 

Version 1.0 - initial implementation. 

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

About the Author

Kosta Cherry

United States

Last Updated 17 Sep 2013
Article Copyright 2013 by Kosta Cherry