
The Differences Between Network Calls in Windows and Linux

29 Dec 2010, CPOL
This article describes the differences in socket usage under BSD and Windows.

The Berkeley and Microsoft socket models are mostly compatible at the source code level, but they are far less portable in practice.

Let’s examine some subtle differences in their implementations. These differences were found while writing a cross-platform RPC that redirects the network calls of a process from one OS to another.

Socket Types

  1. BSD:
    int
  2. Win:
    UINT_PTR // typedef SOCKET

On 32-bit systems both types have the same size, so descriptors map onto each other without problems. On 64-bit Windows, the SOCKET type is twice as large as int.

A socket descriptor on BSD is no different from a file descriptor. This means that some system calls accept both socket and file descriptors (for example, such commonly used calls as close(), fcntl(), and ioctl()).

There is one more side effect that shows up in some cases. Systems that follow the Berkeley model hand out small socket descriptor values (typically less than 100), and descriptors created in succession differ by 1. In the Microsoft model, descriptor values typically start above 200, and descriptors created in succession differ by sizeof(SOCKET).

Error Handling

  1. BSD: Calls return -1 and set the global variable errno.
  2. Win: Calls return -1 (SOCKET_ERROR macro); the status is retrieved with WSAGetLastError().

errno constants and Windows error codes have completely different values.
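To illustrate the kind of translation this requires, here is a minimal sketch of an error-code mapping. The names WsaError and wsaToErrno are made up for this example, only a handful of codes are covered, and the EIO fallback is an arbitrary choice; the numeric values are the ones defined in winerror.h.

```cpp
#include <cerrno>

// Winsock error codes, declared under their own names so that they do not
// clash with the errno constants from <cerrno>. Values as in winerror.h.
enum WsaError
{
    WSA_EINTR        = 10004,
    WSA_EWOULDBLOCK  = 10035,
    WSA_EINPROGRESS  = 10036,
    WSA_ENOTSOCK     = 10038,
    WSA_EADDRINUSE   = 10048,
    WSA_ECONNRESET   = 10054,
    WSA_ETIMEDOUT    = 10060,
    WSA_ECONNREFUSED = 10061
};

// Translate a WSAGetLastError() value into the closest errno constant.
// Codes without an entry fall back to EIO (an assumption of this sketch).
int wsaToErrno(int wsa)
{
    switch (wsa)
    {
    case 0:                return 0;
    case WSA_EINTR:        return EINTR;
    case WSA_EWOULDBLOCK:  return EWOULDBLOCK;
    case WSA_EINPROGRESS:  return EINPROGRESS;
    case WSA_ENOTSOCK:     return ENOTSOCK;
    case WSA_EADDRINUSE:   return EADDRINUSE;
    case WSA_ECONNRESET:   return ECONNRESET;
    case WSA_ETIMEDOUT:    return ETIMEDOUT;
    case WSA_ECONNREFUSED: return ECONNREFUSED;
    default:               return EIO;
    }
}
```

A real table would need to cover every code the redirected calls can produce; the switch keeps the mapping explicit and easy to audit.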

Socket Creation

  socket(int af, int type, int protocol);

Constants for the first argument do not all match between the systems: for example, AF_INET6 is 10 on Linux and 23 on Windows. Constants for the second argument coincide so far.

Socket Setting

  1. BSD:
      getsockopt(int sockfd, int level, int option_name, 
    	void *option_value, socklen_t *option_len); 
      setsockopt(int sockfd, int level, int option_name, 
    	void const *option_value, socklen_t option_len); 
  2. Win:
      getsockopt(SOCKET sock, int level, int option_name, 
    	char *option_value, int *option_len);
      setsockopt(SOCKET sock, int level, int option_name, 
    	char const *option_value, int option_len);

Constants for the second and third arguments (level and option name) have different values on BSD and Windows: for example, SOL_SOCKET is 1 on Linux and 0xFFFF on Windows. Note also that Windows uses char * for the option value and int for its length.

Socket Setting 2

  1. BSD:
     fcntl(int fd, int cmd, ...); 
  2. Win:
     ioctlsocket(SOCKET sock, long cmd, long unsigned *arg); 

The only completely correct correspondence is as follows:

  fcntl(descriptor, F_SETFL, O_NONBLOCK) -> ioctlsocket
	(descriptor, FIONBIO, address of a variable holding a non-zero value). 

Flag numerical values must be taken from the target system (they differ between BSD and Windows).

For a call of the form fcntl(descriptor, F_GETFL), we can simply return 0 or O_RDWR.
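The correspondence above can be sketched as a small translation step on the Linux side. The IoctlSocketRequest structure and the translateFcntl function are hypothetical names introduced for this example; WIN_FIONBIO holds the value that FIONBIO expands to in the Windows headers.

```cpp
#include <cstdint>
#include <fcntl.h>   // F_SETFL, O_NONBLOCK (Linux values)

// Value of the Winsock FIONBIO command, kept under a separate name so that
// it does not clash with the Linux FIONBIO from <sys/ioctl.h>.
std::uint32_t const WIN_FIONBIO = 0x8004667Eu;

// Hypothetical request sent to the Windows side, which would then call
// ioctlsocket(sock, cmd, &arg). Fixed-width fields, because long is
// 32 bits on Windows but may be 64 bits on Linux.
struct IoctlSocketRequest
{
    std::uint32_t cmd;
    std::uint32_t arg;
};

// Translate fcntl(fd, F_SETFL, flags) into an ioctlsocket() request.
// Returns false for commands that have no ioctlsocket() counterpart.
bool translateFcntl(int cmd, int flags, IoctlSocketRequest &out)
{
    if (F_SETFL != cmd)
        return false;

    out.cmd = WIN_FIONBIO;
    // FIONBIO expects non-zero to enable non-blocking mode, zero to disable.
    out.arg = (flags & O_NONBLOCK) ? 1u : 0u;

    return true;
}
```

Only the O_NONBLOCK bit is carried over; any other F_SETFL flags are silently dropped in this sketch.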

Socket Setting 3

  1. BSD:
     ioctl(int fd, int cmd, ...); 
  2. Win:
     ioctlsocket(SOCKET sock, long cmd, long unsigned *arg); 

So far, we have not come across real-world uses of ioctl() with a socket as the first argument.

Work with DNS

  getaddrinfo(char const *node, char const *service, 
	struct addrinfo const *hints, struct addrinfo **res) 
  1. BSD:
    struct addrinfo
    {
        int ai_flags;
        int ai_family;
        int ai_socktype;
        int ai_protocol;
        socklen_t ai_addrlen;
        struct sockaddr *ai_addr;
        char *ai_canonname;
        struct addrinfo *ai_next;
    };
  2. Win:
    typedef struct addrinfo
    {
        int ai_flags;
        int ai_family;
        int ai_socktype;
        int ai_protocol;
        size_t ai_addrlen;
        char *ai_canonname;
        struct sockaddr_ *ai_addr;
        struct addrinfo_ *ai_next;
    } ADDRINFOA, *PADDRINFOA;

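Pay attention to the members of these structures. ai_addr and ai_canonname have different offsets from the beginning of the structure: the developers simply swapped them (or mixed them up?). The type of ai_addrlen differs as well (socklen_t versus size_t), so the structures cannot be copied byte for byte.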
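Because of the swapped members, a conversion between the two layouts is safer when done field by field. Below is a minimal sketch; WinAddrInfo is a hypothetical mirror of the Windows structure declared on the Linux side, and copyScalarFields is a made-up helper name.

```cpp
#include <cstddef>       // std::size_t
#include <netdb.h>       // struct addrinfo (Linux layout)
#include <sys/socket.h>  // socklen_t, struct sockaddr

// Mirror of the Windows ADDRINFOA layout under a hypothetical name.
struct WinAddrInfo
{
    int ai_flags;
    int ai_family;
    int ai_socktype;
    int ai_protocol;
    std::size_t ai_addrlen;
    char *ai_canonname;          // comes BEFORE ai_addr on Windows
    struct sockaddr *ai_addr;
    WinAddrInfo *ai_next;
};

// Copy the scalar members by name, so the differing offsets do not matter.
// Pointer members are reset to null: the caller is expected to deep-copy
// the address and the canonical name separately. Constant translation
// (for example, of ai_family) is left out of this sketch.
void copyScalarFields(WinAddrInfo const &win, struct addrinfo &bsd)
{
    bsd.ai_flags = win.ai_flags;
    bsd.ai_family = win.ai_family;
    bsd.ai_socktype = win.ai_socktype;
    bsd.ai_protocol = win.ai_protocol;
    bsd.ai_addrlen = static_cast<socklen_t>(win.ai_addrlen);
    bsd.ai_addr = 0;
    bsd.ai_canonname = 0;
    bsd.ai_next = 0;
}
```

Copying by member name keeps the code correct even if the pointer size or padding differs between the two sides.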

Data Transfer

  1. BSD:
      recv(int sockfd, void  *buffer, size_t length, int flags);
      recvfrom(int sockfd,  void *buffer, size_t length, 
    	int flags, struct sockaddr *from, socklen_t  *fromlen);
      send(int sockfd, void  const *buffer, size_t length, int flags);
      sendto(int sockfd, void  const *buffer, size_t length, 
    	int flags, struct sockaddr const *to, socklen_t  tolen);
  2. Win:
      recv(SOCKET sock, char *buffer, int length, int flags);
      recvfrom(SOCKET sock, char *buffer, int length, 
    	int flags, struct sockaddr *from, int *fromlen);
      send(SOCKET sock, char const *buffer, int length, int flags);
      sendto(SOCKET sock, char const *buffer, int length, 
    	int flags, struct sockaddr const *to, int tolen);

Some flags for the fourth argument have different values on BSD and Windows: for example, MSG_WAITALL is 0x100 on Linux and 0x8 on Windows, while MSG_OOB and MSG_PEEK coincide. Note also that the Windows versions take char * buffers and int lengths.

Waiting for Operations

  1. BSD:
     poll(struct pollfd *fds, nfds_t nfds, int  timeout);
    struct pollfd
    {
        int fd;
        short events;
        short revents;
    };
  2. Win:
     WSAPoll(WSAPOLLFD *fds, ULONG nfds, INT timeout);
    typedef struct pollfd
    {
        SOCKET fd;
        SHORT events;
        SHORT revents;
    } WSAPOLLFD, *PWSAPOLLFD;

Flag constants for the second and third members of the pollfd structure (events and revents) have completely different values on BSD and Windows. WSAPoll() is available only on Windows Vista (version 6.0) and later.
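A table-driven conversion is one way to translate these flags in both directions. The sketch below is an illustration only: the WIN_* names, FlagPair, pollToWsaPoll, and wsaPollToPoll are invented for this example, and the Windows values are the ones from winsock2.h as far as I know them.

```cpp
#include <cstddef>
#include <poll.h>   // POLLIN, POLLOUT, ... (Linux values)

// Windows poll flags under separate names (values as in winsock2.h).
short const WIN_POLLERR  = 0x0001;
short const WIN_POLLHUP  = 0x0002;
short const WIN_POLLNVAL = 0x0004;
short const WIN_POLLOUT  = 0x0010;  // same as POLLWRNORM
short const WIN_POLLIN   = 0x0300;  // POLLRDNORM | POLLRDBAND

struct FlagPair
{
    short bsd;
    short win;
};

static FlagPair const kPollFlags[] =
{
    { POLLIN,   WIN_POLLIN   },
    { POLLOUT,  WIN_POLLOUT  },
    { POLLERR,  WIN_POLLERR  },
    { POLLHUP,  WIN_POLLHUP  },
    { POLLNVAL, WIN_POLLNVAL }
};

static std::size_t const kPollFlagsCount =
    sizeof(kPollFlags) / sizeof(kPollFlags[0]);

// Linux events/revents -> Windows events/revents.
short pollToWsaPoll(short bsd)
{
    int win = 0;

    for (std::size_t i = 0; i < kPollFlagsCount; ++i)
        if (bsd & kPollFlags[i].bsd)
            win |= kPollFlags[i].win;

    return static_cast<short>(win);
}

// Windows events/revents -> Linux events/revents. Any bit of a Windows
// mask counts, so POLLRDNORM alone is reported as POLLIN.
short wsaPollToPoll(short win)
{
    int bsd = 0;

    for (std::size_t i = 0; i < kPollFlagsCount; ++i)
        if (win & kPollFlags[i].win)
            bsd |= kPollFlags[i].bsd;

    return static_cast<short>(bsd);
}
```

Flags that are not in the table (such as POLLPRI) are dropped in this sketch; a complete implementation would have to decide how to handle each of them.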

Waiting for Operations 2

  1. BSD:
    select(int nfds, fd_set *readfds, fd_set *writefds, 
    	fd_set  *errorfds, struct timeval *timeout); 
    typedef struct
    {
        long fds_bits[FD_SETSIZE / (8 * sizeof(long))];
    } fd_set;
  2. Win:
     select(int nfds, FDSET *readfds, FDSET *writefds, 
    	FDSET *errorfds,  struct timeval *timeout); 
    typedef struct fd_set
    {
        unsigned fd_count;
        SOCKET fd_array[FD_SETSIZE];
    } FDSET, *PFDSET;

The problem with select() arises when mapping one fd_set structure onto the other. Let’s recall how select() works. The call accepts three sets of sockets to check for readability, writability, and errors within a given period of time. A socket is added to one of these sets with the FD_SET(socket, set) macro. To check whether a socket is in a set, use the FD_ISSET(socket, set) macro; to remove one socket from a set, use the FD_CLR(socket, set) macro; to remove all sockets, use the FD_ZERO(set) macro. On return, select() leaves in each set only those sockets that reached the expected state before the timeout given by the last argument expired.

On BSD, adding a socket to a set means setting the bit whose number equals the socket descriptor. FD_SETSIZE is usually equal to 1024. The first select() argument is one greater than the largest descriptor value present in any of the three sets. Since the bit in the fds_bits array is set without a range check, the program behavior is undefined for a descriptor value equal to or greater than FD_SETSIZE. This rather fragile design of select() is a legacy of computers with little memory. For the same reason, the conversion int -> SOCKET and back has to go through a lookup table rather than a direct cast.

On Windows, adding a socket to a set means storing it in the fd_array array at index fd_count and then incrementing fd_count. FD_SETSIZE is usually equal to 64. The first select() argument is ignored entirely.
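The two representations can be converted into each other with the help of a descriptor table. The following sketch shows one possible way; WinSocket, WinFdSet, SocketsMap, bsdToWin, and winToBsd are hypothetical names, and the Windows structure is mirrored by hand with its usual FD_SETSIZE of 64.

```cpp
#include <cstdint>
#include <map>
#include <sys/select.h>  // fd_set and FD_* macros (Linux side)

typedef std::uintptr_t WinSocket;      // stand-in for the Windows SOCKET
unsigned const WIN_FD_SETSIZE = 64;

// Mirror of the Windows fd_set: a counted array of handles.
struct WinFdSet
{
    unsigned fd_count;
    WinSocket fd_array[WIN_FD_SETSIZE];
};

// Linux pseudodescriptor -> Windows handle.
typedef std::map<int, WinSocket> SocketsMap;

// Bitmap -> array: collect the handles of all known descriptors
// whose bit is set.
void bsdToWin(fd_set *bsd, WinFdSet &win, SocketsMap const &sockets)
{
    win.fd_count = 0;

    if (!bsd)
        return;

    for (SocketsMap::const_iterator i = sockets.begin();
         i != sockets.end() && win.fd_count < WIN_FD_SETSIZE; ++i)
    {
        int const fd = i->first;

        // Range check before touching the bitmap (see the text above).
        if (fd >= 0 && fd < FD_SETSIZE && FD_ISSET(fd, bsd))
            win.fd_array[win.fd_count++] = i->second;
    }
}

// Array -> bitmap: set the bit of every descriptor whose handle
// is present in the array.
void winToBsd(WinFdSet const &win, fd_set *bsd, SocketsMap const &sockets)
{
    if (!bsd)
        return;

    FD_ZERO(bsd);

    for (unsigned n = 0; n < win.fd_count && n < WIN_FD_SETSIZE; ++n)
        for (SocketsMap::const_iterator i = sockets.begin();
             i != sockets.end(); ++i)
            if (i->second == win.fd_array[n] &&
                i->first >= 0 && i->first < FD_SETSIZE)
            {
                FD_SET(i->first, bsd);
                break;
            }
}
```

The reverse lookup is linear, which is acceptable for at most 64 handles per set; descriptors that are not in the table are simply skipped.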

Implementation Details

Here is some useful code that I used in my project.

First, we assume that standard network calls into the GLibC library have somehow been redirected to our own implementations (for example, see http://apriorit.com/our-company/dev-blog/181-elf-hook). We also have a synchronous RPC mechanism that serializes the parameters and transfers the calls from Linux to Windows. Finally, all required Windows constants are declared under names that do not clash with their Linux counterparts.

Since the socket types differ between the systems, the following class for converting descriptors during call transfer will prove useful:

class SocketsStorage
{
public :
    bool hasSocket(int const fd);
    int addSocket(SOCKET const handle);
    void removeSocket(int const fd);
    SOCKET convert(int const fd);
    int convert(SOCKET const handle);
    
private :
    typedef std::map<int, SOCKET> sockets_map;
    sockets_map map_;
};

Its implementation can look as follows:

bool SocketsStorage::hasSocket(int const fd)
{
    return map_.end() != map_.find(fd);
}

int SocketsStorage::addSocket(SOCKET const handle)
{
    if (INVALID_SOCKET == handle)
        return -1;

    // Big enough to avoid conflicts with real file descriptors,
    // but less than FD_SETSIZE
    static int const min = FD_SETSIZE - FD_SETSIZE / 4;
    static int const max = FD_SETSIZE;
    
    // Take the first free pseudodescriptor in [min, max)
    for (int fd = min; fd < max; ++fd)
    {
        if (map_.end() == map_.find(fd))
        {
            map_[fd] = handle;
            
            return fd;
        }
    }
      
    return -1;  // no free pseudodescriptors left
}

void SocketsStorage::removeSocket(int const fd)
{
    map_.erase(fd);
}

SOCKET SocketsStorage::convert(int const fd)
{
    return hasSocket(fd) ? map_[fd] : INVALID_SOCKET;
}

int SocketsStorage::convert(SOCKET const handle)
{
    // Linear search by value: the map is keyed by pseudodescriptor
    for (sockets_map::iterator i = map_.begin(); map_.end() != i; ++i)
        if (handle == i->second)
            return i->first;
    
    return -1;
}

For the first socket created, the pseudodescriptor will have the value 768 (with FD_SETSIZE equal to 1024). This is well above the values of real descriptors, which start at about 6, but still less than FD_SETSIZE, so that select() can fill its fd_set correctly.

We also need functions that convert constants for certain calls:

void select2WSASelect(fd_set *bsd, fd_set_ *win, sockets_map &sockets);
void WSASelect2select(fd_set_ *win, fd_set *bsd, sockets_map &sockets);
int WSA2errno(int const WSA);
int domain2WSAdomain(int domain);
int WSAdomain2domain(int domain);
void WSA2sockopt(int *level, int *option);
void sockopt2WSA(int *level, int *option);
short WSAPoll2poll(short const flags);
short poll2WSAPoll(short const flags);
int msgFlags2WSAmsgFlags(int flags);
int WSAmsgFlags2msgFlags(int flags);

Here are example implementations of some of the redirected functions:

int socket(int domain, int type, int protocol)
{
    int ret = -1;
        
    errno = 0;
    
    RPC_SOCKET_REQUEST request;
    RPC_SOCKET_RESPONSE response;
    
    request.af = BSD2WSA::domain2WSAdomain(domain);
    request.type = type;
    request.protocol = protocol;
        
    if (response = static_cast<RPC_SOCKET_RESPONSE>(sendSyncRequest(request)))
    {
        // An invalid handle is passed through as an error value
        ret = g_sockets.addSocket(response.socket);
        
        if (INVALID_SOCKET == response.socket)
            errno = BSD2WSA::WSA2errno(response.errno);
    }
    else
        errno = EAGAIN;  // the RPC itself failed

    return ret;
}

int poll(struct pollfd *fds, nfds_t nfds, int timeout)
{
    int ret = -1;
    
    errno = 0;
    
    RPC_POLL_REQUEST request;
    RPC_POLL_RESPONSE response;
    
    request.nfds = nfds;
    request.timeout = timeout;
        
    for (nfds_t i = 0; i < nfds; ++i)
    {
        if (g_sockets.hasSocket(fds[i].fd))
        {
            request.fds[i].fd = g_sockets.convert(fds[i].fd);
            request.fds[i].events = BSD2WSA::poll2WSAPoll(fds[i].events);
        }
        else
        {
            // Unknown descriptor: do not leave the handle uninitialized
            request.fds[i].fd = INVALID_SOCKET;
            request.fds[i].events = 0;
        }
                
        request.fds[i].revents = 0;
    }
        
    if (response = static_cast<RPC_POLL_RESPONSE>(sendSyncRequest(request)))
    {
        ret = response.ret;
        
        if (SOCKET_ERROR == ret)
            errno = BSD2WSA::WSA2errno(response.errno);
        else
            for (nfds_t i = 0; i < nfds; ++i)
                if (g_sockets.hasSocket(fds[i].fd))
                    fds[i].revents = BSD2WSA::WSAPoll2poll(response.fds[i].revents);
    }
    else
        errno = EAGAIN;
       
    return ret;
}

int getaddrinfo(char const *node, char const *service, 
	struct addrinfo const *hints, struct addrinfo **res)
{
    int ret = -1;
    
    errno = 0;
   
    RPC_GETADDRINFO_REQUEST request;
    RPC_GETADDRINFO_RESPONSE response;
        
    request.node = node;
    request.service = service;
    // hints may be NULL; fall back to zeroed defaults in that case
    request.ai_flags = hints ? hints->ai_flags : 0;
    request.ai_family = hints ? BSD2WSA::domain2WSAdomain(hints->ai_family) : 0;
    request.ai_socktype = hints ? hints->ai_socktype : 0;
    request.ai_protocol = hints ? hints->ai_protocol : 0;
    request.res = res;
       
    if (response = static_cast<RPC_GETADDRINFO_RESPONSE>(sendSyncRequest(request)))
    {
        ret = response.ret;
        
        if (SOCKET_ERROR == ret)
            errno = BSD2WSA::WSA2errno(response.errno);
        else
        {
            struct addrinfo *q = 0, *prev = 0;
            
            *res = 0;
            
            for (PADDRINFOA p = response.res; p; p = p->ai_next)
            {               
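                q = (struct addrinfo *)::malloc(sizeof(struct addrinfo));
                
                // Copy field by field: the two structures have different layouts
                q->ai_flags = p->ai_flags;
                q->ai_family = BSD2WSA::WSAdomain2domain(p->ai_family);
                q->ai_socktype = p->ai_socktype;
                q->ai_protocol = p->ai_protocol;
                q->ai_addrlen = (socklen_t)p->ai_addrlen;
                
                if (p->ai_addr && p->ai_addrlen > 0)
                {
                    // Allocate the full address length so that addresses
                    // larger than struct sockaddr (e.g. IPv6) are not truncated
                    q->ai_addr = (struct sockaddr *)::malloc(p->ai_addrlen);
                    ::memcpy(q->ai_addr, p->ai_addr, p->ai_addrlen);
                }
                else
                {
                    q->ai_addr = 0;
                    q->ai_addrlen = 0;
                }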
                
                if (p->ai_canonname)
                {
                    size_t len = ::strlen(p->ai_canonname);
                    
                    len = len > 0x100 ? 0x100 : len;  	//if there was an error 
						//during transferring
                    
                    q->ai_canonname = (char *)::malloc(len + 1);
                    ::memcpy(q->ai_canonname, p->ai_canonname, len);
                    q->ai_canonname[len] = 0;
                }
                else
                    q->ai_canonname = 0;
                
                q->ai_next = 0;
                
                if (!*res)  //only for the first time
                    *res = q;
                
                if (prev)
                    prev->ai_next = q;
                
                prev = q;
            }
        }
    }
    else
        errno = EAGAIN;
        
    return ret;
}

ssize_t recv(int sockfd, void *buffer, size_t length, int flags)
{
    ssize_t ret = -1;
    
    errno = 0;
   
    RPC_RECV_REQUEST request;
    RPC_RECV_RESPONSE response;
    
    request.s = g_sockets.convert(sockfd);
    request.len = length;
    request.flags = BSD2WSA::msgFlags2WSAmsgFlags(flags);
    request.buf = buffer;
        
    if (response = static_cast<RPC_RECV_RESPONSE>(sendSyncRequest(request)))
    {
        ret = response.ret;
        
        if (SOCKET_ERROR == ret)
            errno = BSD2WSA::WSA2errno(response.errno);
        else
        {
            if (buffer && ret > 0)  // copy only the bytes actually received
                ::memcpy(buffer, response.buf, ret);
        }
    }
    else
        errno = EAGAIN;
    
    return ret;
}

int close(int fd)
{
    int ret = -1;
    
    errno = 0;

    if (g_sockets.hasSocket(fd))
    {
        RPC_CLOSESOCKET_REQUEST request;
        RPC_CLOSESOCKET_RESPONSE response;

        request.s = g_sockets.convert(fd);

        if (response = static_cast<RPC_CLOSESOCKET_RESPONSE>(sendSyncRequest(request)))
        {
            ret = response.ret;
            
            if (SOCKET_ERROR == ret)
                errno = BSD2WSA::WSA2errno(response.errno);
            else
                g_sockets.removeSocket(fd);  // free the pseudodescriptor for reuse
        }
        else
            errno = EAGAIN;  
    }
    else
    {
        // Not one of our sockets: fall through to the real close()
        ret = ::close(fd);
    }
        
    return ret;
}

The complete example of the code with all definitions is attached to the article.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


About the Authors

Apriorit Inc.
Ukraine
ApriorIT is a software research and development company that works in advanced knowledge-intensive fields.

The company offers integrated research and development services for software projects in areas such as Corporate Security, Remote Control, Mobile Development, Embedded Systems, Virtualization, Drivers and others.
 
Official site http://www.apriorit.com

Anthony Shoumikhin
Software Developer, Microsoft
United States

Last Updated 30 Dec 2010
Article Copyright 2010 by Apriorit Inc, Anthony Shoumikhin