This article introduces an open source UDP-based data transfer library, namely UDT (UDP-based Data Transfer).
Currently, there are two major Internet transport protocols, TCP and UDP. TCP provides connection oriented reliable data streaming service, whereas UDP provides connection-less unreliable messaging service. Most applications use TCP to transfer data because they require data reliability.
However, when using TCP some applications suffer from two problems. First, TCP demonstrates poor performance in long distance links, especially when the bandwidth is high (e.g., 1GB/s or higher). Second, when incorporating multiple concurrent TCP flows with different RTTs in the same application, different TCP flows may have different data transfer rates (also known as RTT bias).
There are other situations where people do not want to use either TCP or UDP. For example, an application may require data reliability (so cannot use UDP directly) but also wants data to arrive at a user-controlled rate or the message boundaries to be conserved. Furthermore, sometimes it is easier to use UDP to traverse firewalls than TCP.
It may be helpful to build an open source user space transport protocol above UDP, but with the data reliability control and network flow/congestion control. Because it is at user space, it is easy to get installed. Because it is open source, it can be easily modified to meet various requirements from applications.
Here we present two example applications in which you may need UDT.
a. Bulk data transfer and online streaming data processing
Suppose a company has their data stored in multiple branches around the world, each branch having its own part of the dataset. The datasets are very large (terabytes) and the company has a 1GB/s intranet to deliver them. Now one of its departments at Chicago branch wants to analyze its own dataset and the dataset of the London branch. This data analysis application will need to read two datasets (at London and Chicago, respectively) using the company's 1GB/s intranet to a computer at Chicago.
Recall the two problems of TCP mentioned above. First, the link must be extremely clean (very little packet loss) for TCP to fully utilize the 1GB/s bandwidth between London and Chicago. Second, when the two TCP streams (London->Chicago, Chicago->Chicago) start at the same time, the London-Chicago stream will be starved due to the RTT bias problem, thus the data analysis process will have to wait for the slower data stream.
b. Application awareness
A streaming video server is sending data to many clients. The server may choose a specific sending rate for each client during any specific period. You may have trouble asking TCP to send data at a fixed rate. And using UDP you have to do most of the data reliability control work by yourself. In addition, video frames are time sensitive data, if the frame is delayed too long, it will not be needed any more. That is, complete data reliability is not desired in this situation.
Using the UDT library, the programming work is simply several lines of option setting code.
UDT is a transport protocol with its own reliability control and flow/congestion control built above UDP. UDT provides both reliable data streaming service quite similar to TCP and partial reliable messaging. The latter allows users to send data as messages with specified delivery order and time-to-live value.
The position of UDT in the layered architecture of internet reference model is shown in the figure below. Applications use UDT socket to transfer their data, which is passed to the UDP socket. (Effort was made to avoid one time memory copy here.) A congestion control (CC) algorithm can be provided by applications; otherwise the default control algorithm is used. If user-defined control algorithm is provided, UDT will call the callback functions in CC once a control event occurs:
The default UDT control algorithm is designed for high performance data transfer over wide area high speed networks (e.g. in the case of example a). However, UDT's congestion control algorithm is configurable such that you can add your own control algorithm with ease (e.g., in the case of example b).
UDT provides a set of socket-like API. Using UDT in your application is simple as long as you are familiar with BSD socket. For example: With BSD socket, you write:
int s = socket(AF_INET, SOCK_STREAM, 0);
Its counterpart in UDT is:
UDTSOCKET u = UDT::socket(AF_INET, SOCK_STREAM, 0);
UDT is currently implemented using C++ and it supports Linux, Windows (2000, XP and above), and OS X. The current version is 3.0 Beta. After you download UDT, please read the "readme" file or the documentation for detailed installation and usage information. Note that if you are using Windows, you will need VC7 to get it compiled. With VC6, you will have to fix several incompatible C++ grammar by yourself.
If you are interested in the project, please refer to this site for more information.