A data network facilitates data transfers between two or more data processing systems. For example, an application executing in one data processing system acts as the sender of the data, and another application executing in another data processing system acts as the receiver of the data. Between the sender system (also referred to herein as “host” or “sender node”) and the receiver system (also referred to herein as “receiver node”), the data follows a data path that comprises one or more links between networking components, such as routers and switches.
Within a data processing system, such as the sender system, the sender application typically hands off the data to some functionality in the system that manages the data flows into and out of the system.
For example, in a Transmission Control Protocol (TCP) implementation, a sender application hands off the data to a TCP stack. Within the layers of the protocol stack, the data is broken down into segments of data. The segments are eventually broken down further into data packets, and the data packets leave the sender system via a physical Ethernet adapter configured in the system.
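The two-stage breakdown described above can be illustrated with a minimal sketch. The sizes `SEGMENT_SIZE` and `PACKET_SIZE` and the helper `split` are hypothetical names chosen only for illustration; the sketch models payload bytes only and does not model headers, checksums, or any actual TCP stack behavior.

```python
def split(data: bytes, size: int) -> list[bytes]:
    """Split data into consecutive chunks of at most `size` bytes."""
    if size <= 0:
        raise ValueError("size must be positive")
    return [data[i:i + size] for i in range(0, len(data), size)]


# Hypothetical sizes, far smaller than real values, to keep the example readable.
SEGMENT_SIZE = 8
PACKET_SIZE = 4

data = b"hello, data network!"

# Stage 1: the protocol stack breaks the application data into segments.
segments = split(data, SEGMENT_SIZE)

# Stage 2: each segment is broken down further into packets.
packets = [pkt for seg in segments for pkt in split(seg, PACKET_SIZE)]
```

Concatenating `packets` in order yields the original `data`, which is the property the receiver relies on when it reconstructs the data.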
A physical Ethernet adapter is a type of physical network adapter device, also known as a network interface card (NIC). A NIC comprises hardware components such as a memory device, firmware components such as code persistently loaded in such a memory device, and software components such as a device driver and the software called or used by the device driver, which execute to facilitate the operations of the adapter within a given operating system. Hereinafter, a reference to an “adapter”, a “card”, or a “NIC” refers to the combination of the hardware, firmware, and software of a network interface card configured for data communications using packetized data according to any suitable data communication protocol, including but not limited to TCP.
Once the data packets leave the sender system, the data packets travel on one or more data communication links to one or more networking components, and eventually reach the receiver system. Particularly, the data packets are received by a NIC in the receiver system. The NIC reassembles the data packets into segments and pushes the segments up to a protocol stack. Eventually, a receiver application in the receiver system receives the data that has been reconstructed from the received segments.
In a data processing environment, such as in a datacenter, many data processing systems are connected via a data network. At any given time, several systems may be transmitting data of various sizes to several other systems. Many of these data transmissions can use a common link in the network to get their packets from their respective sender systems to their respective receiver systems.
A data communication link in a network can become congested when more than a threshold amount of data traffic tries to use the link during a given period. The data traffic of some data flows (hereinafter, “flow”, or “flows”) appears in bursts, causing the data traffic on a link to spike. A link can also be over-subscribed, i.e., too many flows may try to use the link at a given time. Flow collisions, packet loss, network latency, and timeouts are some examples of problems that are caused when the utilization of a link exceeds a threshold.
Some flows in a network are small flows and some are large flows. A flow that transmits less than a threshold amount of data in a given period is a small flow. A flow that transmits the threshold amount of data or more in a given period is a large flow.
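The small/large distinction above amounts to a threshold comparison, sketched below. The function name `classify_flow` and the example threshold are hypothetical; the text does not specify a threshold value or a measurement period.

```python
def classify_flow(bytes_in_period: int, threshold_bytes: int) -> str:
    """Classify a flow by the amount of data it transmits in a given period.

    A flow that transmits the threshold amount or more is "large";
    a flow that transmits less than the threshold is "small".
    """
    return "large" if bytes_in_period >= threshold_bytes else "small"


# Hypothetical threshold, chosen only for illustration.
THRESHOLD_BYTES = 1_000_000

small = classify_flow(250_000, THRESHOLD_BYTES)
boundary = classify_flow(1_000_000, THRESHOLD_BYTES)
large = classify_flow(40_000_000, THRESHOLD_BYTES)
```

Note that a flow transmitting exactly the threshold amount is classified as large, consistent with the definition above.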
The data packets of the various flows that are to use a link are queued. Small flow packets that are queued behind the packets of a large flow have to wait significantly longer to use the link than small flow packets that are queued behind the packets of another small flow. Typically, over a period of operation in a data network, small flows outnumber large flows, but the data transmitted by large flows exceeds the data transmitted by small flows. Thus, the use of communication links in a network by a mix of large and small flows often results in unacceptable performance of applications and operations related to the small flows, because of the large flows.
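The queuing effect described above can be made concrete with a rough sketch. It assumes a single first-in, first-out queue drained at a constant link rate, which is a simplification and not a model of any particular switch or adapter; the function name `wait_time`, the packet sizes, and the link rate are hypothetical.

```python
def wait_time(bytes_queued_ahead: list[int], link_bytes_per_sec: float) -> float:
    """Seconds a packet waits while everything queued ahead of it is sent.

    Assumes a FIFO queue and a constant link rate (illustrative only).
    """
    if link_bytes_per_sec <= 0:
        raise ValueError("link rate must be positive")
    return sum(bytes_queued_ahead) / link_bytes_per_sec


LINK_RATE = 1_000_000.0  # hypothetical link rate in bytes per second

# A small-flow packet queued behind 1000 packets of a large flow.
behind_large = wait_time([1500] * 1000, LINK_RATE)

# The same packet queued behind 2 packets of another small flow.
behind_small = wait_time([1500] * 2, LINK_RATE)
```

Under these assumed numbers, the wait behind the large flow is 500 times the wait behind the small flow, because the wait scales with the number of bytes queued ahead.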
Furthermore, the data packets of a particular flow can travel on different links. Different links can have different latencies and congestion levels. Accordingly, different packets of the same flow can start from a sender system in sequence, but arrive at the receiver system out of sequence. Some data packets can be lost in the data network for a variety of reasons. A lost data packet either never reaches the receiver system, or does not reach the receiver system in a timely manner.
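Out-of-sequence arrival and loss, as described above, can be illustrated with a minimal sketch in which each packet carries a sequence number. The function `reorder` and the use of small integer sequence numbers starting at zero are assumptions made for illustration; real protocols track sequence differently and also handle retransmission, which is not modeled here.

```python
def reorder(
    arrived: list[tuple[int, bytes]], expected_count: int
) -> tuple[list[bytes], list[int]]:
    """Restore sending order and report sequence numbers that never arrived.

    `arrived` holds (sequence_number, payload) pairs in arrival order.
    """
    by_seq = dict(arrived)
    in_order = [by_seq[s] for s in range(expected_count) if s in by_seq]
    missing = [s for s in range(expected_count) if s not in by_seq]
    return in_order, missing


# Packets 0..3 were sent in sequence; packet 1 was lost and the rest
# arrived out of sequence after traveling on different links.
arrived = [(2, b"c"), (0, b"a"), (3, b"d")]
payloads, missing = reorder(arrived, expected_count=4)
```

Here `payloads` holds the received payloads in sending order and `missing` lists the sequence numbers of packets that did not arrive.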