A data communication network, or simply, data network, facilitates data transfers between two or more data processing systems. For example, an application executing in one data processing system acts as the sender of the data, and another application executing in another data processing system acts as the receiver of the data. Between the sender system and the receiver system, the data follows a data path that comprises one or more links between networking components, such as routers and switches.
In a data processing environment, such as in a datacenter, many data processing systems are connected via a data network. At any given time, several systems may be transmitting data of various sizes to several other systems. Many of these data transmissions can utilize a common link in the network, to get from their respective sender systems to their respective receiver systems.
A data communication link in a network can become congested when more than a threshold amount of data traffic tries to use the link during a given period. The data traffic of some data flows (hereinafter, “flow”, or “flows”) appears in bursts, causing the data traffic on a link to spike. A link can also be over-subscribed, i.e., too many flows may try to use the link at a given time. Packet loss, increased network latency, and timeouts are some examples of problems that are caused when the utilization of a link exceeds a threshold and congestion occurs.
Some flows in a network are small flows and some are large flows. A flow that transmits less than a threshold amount of data in a given period is a small flow. A flow that transmits the threshold amount of data or more in a given period is a large flow. The data of a flow comprises packets of data. Generally, the larger the flow, the more the number of the packets therein. The packets of the various flows wanting to use a link are queued.
In many datacenters, a sending system, a receiving system, or both can be virtual machines. A virtual machine (VM) comprises virtualized representations of real hardware, software, and firmware components available in a host data processing system. The data processing system can have any number of VMs configured thereon, and utilizing any number of virtualized components therein.
For example, the host may include a processor component. One virtual representation of the processor can be assigned to one VM, and another virtual representation of the same processor can be assigned to another VM, both VMs executing on the host. Furthermore, the second VM may also have access to a virtual representation of a reserve processor in the host and certain other resources, either exclusively or in a shared manner with the first VM.
Certain data processing systems are configured to process several workloads simultaneously. For example, separate virtual data processing systems, such as separate VMs, configured on a single host data processing system often process separate workloads for different clients or applications.
In large scale data processing environments, such as in a datacenter, thousands of VMs can be operating on a host at any given time, and hundreds if not thousands of such hosts may be operational in the datacenter at the time. A virtualized data processing environment such as the described datacenter is often referred to as a “cloud” that provides computing resources and computing services to several clients on an as-needed basis.
Congestion control is a process of limiting or reducing data congestion in a section of a network, such as at a networking device or in a link. Presently, congestion control is a function of the Transmission Control Protocol/Internet Protocol (TCP/IP) stack. The TCP/IP stack is implemented by an operating system, and different operating systems implement congestion control differently. For example, one operating system might use one algorithm for performing congestion control whereas a different operating system might implement a different algorithm for the same purpose. Even a single operating system can implement different congestion control algorithms, and the ones that are implemented can be configurable to exhibit different behaviors.