Unless otherwise indicated herein, the materials described in this section are not prior art to the claims in this application and are not admitted to be prior art by inclusion in this section.
Datacenters are typically composed of storage devices, servers, and switches that interconnect the servers within the datacenter. Data may be spread or striped across many servers for performance or reliability reasons. When data is accessed from the servers, it must pass through datacenter Ethernet switches. These switches typically have small buffers, in the range of 32-256 KB, which may overflow under congestion. The Transmission Control Protocol (TCP) throughput collapse phenomenon referred to as TCP incast is attributed to multiple senders overwhelming a switch buffer, resulting in TCP timeouts due to packet drops at the congested switch. TCP incast may also occur in distributed cluster storage and web-search workloads. Increasing the buffer size can delay the onset of incast. However, any particular switch configuration may have a maximum number of servers that can be involved in simultaneous transmissions before throughput collapse occurs.
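The relationship between buffer size and the number of simultaneous senders described above can be illustrated with a minimal sketch. It assumes a deliberately simplified worst-case model that is not taken from this disclosure: all senders burst at the same instant, the switch port drains nothing during the burst, and each sender emits a fixed burst. The function names and the 3000-byte burst (two 1500-byte frames) are hypothetical values chosen for illustration only.

```python
def max_senders_before_overflow(buffer_bytes: int, burst_bytes_per_sender: int) -> int:
    """Largest number of synchronized senders whose bursts fit in the buffer.

    Simplified model (an assumption): simultaneous bursts, no draining.
    """
    if buffer_bytes < 0 or burst_bytes_per_sender <= 0:
        raise ValueError("buffer must be non-negative and burst must be positive")
    return buffer_bytes // burst_bytes_per_sender


def dropped_bytes(num_senders: int, buffer_bytes: int, burst_bytes_per_sender: int) -> int:
    """Bytes that do not fit in the buffer and are dropped under the same model."""
    offered = num_senders * burst_bytes_per_sender
    return max(0, offered - buffer_bytes)


if __name__ == "__main__":
    KB = 1024
    burst = 2 * 1500  # hypothetical burst: two 1500-byte frames per sender
    for buffer_kb in (32, 256):  # buffer range mentioned in the text
        limit = max_senders_before_overflow(buffer_kb * KB, burst)
        print(f"{buffer_kb} KB buffer: up to {limit} senders before drops")
```

Under these assumptions, a larger buffer raises the sender count at which drops begin but does not remove the limit, which is consistent with the observation that increasing buffer size only delays the onset of incast.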
The root cause of TCP incast is packet drops at the congested switch, which result in TCP timeouts. Congestion control algorithms that have been developed to reduce or eliminate packet drops at the congested switch include, but are not limited to, Backward Congestion Notification (BCN), Forward Explicit Congestion Notification (FECN), enhanced FECN (E-FECN), and Quantized Congestion Notification (QCN). Among these, BCN achieves proportional fairness but not max-min fairness. FECN and E-FECN can achieve perfect fairness, but their control message overhead is high. The QCN algorithm aims to provide congestion control at the Ethernet layer, or Layer 2 (L2), in standardized datacenter networks. QCN can effectively and rapidly control link rates in a datacenter environment.
The present disclosure appreciates that TCP incast potentially arises in many datacenter applications. In TCP incast, TCP throughput drops drastically when multiple sending servers communicate with a single receiver separated by one or more switches or routers in a high-bandwidth, low-latency network. Algorithms intended to enhance datacenter network performance, such as QCN, nevertheless perform poorly when TCP incast occurs, because of rate unfairness among different flows.