1. Field of the Invention
This invention relates to data switching networks. More particularly, this invention relates to congestion avoidance and recovery in data switching networks.
2. Description of the Related Art
The meanings of certain acronyms and abbreviations used herein are given in Table 1.
TABLE 1
Acronyms and Abbreviations

CEE      Converged Enhanced Ethernet
CLCM     Credit Loop Control Message
CPU      Central Processing Unit
ECMP     Equal Cost Multi-Path Routing
FCoE     Fiber Channel Over Ethernet
HLL      Head-of-queue Lifetime Limit
MTU      Maximum Transmission Unit
RDMA     Remote Direct Memory Access
SDK      Software Development Kit
SDN      Software Defined Networks
SPUID    Switch Port-Unique Identifier
SRAM     Static Random Access Memory
Lossless networks are a recent trend in data center networking. Lossless networks provide multiple performance advantages to data center applications. Recently emerging Converged Enhanced Ethernet (CEE) data center networks rely on layer-2 flow control in order to support packet loss-sensitive transport protocols, such as RDMA and FCoE. Even with reliable transport protocols, such as TCP, dropping packets reduces bandwidth, significantly increases completion time, and lengthens the statistical tail of the completion time distribution, especially for short-lived flows.
Although lossless networks can improve end-to-end network performance, without careful design and operation, they can suffer from in-network deadlocks, caused by cyclic buffer dependencies. These dependencies are called credit loops and occur in schemes such as InfiniBand, which use a credit-based flow control scheme in hardware. In such schemes a “credit” is transferred by an egress node to an ingress node in a network or fabric. The credit confers a right upon the ingress node to consume a specified portion of the memory of the egress node. Thus, an egress port can only send packets if it has a credit at the destination ingress port.
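The credit mechanism described above can be illustrated with a minimal sketch. The class and method names (IngressBuffer, EgressPort, try_send, return_credit) are hypothetical and chosen only for illustration; the sketch assumes one credit corresponds to one free buffer slot, which is a simplification and not the accounting used by InfiniBand hardware.

```python
from collections import deque


class IngressBuffer:
    """Receiving-side buffer; each free slot backs one credit (assumption)."""

    def __init__(self, slots):
        self.free_slots = slots
        self.queue = deque()

    def accept(self, packet):
        # A packet may only arrive if a slot was reserved by a credit.
        assert self.free_slots > 0, "sender transmitted without a credit"
        self.free_slots -= 1
        self.queue.append(packet)

    def drain_one(self):
        # Forwarding a packet onward frees one slot.
        packet = self.queue.popleft()
        self.free_slots += 1
        return packet


class EgressPort:
    """Sending-side port that spends one credit per packet sent."""

    def __init__(self, peer):
        self.peer = peer
        self.credits = peer.free_slots  # credits advertised at start-up

    def try_send(self, packet):
        if self.credits == 0:
            return False  # blocked: no credit at the destination ingress port
        self.credits -= 1
        self.peer.accept(packet)
        return True

    def return_credit(self, count=1):
        # Called when the peer reports that slots have been freed.
        self.credits += count
```

In this model no packet is ever dropped: a sender without credits simply waits. That waiting is what makes a cyclic dependency among buffers dangerous.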
Given a topology and routing rules, a credit loop is defined as a cyclic sequence of switch buffers, such that every switch in the sequence sends traffic to the next switch in the sequence. This sequence is dependent on the existence of at least one routing rule, wherein a switch forwards packets arriving via a link in the sequence to the next link in the sequence. A credit loop deadlock occurs when all the switch buffers on the sequence become full and no traffic can be propagated.
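Under this definition, the existence of a credit loop can be checked by searching for a cycle in a dependency graph derived from the routing rules. The following is a minimal sketch, assuming the rules have already been reduced to a mapping from each link to the links onto which packets arriving via that link may be forwarded. The function name find_credit_loop and the input format are hypothetical.

```python
def find_credit_loop(dependencies):
    """Return one cyclic sequence of links, or None if no loop exists.

    dependencies maps a link to the links that packets arriving via it
    may be forwarded onto (hypothetical input format).
    """
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {}
    stack = []  # links on the current depth-first search path

    def visit(link):
        state[link] = ACTIVE
        stack.append(link)
        for nxt in dependencies.get(link, ()):
            nxt_state = state.get(nxt, UNSEEN)
            if nxt_state == ACTIVE:
                # nxt is already on the current path: a cycle is closed.
                return stack[stack.index(nxt):]
            if nxt_state == UNSEEN:
                found = visit(nxt)
                if found:
                    return found
        stack.pop()
        state[link] = DONE
        return None

    for link in list(dependencies):
        if state.get(link, UNSEEN) == UNSEEN:
            loop = visit(link)
            if loop:
                return loop
    return None


# Three switches in a ring, each forwarding to the next: a credit loop.
ring = {
    ("A", "B"): [("B", "C")],
    ("B", "C"): [("C", "A")],
    ("C", "A"): [("A", "B")],
}
print(find_credit_loop(ring))
```

A loop found this way indicates only that a deadlock is possible. A deadlock actually occurs only when every buffer in the sequence becomes full at the same time.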
Credit loops are a silent killer: they may exist for a long time, yet deadlock only under specific traffic patterns. Although existing credit loops rarely deadlock, when they do they can block large parts of the network. Naïve solutions recover from credit loop deadlock by draining buffers and dropping packets. Previous works have suggested credit loop avoidance by central routing algorithms, but these assume specific topologies and are slow to react to failures.
Several methods have been proposed to deal with credit loop deadlocks. Some of these methods consider the loops when defining the forwarding rules. The most common of these is the Up*/Down* algorithm, which works well with any treelike topology. According to this algorithm, credit loops are prevented if forwarding from a downstream link to an upstream link is prohibited.
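The Up*/Down* restriction can be sketched as a check on a candidate path. The sketch assumes each switch has been assigned a level equal to its distance from a chosen root, with ties between equal levels broken by node name. The helper names is_up_link and path_allowed are hypothetical.

```python
def is_up_link(src, dst, level):
    """A link is 'up' if it leads toward the root.

    Ties between equal levels are broken by node name (assumption).
    """
    return (level[dst], dst) < (level[src], src)


def path_allowed(path, level):
    """Reject any path that takes an up link after a down link."""
    gone_down = False
    for src, dst in zip(path, path[1:]):
        if is_up_link(src, dst, level):
            if gone_down:
                return False  # down-to-up turn: prohibited
        else:
            gone_down = True
    return True


# Levels in a small two-tier tree (illustrative values only).
level = {"spine": 0, "leaf1": 1, "leaf2": 1, "host1": 2, "host2": 2}

print(path_allowed(["host1", "leaf1", "spine", "leaf2", "host2"], level))
print(path_allowed(["spine", "leaf1", "spine", "leaf2"], level))
```

The first path climbs to the spine and then descends, so it is permitted. The second turns back up after descending, so it is rejected. Since every cycle must contain at least one such down-to-up turn, prohibiting the turn removes all cyclic buffer dependencies.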
Another set of solutions relies on multiple sets of buffers, such as Ethernet traffic classes or priorities, and breaks the loops by changing the buffer pools used by the packets as they overlap with the start of the cycle. However, these methods are all based on a central controller that calculates forwarding rules that avoid credit loops; they are therefore slow to respond to failures and work only in SDN-like environments.
As a measure of last resort for dealing with credit loop deadlock, the InfiniBand specification defines a Head-of-queue Lifetime Limit (HLL) mechanism. This mechanism allows switches to drop packets that have been waiting at the head of the switch queue for a significant amount of time.
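The behavior of such a lifetime limit can be sketched as follows. This is an illustrative model only, not the mechanism as encoded in the InfiniBand specification: the class name HllQueue, its methods, and the use of abstract time units are all assumptions.

```python
from collections import deque


class HllQueue:
    """Queue that drops its head packet once it has waited too long there."""

    def __init__(self, limit):
        self.limit = limit        # maximum time allowed at the head
        self.packets = deque()
        self.head_since = None    # time the current head reached the head
        self.dropped = []

    def enqueue(self, packet, now):
        if not self.packets:
            self.head_since = now  # packet becomes the head immediately
        self.packets.append(packet)

    def dequeue(self, now):
        packet = self.packets.popleft()
        # The next packet, if any, starts its head-of-queue lifetime now.
        self.head_since = now if self.packets else None
        return packet

    def expire_head(self, now):
        """Drop the head if its head-of-queue lifetime exceeds the limit."""
        if self.packets and now - self.head_since > self.limit:
            self.dropped.append(self.dequeue(now))
            return True
        return False


queue = HllQueue(limit=10)
queue.enqueue("p1", now=0)
queue.enqueue("p2", now=1)
queue.expire_head(now=11)   # p1 has waited 11 units at the head
print(queue.dropped)
```

In this sketch the timer measures time spent at the head rather than total time in the queue, so a packet stuck behind a blocked head is not penalized until it reaches the head itself. Dropping the head frees buffer space and can break a deadlock, at the cost of losing packets in a network that is otherwise lossless.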