Ethernet is a family of computer network standards that are widely used in wired local area networks (LANs). These standards have been codified by the IEEE 802.3 working group and define a wide range of link-layer protocol features and medium access control (MAC) functions. The Ethernet link-layer protocol may run over a variety of underlying physical-layer (PHY) types and protocols.
In packet-switched networks, such as Ethernet, switches have buffers that facilitate lossless operation. When the rate of incoming packets from a source is higher than the switch can accommodate, however, data can accumulate in the buffer, and packets may be dropped when the buffer size is exceeded. To ameliorate this problem, Ethernet switches send link-level flow-control messages when the buffer fill level of a particular queue or ingress port and priority exceeds a specified threshold, called the XOFF threshold. The flow-control message is sent to the source of the packets to instruct the source to stop transmitting packets.
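By way of illustration only, the XOFF threshold check described above can be sketched as follows. The class name `IngressQueue`, its fields, and the byte values are hypothetical and are not taken from any particular switch implementation; the sketch assumes that fill levels are tracked in bytes per ingress queue.

```python
from dataclasses import dataclass


@dataclass
class IngressQueue:
    """Hypothetical per-port, per-priority ingress queue state."""

    xoff_threshold: int   # bytes; fill level above which XOFF is sent
    fill_level: int = 0   # bytes currently buffered
    paused: bool = False  # True once XOFF has been sent to the source

    def enqueue(self, packet_len: int) -> bool:
        """Buffer a packet; return True if an XOFF message should be sent now."""
        self.fill_level += packet_len
        if not self.paused and self.fill_level > self.xoff_threshold:
            # The fill level has just crossed the threshold: signal the source once.
            self.paused = True
            return True
        return False
```

In this sketch, a queue with a 3000-byte threshold returns True on the third 1500-byte packet, at which point the switch would send the flow-control message. Packets arriving after that point are still buffered, which is the situation the headroom discussed below is meant to cover.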
For this purpose, Annex 31B of the IEEE 802.3 specification defines an optional flow control operation using “PAUSE” frames. When the receiver on a given link transmits a PAUSE frame to the transmitter, it causes the transmitter to temporarily stop all transmission on the link (except certain control frames) for a period of time that is specified in the PAUSE frame. This pause command mechanism enables the receiver to recover from states of buffer overfill.
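For illustration, a PAUSE frame of the kind defined in Annex 31B might be assembled as in the following sketch. The function names are hypothetical. The field values (reserved multicast destination address 01-80-C2-00-00-01, MAC Control EtherType 0x8808, PAUSE opcode 0x0001, and a 16-bit pause time expressed in quanta of 512 bit times) follow the standard, and the frame check sequence is assumed to be appended by the MAC hardware.

```python
import struct

PAUSE_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
MIN_FRAME_NO_FCS = 60  # minimum Ethernet frame length, excluding the 4-byte FCS


def build_pause_frame(src_mac: bytes, pause_quanta: int) -> bytes:
    """Return a PAUSE frame (without FCS) requesting a pause of `pause_quanta`."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= pause_quanta <= 0xFFFF:
        raise ValueError("pause_quanta must fit in 16 bits")
    header = PAUSE_DST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH", PAUSE_OPCODE, pause_quanta)
    # Pad with zeros up to the minimum frame length.
    return (header + payload).ljust(MIN_FRAME_NO_FCS, b"\x00")


def pause_duration_seconds(pause_quanta: int, link_bps: float) -> float:
    """Convert a pause time in quanta (512 bit times each) to seconds."""
    return pause_quanta * 512 / link_bps
```

Under these assumptions, the maximum pause time of 0xFFFF quanta corresponds to roughly 3.36 milliseconds on a 10 Gbps link.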
Recently, a number of new IEEE standards for data center bridging (DCB) have been proposed, offering enhanced Ethernet flow control capabilities. For example, the IEEE 802.1Qbb project authorization request (PAR) provides priority-based flow control (PFC) as an enhancement to the pause mechanism described above. PFC creates eight separate virtual links on a given physical link and allows the receiver to issue commands that pause and restart the virtual links independently. PFC thus enables the operator to implement differentiated quality of service (QoS) policies for the eight virtual links, meaning that each virtual link may correspond to a different class of service.
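The per-priority pause and restart commands can be illustrated by the following sketch of a PFC frame builder. The function name and the dictionary-based interface are hypothetical. The layout (opcode 0x0101, a priority-enable vector, and eight 16-bit time fields, one per priority) follows the PFC frame format, in which a time value of zero for an enabled priority restarts that virtual link.

```python
import struct
from typing import Mapping

PFC_DST_MAC = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PFC_OPCODE = 0x0101
MIN_FRAME_NO_FCS = 60  # minimum Ethernet frame length, excluding the 4-byte FCS


def build_pfc_frame(src_mac: bytes, pause_times: Mapping[int, int]) -> bytes:
    """Return a PFC frame (without FCS).

    `pause_times` maps a priority (0..7) to a pause time in quanta;
    a value of 0 restarts (un-pauses) that priority.
    """
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    enable_vector = 0
    times = [0] * 8
    for priority, quanta in pause_times.items():
        if not 0 <= priority <= 7:
            raise ValueError("priority must be in 0..7")
        if not 0 <= quanta <= 0xFFFF:
            raise ValueError("pause time must fit in 16 bits")
        enable_vector |= 1 << priority  # mark this priority's time field as valid
        times[priority] = quanta
    header = PFC_DST_MAC + src_mac + struct.pack("!H", MAC_CONTROL_ETHERTYPE)
    payload = struct.pack("!HH8H", PFC_OPCODE, enable_vector, *times)
    return (header + payload).ljust(MIN_FRAME_NO_FCS, b"\x00")
```

For example, `build_pfc_frame(src, {3: 0xFFFF, 5: 0})` would pause priority 3 for the maximum time and restart priority 5, while leaving the other six virtual links unaffected.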
Due to delays in receiving and acting on flow-control messages at the transmitter, the receiving switch (also referred to in this context simply as the “receiver”) will continue receiving frames from the source for a certain amount of time even after transmitting the XOFF (PAUSE) message. In view of this delay, a portion of the switch buffer is normally reserved to admit the packets that may arrive after the flow-control message is sent. This reserved buffer is referred to as the lossless headroom or, simply, headroom. Typically, each virtual link receives its own headroom allocation, although in some schemes, headroom allocations are shared among different virtual links and even different ports.
A receiver using PFC must predict when the headroom buffer of any given virtual link is nearing exhaustion and send a PAUSE frame to the transmitter when this condition arises. There will always be a certain lag, however, between when the PAUSE frame is generated and when the transmitter actually stops transmission. For lossless service, the receiver must have sufficient residual buffer space available to store any packets that the transmitter has transmitted or is in the process of transmitting between the time the receiver decides to send the PAUSE frame and the time the transmitter actually receives the PAUSE frame and ceases transmission.
The buffer requirements for this sort of case are analyzed in a white paper published by Cisco Systems Inc. (San Jose, Calif.), entitled “Priority Flow Control: Build Reliable Layer 2 Infrastructure” (September, 2015). The paper explains that the receiver must send a PAUSE frame for a given class of service (CoS) when the receiver buffer space drops below a certain threshold, which is determined by the sum of the maximum transmission unit (MTU) sizes of both the transmitter and the receiver, the response time of the transmitter, and the speed (or equivalently, bandwidth) and length of the wire between the transmitter and receiver. The PFC standard limits the response time contribution to the threshold to 3840 bytes, while at 10 Gbps, the round-trip transmission time of the wire contributes another 1300 bytes to the threshold per 100 meters of cable. The threshold may also be affected by the sizes of the buffer units (referred to as “cells”) that are allocated to hold incoming packets, such as 80 or 160 bytes, in comparison with the sizes of the packets that may be received and stored.
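The threshold components listed above can be combined as in the following rough sketch. The function names are hypothetical, and the linear scaling of the wire contribution with link speed and cable length is an assumption made here for illustration; only the 3840-byte and 1300-byte figures come from the text, and the response-time contribution is assumed to stay fixed in bytes.

```python
import math

PFC_RESPONSE_BYTES = 3840          # response-time contribution cited in the text
WIRE_BYTES_PER_100M_AT_10G = 1300  # round-trip wire contribution at 10 Gbps


def headroom_bytes(mtu_tx: int, mtu_rx: int, cable_m: float,
                   link_gbps: float = 10.0) -> int:
    """Estimate the threshold as the sum of the components named in the text."""
    # Assumption: bytes in flight scale linearly with cable length and link speed.
    wire = WIRE_BYTES_PER_100M_AT_10G * (cable_m / 100.0) * (link_gbps / 10.0)
    return math.ceil(mtu_tx + mtu_rx + PFC_RESPONSE_BYTES + wire)


def buffer_bytes_for_packet(packet_len: int, cell_size: int) -> int:
    """Buffer space consumed by one packet when storage is allocated in whole cells."""
    cells = -(-packet_len // cell_size)  # integer ceiling division
    return cells * cell_size
```

With 1500-byte MTUs on both sides and 100 meters of cable at 10 Gbps, this estimate gives 1500 + 1500 + 3840 + 1300 = 8140 bytes. The second function illustrates the cell effect: with 80-byte cells, an 81-byte packet occupies two cells, or 160 bytes, so the space actually consumed can be nearly twice the packet length in the worst case.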