Switches used in high-speed packet networks, such as Ethernet and InfiniBand networks, typically contain buffer memories. Packets received by the switch through one of its interfaces are stored temporarily in a buffer memory while awaiting transfer to the appropriate egress interface or possibly, in the case of multicast packets, to multiple egress interfaces. Although buffer memory may be allocated statically to each interface, many modern packet switches use a shared memory, in which buffer space is allocated dynamically to different interfaces and queues depending on traffic load and memory availability.
In packet-switched networks, such as Ethernet, switches have buffers that facilitate lossless operation. When the rate of incoming packets from a source is higher than the switch can accommodate, however, data can accumulate in the buffer, and packets may be dropped due to buffer overflow. To ameliorate this problem, Ethernet switches send link-level flow-control messages when the buffer fill level of a particular queue, or of a particular ingress port and priority, exceeds a specified threshold, called the XOFF threshold. The flow-control message is sent to the source of the packets to instruct the source to stop transmitting packets.
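The XOFF behavior described above can be illustrated with a minimal sketch. The class name `IngressQueue`, its fields, and the byte-based accounting are assumptions made for illustration only; an actual switch implements this logic in hardware and may track fill levels in other units.

```python
from dataclasses import dataclass, field


@dataclass
class IngressQueue:
    """Hypothetical model of one ingress queue with an XOFF threshold."""
    capacity: int          # bytes available to this queue (assumed unit)
    xoff_threshold: int    # fill level above which XOFF is sent
    fill: int = 0          # bytes currently buffered
    paused: bool = False   # True once XOFF has been sent to the source
    dropped: int = 0       # packets lost to buffer overflow
    sent: list = field(default_factory=list)  # flow-control messages emitted

    def receive(self, size: int) -> bool:
        """Admit a packet of `size` bytes; return False if it was dropped."""
        if self.fill + size > self.capacity:
            self.dropped += 1          # buffer overflow: the packet is lost
            return False
        self.fill += size
        if self.fill > self.xoff_threshold and not self.paused:
            self.paused = True
            self.sent.append("XOFF")   # tell the source to stop transmitting
        return True
```

In this sketch a packet that crosses the threshold is still admitted, and only a packet that would exceed the total capacity is dropped. The space between the threshold and the capacity is what absorbs traffic arriving after the XOFF message has been sent.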
For this purpose, Annex 31B of the IEEE 802.3 specification defines an optional flow control operation using “PAUSE” frames. When the receiver on a given link transmits a PAUSE frame to the transmitter, it causes the transmitter to temporarily stop all transmission on the link (except certain control frames) for a period of time that is specified in the PAUSE frame. This pause command mechanism enables the receiver to recover from states of buffer overfill.
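As a rough illustration, a PAUSE frame can be assembled as follows. The sketch assumes the commonly documented layout: the reserved multicast destination address 01-80-C2-00-00-01, the MAC Control EtherType 0x8808, opcode 0x0001, and a 16-bit pause time expressed in quanta of 512 bit times. The function names are hypothetical, and the frame check sequence is omitted because it is normally appended by the MAC hardware.

```python
import struct

PAUSE_DEST = bytes.fromhex("0180c2000001")  # reserved MAC Control multicast address
MAC_CONTROL_ETHERTYPE = 0x8808
PAUSE_OPCODE = 0x0001
QUANTUM_BITS = 512                          # one pause quantum = 512 bit times
MIN_FRAME_NO_FCS = 60                       # minimum Ethernet frame size without FCS


def build_pause_frame(src_mac: bytes, pause_quanta: int) -> bytes:
    """Return a PAUSE frame (without FCS) requesting `pause_quanta` quanta."""
    if len(src_mac) != 6:
        raise ValueError("src_mac must be 6 bytes")
    if not 0 <= pause_quanta <= 0xFFFF:
        raise ValueError("pause_quanta must fit in 16 bits")
    body = struct.pack("!HHH", MAC_CONTROL_ETHERTYPE, PAUSE_OPCODE, pause_quanta)
    # Pad with zeros up to the minimum frame size.
    return (PAUSE_DEST + src_mac + body).ljust(MIN_FRAME_NO_FCS, b"\x00")


def pause_duration_seconds(pause_quanta: int, link_bps: float) -> float:
    """Convert a pause time in quanta to seconds for a given link rate."""
    return pause_quanta * QUANTUM_BITS / link_bps
```

Under these assumptions, the maximum pause time of 0xFFFF quanta on a 10 Gb/s link corresponds to roughly 3.36 ms, so a receiver that needs a longer pause has to send further PAUSE frames.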
Recently, a number of new IEEE standards for data center bridging (DCB) have been proposed, offering enhanced Ethernet flow control capabilities. For example, the IEEE 802.1Qbb project authorization request (PAR) provides priority-based flow control (PFC) as an enhancement to the pause mechanism described above. PFC creates eight separate virtual links on a given physical link and allows the receiver to issue commands that pause and restart the virtual links independently. PFC thus enables the operator to implement differentiated quality of service (QoS) policies for the eight virtual links.
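The independence of the eight virtual links can be illustrated with a small transmitter-side model. The class `PfcTransmitter`, its method names, and the strict-priority selection order are assumptions chosen for brevity; the PFC mechanism itself does not prescribe a scheduling policy.

```python
NUM_PRIORITIES = 8  # PFC defines eight virtual links per physical link


class PfcTransmitter:
    """Hypothetical transmitter that honors per-priority pause commands."""

    def __init__(self) -> None:
        self.paused = [False] * NUM_PRIORITIES
        self.queues = [[] for _ in range(NUM_PRIORITIES)]

    def enqueue(self, priority: int, frame) -> None:
        self.queues[priority].append(frame)

    def on_pfc(self, priority: int, pause: bool) -> None:
        """Apply a pause (True) or restart (False) command for one priority."""
        self.paused[priority] = pause

    def next_frame(self):
        """Return (priority, frame) for the next sendable frame, or None.

        Strict priority, highest first, is an assumption of this sketch.
        """
        for priority in range(NUM_PRIORITIES - 1, -1, -1):
            if self.queues[priority] and not self.paused[priority]:
                return priority, self.queues[priority].pop(0)
        return None
```

For example, pausing priority 5 in this model leaves a frame queued on priority 3 free to be sent, whereas a PAUSE frame of the kind defined in Annex 31B would stop both.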
Due to delays in receiving and acting on flow-control messages at the transmitter, the receiving switch will continue receiving frames from the transmitter for a certain amount of time even after sending the XOFF (PAUSE) message. In view of this delay, the switch typically reserves an additional buffer to admit the packets that may arrive after the flow-control message is sent. This reserved buffer is referred to as the lossless headroom, or, simply, headroom.
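A back-of-the-envelope estimate of the required headroom can be sketched as follows. The model is a simplification adopted here for illustration, not a formula taken from any standard: it counts the data that can arrive during the round-trip and response delay, plus one maximum-size frame that the transmitter may already have begun sending and one maximum-size frame that may delay the flow-control message at the receiver. The function and parameter names are hypothetical.

```python
def headroom_bytes(link_bps: int, delay_ns: int, max_frame_bytes: int) -> int:
    """Estimate the lossless headroom, in bytes, for one port and priority.

    link_bps        -- link rate in bits per second
    delay_ns        -- round-trip propagation delay plus transmitter response
                       time, in nanoseconds
    max_frame_bytes -- largest frame that can be sent on the link
    """
    bits_in_flight = link_bps * delay_ns          # scaled by 1e9 (ns per second)
    # Integer ceiling division avoids floating-point rounding.
    bytes_in_flight = -(-bits_in_flight // (8 * 10**9))
    return bytes_in_flight + 2 * max_frame_bytes


# 10 Gb/s link, 1 microsecond total delay, 1500-byte frames.
print(headroom_bytes(10_000_000_000, 1_000, 1_500))  # prints 4250
```

Because the estimate grows with both link rate and delay, reserving it separately for every port and priority can consume a large part of the buffer.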
It is possible for multiple ports to share headroom space in a switch buffer. For example, U.S. Patent Application Publication 2013/0250757 describes mechanisms to reduce headroom size while minimizing dropped packets by using a shared headroom space between all ports, and providing a randomized delay in transmitting a flow-control message.
As another example, U.S. Patent Application Publication 2013/0250762 describes a method for achieving lossless behavior for multiple ports sharing a buffer pool. Packets are “colored” and stored in a shared packet buffer without assigning fixed page allocations per port. The packet buffer is divided into three areas: an unrestricted area, an enforced area, and a headroom area. A received packet is stored in the packet buffer regardless of the fullness level. If the fullness level is in the unrestricted area, no flow-control messages are generated. If the fullness level is in the enforced area, a probabilistic flow-control generation process is used to determine whether a flow-control message will be generated. If the fullness level is in the headroom area, a flow-control message is always generated.
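The three-area decision described above can be sketched as a single function. The cited publication is summarized here only at a high level, so the linear probability ramp across the enforced area, the function name, and the parameter names are all assumptions made for illustration.

```python
import random


def should_send_flow_control(fill: int,
                             unrestricted_limit: int,
                             enforced_limit: int,
                             rng=random.random) -> bool:
    """Decide whether to generate a flow-control message.

    fill               -- current buffer fullness level
    unrestricted_limit -- upper bound of the unrestricted area
    enforced_limit     -- upper bound of the enforced area; anything at or
                          above it lies in the headroom area
    rng                -- callable returning a float in [0, 1)
    """
    if enforced_limit <= unrestricted_limit:
        raise ValueError("enforced_limit must exceed unrestricted_limit")
    if fill < unrestricted_limit:
        return False                      # unrestricted area: never signal
    if fill >= enforced_limit:
        return True                       # headroom area: always signal
    # Enforced area: probability rises linearly from 0 to 1 (assumed ramp).
    probability = (fill - unrestricted_limit) / (enforced_limit - unrestricted_limit)
    return rng() < probability
```

Passing `rng` as a parameter keeps the sketch deterministic under test. For instance, with limits of 100 and 200 and a fullness level of 150, the message is generated whenever the random draw falls below 0.5.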