1. Field
This application relates to communication networks and, more particularly, to a method for lossless behavior for multiple ports sharing a buffer pool.
2. Description of the Related Art
Data communication networks may include various computers, servers, hubs, switches, nodes, routers, and other devices coupled to and configured to pass data to one another. These devices will be referred to herein as “network elements”. Data is communicated through the data communication network by passing protocol data units, such as frames, packets, cells, or segments, between the network elements by utilizing one or more communication links. A particular protocol data unit may be handled by multiple network elements and cross multiple communication links as it travels between its source and its destination over the network.
In certain data networks, there may be requirements to deploy switches that provide lossless behavior. This behavior can be provided either at the port level or at the flow level. In the case of a port, any packet entering the switch via a port that is configured as lossless will not be dropped. In the case of a flow, any packet entering the switch that is associated with a flow that has been classified as lossless will not be dropped. Other packets in other flows may be dropped, however.
For a switch to maintain lossless behavior, it must have adequate packet buffer capacity and a mechanism that it can use to send pause messages on the offending ingress ports to prevent traffic arriving on those ports from overflowing its internal packet buffer. The ingress port will transmit the pause message to cause an upstream network element to stop transmitting additional packets until the backlog of packets stored in the buffer can be cleared. Pause generation is typically triggered if one or more of the output ports are paused by attached downstream switches, or if multiple ingress ports are sending traffic to a smaller number of egress ports, thus forming n:1 congestion.
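The n:1 congestion case described above can be illustrated with a simple fluid approximation. The following Python sketch is illustrative only: the function name `time_to_pause`, its parameters, and the assumption of constant, equal ingress rates are hypothetical and are not taken from any particular switch implementation. It estimates how long a buffer takes to reach a pause threshold when n ingress ports feed a single egress port.

```python
from typing import Optional


def time_to_pause(n_ingress: int,
                  ingress_rate: float,
                  egress_rate: float,
                  threshold_bytes: float) -> Optional[float]:
    """Estimate seconds until buffered data reaches the pause threshold.

    Assumes (hypothetically) that every ingress port sends at the same
    constant rate, in bytes per second, toward one egress port that
    drains at a constant rate. Returns None when the egress port keeps
    up and no backlog accumulates.
    """
    # Net rate at which the backlog grows in the packet buffer.
    fill_rate = n_ingress * ingress_rate - egress_rate
    if fill_rate <= 0:
        return None  # no n:1 congestion, so the threshold is never reached
    return threshold_bytes / fill_rate
```

For example, under these assumptions three ingress ports each sending 1000 bytes per second to one egress port draining 1000 bytes per second build a backlog at 2000 bytes per second, so a 4000-byte threshold would be reached after 2 seconds.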
Where the internal packet buffer is shared by a group of ports, the manner in which the buffer pool is managed and the manner in which pause messages are generated are important. Historically, each port's usage of the shared buffer pool would be tracked, and each port received a fixed number of pages of memory in the shared buffer pool. When the amount of memory consumed by a given port reached the allocation threshold, a pause message would be transmitted on the port to instruct the upstream port to cease transmission of additional packets for a period of time. This approach causes inefficient use of the buffer space when not all ingress ports are active, and can cause excessive pause generation.
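The fixed per-port allocation described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not a description of any actual switch: the class name `StaticPartitionBuffer`, the equal split of pages among ports, and the method names are all hypothetical. It shows how a single active port can reach its allocation threshold, and therefore trigger a pause, while most of the shared pool remains unused.

```python
class StaticPartitionBuffer:
    """Shared pool divided into fixed, equal per-port allocations (hypothetical model)."""

    def __init__(self, total_pages: int, num_ports: int) -> None:
        self.total_pages = total_pages
        # Assumption: every port receives the same fixed share of the pool.
        self.per_port_limit = total_pages // num_ports
        self.used = {port: 0 for port in range(num_ports)}

    def enqueue(self, port: int, pages: int) -> bool:
        """Record pages buffered for a port; return True if a pause should be sent."""
        self.used[port] += pages
        return self.used[port] >= self.per_port_limit

    def dequeue(self, port: int, pages: int) -> None:
        """Release pages once packets received on the port have been transmitted."""
        self.used[port] = max(0, self.used[port] - pages)

    def free_pages(self) -> int:
        """Pages in the shared pool that no port is currently using."""
        return self.total_pages - sum(self.used.values())


pool = StaticPartitionBuffer(total_pages=400, num_ports=4)
pause_needed = pool.enqueue(port=0, pages=100)  # burst on the only active port
print(pause_needed, pool.free_pages())  # prints: True 300
```

In this sketch, port 0 is paused after buffering 100 pages even though 300 of the 400 pages in the pool are free, which is the inefficiency that arises when not all ingress ports are active.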
Excessive pause generation, in turn, can cause output port rate drooping, in which the network element is not able to output packets at full capacity on the output port because it has insufficient packets to transmit. For example, an input port may cause a pause message to be generated upon receipt of a traffic burst at the port, even if there is sufficient buffer capacity and output capacity on the switch. In addition to causing the output port rate to droop, the premature generation of pause messages may cause head-of-line blocking, premature network level congestion spreading, and higher end-to-end latency. Accordingly, it would be advantageous to provide a method for lossless behavior for multiple ports sharing a buffer pool.