The invention relates generally to the art of packet-switching systems and more specifically to a method and apparatus for implementing backpressure in a packet-switched network, such as an ATM network.
There is an evolutionary trend in the design of large capacity switching systems to move complexity away from the switching fabric, or core, towards the periphery of such systems. The periphery typically includes interfaces for physical links over which cells, or packets, of data are carried to and from the switch, and functionality for adapting and conforming the data to the requirements of particular communications network technology, such as ATM.
The switch fabric may be implemented as a conventional cell space switch whereby cells are received at any of N input ports and switched to any of N output ports. The design of such fabrics is rather simple and typically includes buffering at each output port. Buffering within the fabric may not be adequate, depending on demands from periphery subsystems, resulting in loss of data cells at the fabric.
Backpressure is a well-known feedback technique to reduce or prevent cell loss when the switch fabric experiences congestion. The fabric sends a backpressure indication (feedback message) for a congested output port to the peripheral subsystem, which in response holds back cells destined to that port. Conventional backpressure works well with point-to-point cell traffic, but does not work well with point-to-multipoint traffic; see, for example, "Backpressure in Shared-Memory-Based ATM Switches under Multiplexed Bursty Sources" by Fabio M. Chiussi et al., 0743-166X/96 IEEE.
ATM has rapidly been accepted as a new-generation transport mechanism for carrying multimedia data with differing Quality of Service (QoS) requirements. As such, traffic sources with real time and non-real time requirements can be transported using the same telecommunication infrastructure. One QoS guarantee provided by ATM relates to the amount of transient delay through a node. Of particular interest is the non-real time traffic type, which typically can tolerate only a very low cell loss rate but has no stringent delay requirements. For example, e-mail, Internet access and file transfer applications fall under this category. It is envisaged that such applications will be highly popular and will be one of the key driving forces behind the development of ATM.
For an ATM switch to provide multiple QoS to the various traffic types (commonly referred to as service categories in the context of ATM Traffic Management Specification Version 4.0, af-tm-0056.000, April 1996, available at http://www.atinform.com/atinform/specs), it must provide advanced traffic management features such that the different QoS guarantees are met. To accommodate non-real time traffic sources, which are typically mapped into the nrt-VBR (non-real time Variable Bit Rate), ABR (Available Bit Rate) or UBR (Unspecified Bit Rate) service categories, sufficiently large buffers in the switch are required to guarantee the low cell loss ratio requirement. This is especially true in the case of very bursty non-real time traffic.
Combining the above requirements to provide advanced traffic management features with large buffers for low cell loss rate and the requirements for a highly scalable ATM switch, it is evident that the concept of "backpressure" is very attractive in achieving these goals. For example, in a typical N×N switching architecture, the use of backpressure would allow congestion in the switching fabric to "push back" to the input buffer. With a proper design, lossless-ness can be achieved through the switching fabric. This push-back action allows queuing to be done at each input queue in the peripheral subsystem. The peripheral subsystem typically operates at a lower speed, which eases the implementation of the advanced traffic management features that provide nodal QoS guarantees. The concept of backpressure is applicable only to non-real time traffic types, as it is a means to allow for a larger buffer. These larger buffers decrease the probability of cell loss but inevitably increase cell transfer delay through the switch and are therefore not suitable for real-time traffic.
The use of backpressure also means that at the input queuing point, unicast, or point-to-point, connections (i.e. connections that are destined to one and only one output port) must be queued in a per-output manner (i.e. separate queues for each output port at each input queuing point). This is to alleviate the problem of Head-of-Line (HOL) blocking in which the cell at the head of the queue is destined to an output port that is in backpressure mode and hence "blocking" all the cells that are queued up behind it. By queuing at each input queuing point using a per-output-port queue model, each of these queues can react to the corresponding backpressure indication and be stopped (i.e. backpressure without HOL blocking) accordingly.
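The per-output-port queue model can be illustrated with a minimal sketch. The class name `InputPort`, its methods, and the representation of backpressure indications as a set of port numbers are all assumptions made for illustration and are not taken from the specification.

```python
from collections import deque


class InputPort:
    """Hypothetical input queuing point holding one FIFO per output port."""

    def __init__(self, num_outputs):
        self.queues = [deque() for _ in range(num_outputs)]

    def enqueue(self, cell, output_port):
        # A unicast cell joins the queue of its single output port.
        self.queues[output_port].append(cell)

    def serve(self, backpressured):
        """Send at most one cell from every queue whose port is not backpressured.

        `backpressured` is the set of output ports currently asserting
        backpressure (an assumed representation of the feedback message).
        """
        sent = []
        for port, queue in enumerate(self.queues):
            if queue and port not in backpressured:
                sent.append((port, queue.popleft()))
        return sent


# Port 0 is congested; cells for ports 1 and 2 are not held up behind it.
port = InputPort(num_outputs=3)
port.enqueue("a", 0)
port.enqueue("b", 1)
port.enqueue("c", 2)
first = port.serve(backpressured={0})
```

In this sketch only the queue for the backpressured port is stopped, which is the sense in which per-output queuing avoids HOL blocking for unicast traffic.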
However, multicast operation (i.e., connections that are sourced at a single point and are destined to more than one output port, a single source to many destinations model) within a backpressure switch is problematic. It is problematic in that each multicast connection is being "copied" (i.e., multicast typically occurs in the switching fabric) by the N×N fabric and each destination output port queue can be in a different state of backpressure. One must therefore determine how to queue up this multicast traffic at the input peripheral subsystem (input queuing point) and how to serve these cells while still maintaining the cell lossless-ness through the switching fabric.
One existing solution is not to provide cell lossless-ness through the switching fabric; that is, backpressure is not used with non-real time multicast traffic. However, achieving a low cell loss rate then requires much larger buffers dedicated to multicast traffic at the switching fabric. This is very costly and inefficient.
An alternative existing solution is to queue all multicast connections together at the input queuing point in a single queue and ignore the backpressure indication (i.e., fire at will). This jeopardizes the lossless-ness feature of backpressure. It also has serious fairness problems, as the multicast connections take advantage of the unicast connections, which properly react to the backpressure indication.
A further alternative is to accept HOL blocking. All multicast connections are queued together at the input queuing point in a single queue. Instead of ignoring the backpressure indication, a multicast cell is sent from this queue only when there is no backpressure indication from any switch output port queue.
A slight improvement, which still does not totally eliminate HOL blocking, is to examine the destinations of the cell at the head of the input queuing point. When none of these destinations is in backpressure, the cell is transmitted. Meanwhile, there could be cells queued behind it at the input queuing point that are destined for non-backpressured switch output queues, and hence HOL blocking still results. When the blocking situation is severe enough, the queue eventually overflows and cells are lost.
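The residual HOL blocking of this head-of-queue check can be seen in a short sketch. The function name `serve_multicast_head` and the representation of a cell as an (identifier, destination set) pair are hypothetical.

```python
from collections import deque


def serve_multicast_head(queue, backpressured):
    """Send the head cell only if none of its destinations is backpressured.

    Returns the cell that was sent, or None when the queue is empty or
    the head cell is blocked.
    """
    if not queue:
        return None
    _cell_id, destinations = queue[0]
    if destinations & backpressured:
        # The head is blocked, so every cell behind it waits as well.
        return None
    return queue.popleft()


# Cell m2 targets only uncongested ports 2 and 3, yet it cannot be sent
# because m1, at the head, targets backpressured port 1.
mq = deque([("m1", {0, 1}), ("m2", {2, 3})])
result = serve_multicast_head(mq, backpressured={1})
```

Here `result` is `None` and both cells remain queued, even though `m2` could have been delivered.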
Broadly speaking, the invention provides methods and apparatuses for applying backpressure in a packet switch, such as a switch in an ATM network.
In a first aspect the invention provides a method of relieving congestion in a packet switch. The method sends cells to output ports of a switch core in accordance with a destination address specified for each cell. It monitors for congestion at each output port, and when congestion is detected at an output port and cells are received at an input port of the switch destined for multiple destination addresses including the congested output port, modifies the multiple destination addresses to remove the destination address of the congested output port. The method continues to send the modified cell to the multiple destination addresses other than the congested output port.
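The destination modification step can be sketched as follows. This is an illustrative reading only: the function names, the use of a buffer-occupancy threshold to detect congestion, and the representation of destination addresses as a set of output port numbers are assumptions and are not specified in the text.

```python
def congested_ports(occupancy, threshold):
    """Return the output ports whose buffer depth has reached the threshold.

    `occupancy` maps output port number to queued cell count; using a
    fixed threshold to detect congestion is an assumption of this sketch.
    """
    return {port for port, depth in occupancy.items() if depth >= threshold}


def prune_congested(destinations, congested):
    """Return the destination set with congested output ports removed."""
    return set(destinations) - set(congested)


# Output port 4 is congested, so a multicast cell addressed to ports
# 1, 4 and 6 is sent on to ports 1 and 6 only.
occupancy = {1: 3, 4: 90, 6: 10}
congested = congested_ports(occupancy, threshold=64)
modified = prune_congested({1, 4, 6}, congested)
```

The sketch leaves open what happens when every destination is congested; in that case the returned set is simply empty.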
The method may further employ the step of, prior to receiving cells at input ports of the switch that are destined for multiple destination addresses, identifying a primary route at the option of a user for such cells and if the congested output port is on the primary route then not modifying the multiple destination addresses for the cells to remove the destination address of the congested output port.
For all cells: (i) received at the input port; (ii) destined for multiple destination addresses including the congested output port; and (iii) forming part of a multiple cell packet where one of the multiple destination addresses of one of the cells in the packet has been modified to remove the destination address of the congested port; the method may further discard those remaining cells in the packet received at the input port and destined for the congested output port whether or not the congested output port continues to be congested, until receiving the cell containing an end of packet boundary.
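A minimal sketch of this partial packet discard behaviour, tracked per multicast connection, might look like the following. The class name `PacketDiscardFilter` and its interface are hypothetical. The sketch also assumes that the end-of-packet cell is itself withheld from the discarded port before the state is reset, which is only one possible reading of the text.

```python
class PacketDiscardFilter:
    """Hypothetical per-connection state for partial packet discard."""

    def __init__(self):
        # Ports removed from an earlier cell of the current packet.
        self.dropped_ports = set()

    def process(self, destinations, congested, end_of_packet):
        """Return the output ports this cell should still be sent to."""
        # Congested ports targeted by this cell join the discard set.
        self.dropped_ports |= destinations & congested
        # Ports dropped earlier stay dropped, congested or not.
        remaining = destinations - self.dropped_ports
        if end_of_packet:
            # Assumption: the next packet starts with a clean state.
            self.dropped_ports = set()
        return remaining


f = PacketDiscardFilter()
r1 = f.process({0, 1, 2}, set(), end_of_packet=False)  # no congestion yet
r2 = f.process({0, 1, 2}, {1}, end_of_packet=False)    # port 1 congested
r3 = f.process({0, 1, 2}, set(), end_of_packet=False)  # congestion cleared
r4 = f.process({0, 1, 2}, set(), end_of_packet=True)   # last cell of packet
r5 = f.process({0, 1, 2}, set(), end_of_packet=False)  # next packet
```

Once one cell of a packet has been withheld from a port, the rest of that packet is of no use to the port, so the remaining cells are withheld as well even after congestion clears.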
The method may further refrain from modifying the multiple destination addresses as described previously if fewer than a given number of cells have been queued at the input port and not already sent from the input port to the switch core. This step is not combined with identification of primary route as described above.
This step may be optionally combined with partial packet discard as described above.
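The queue-depth condition can be sketched as a guard around the destination modification. The function name `destinations_to_send` and its parameters are hypothetical, and the sketch does not model what happens to an unmodified cell whose destinations include a congested port.

```python
def destinations_to_send(destinations, congested, queued_cells, min_queued):
    """Remove congested ports only once the input queue is deep enough.

    `queued_cells` is the number of cells queued at the input port and
    not yet sent to the switch core; `min_queued` is the given number
    below which the destination addresses are left unmodified.
    """
    if queued_cells < min_queued:
        return set(destinations)
    return set(destinations) - set(congested)


shallow = destinations_to_send({1, 4, 6}, {4}, queued_cells=2, min_queued=8)
deep = destinations_to_send({1, 4, 6}, {4}, queued_cells=20, min_queued=8)
```

With a shallow input queue the cell keeps all of its destinations; once the queue has grown past the given number, the congested port is removed.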
In a further aspect the invention provides apparatuses with means for carrying out all elements of the methods described above.