Packet blocking and flow interference in packet-switched interconnects lead to congestion and saturation trees that can cause performance collapse. Non-interfering architectures, in which data flows proceed independently of one another, have in practice been approximated by static, a priori definition and reservation of the end-to-end resources (e.g. links, virtual channels/lanes, buffers, queues) that are allocated to the data flows. Such approaches are effective, but they carry heavy overhead and are limited in scalability.
Other approaches, such as asynchronous transfer mode (ATM) and IP, can prevent saturation trees by sacrificing losslessness. The general method for attaining a scalable and stable network architecture, as used in TCP/IP and ATM networks, builds on end-to-end flow control, window-based and rate-based respectively. The main drawback here is slow convergence caused by long feedback delays. Whereas a reaction time of milliseconds is adequate for large or slow networks, server and storage interconnection networks require reaction times of microseconds or less to prevent saturation trees and catastrophic performance degradation. This method is therefore better suited to managing long-lived (static) congestion than short-lived (dynamic) congestion. In such lossy environments, congestion leads to excessive loss (drop) rates.
In non-provisioned interconnection networks (SAN, StAN, HPC, etc.), congestion control is considered one of the difficult challenges. Non-interfering architectures are described by G. F. Pfister and V. A. Norton, “Hot Spot Contention and Combining in Multistage Interconnection Networks”, IEEE Trans. on Computers, Vol. C-34, No. 10, October 1985, pp. 933-938; or by W. Dally, “Virtual-Channel Flow Control”, IEEE Trans. on Parallel and Distributed Systems, Vol. 3, No. 2, March 1992, pp. 194-205.
Dynamic non-interference via reactive flow and congestion control remains an open issue of increasing interest for supercomputer, server and storage interconnection networks. Reactive flow and congestion control is a hard space-time problem, because a typical network with thousands or tens of thousands of nodes must resolve contention between many flows sharing the interconnection network's resources. The issue is how to disseminate accurate and timely status information to all traffic participants, i.e. a large address space identifying flows and their resource allocations must be communicated with low latency, either globally or, if possible, on a need-to-know basis.
U.S. Pat. No. 5,768,258 describes a selective congestion control mechanism for information networks that mitigates the loss rate. The congestion control mechanism is used in particular for ATM networks supporting data services or other non-reserved bandwidth traffic. The control mechanism reacts to the detection of a traffic bottleneck by selectively and temporarily holding back the data traffic that is to travel via the bottleneck. A congested node transmits congestion notifications, each containing one routing label per flow together with deferment information, to upstream nodes, thus enabling a selective temporary backpressure action. To detect congestion, the buffer occupancy of an output port of a node is monitored; if the occupancy exceeds a given threshold, congestion is detected. A switch-based ATM communication network is connection-oriented, and all ATM cells belonging to a connection follow the same path by swapping the routing labels at the input port of each switch. Thus, the actual routing decisions take place only during connection set-up, and routing is not considered a critical issue in the ATM environment. Upstream switching nodes are informed on a hop-by-hop basis about the traffic flows that should be back-pressured to attenuate the congestion. The congestion notification carries the information that selected cells flowing via the bottleneck link have to be held back for a given duration. In fact, this induces saturation trees.
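The threshold-based detection and per-flow notification described above can be illustrated with a minimal Python sketch. It is not the implementation of the cited patent: the class names, the field names and the 50-microsecond deferment value are assumptions introduced only for illustration.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List


@dataclass
class CongestionNotification:
    routing_label: int    # label of the flow to be held back (hypothetical field)
    defer_time_us: float  # how long upstream should hold it back (hypothetical field)


@dataclass
class OutputPort:
    threshold: int                # occupancy threshold, in cells
    defer_time_us: float = 50.0   # assumed deferment value, for illustration only
    queue: deque = field(default_factory=deque)

    def enqueue(self, routing_label: int) -> List[CongestionNotification]:
        """Queue one cell; return notifications if occupancy exceeds the threshold."""
        self.queue.append(routing_label)
        if len(self.queue) <= self.threshold:
            return []
        # One notification per distinct flow currently queued at the bottleneck.
        labels = sorted(set(self.queue))
        return [CongestionNotification(label, self.defer_time_us) for label in labels]


port = OutputPort(threshold=3)
for label in (7, 7, 9):
    port.enqueue(label)        # occupancy stays at or below the threshold
notes = port.enqueue(7)        # fourth cell pushes occupancy above the threshold
print([n.routing_label for n in notes])
```

In this sketch the node names every flow present in the congested queue, so an upstream node receiving the notifications would hold back exactly those labels for the stated time.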
In the known congestion control methods, a tree of upstream nodes is blocked if congestion spreads globally. When “culprits” and “victims” share the same buffer, no distinction is made between the data packets that cause the congestion (culprits) and the data packets that merely suffer from it (victims). With VPI/VCI labelling, only one label can be used per flow, i.e. the selectivity is fixed.
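The culprit/victim effect can be made concrete with a small Python sketch, given here under simplifying assumptions: a single first-in-first-out buffer, one blocked output named "hot", and hypothetical function names. The per-output variant is included only as a contrast and does not represent any particular prior-art design.

```python
from collections import deque


def drain_shared_fifo(cells, blocked_outputs):
    """Serve one shared FIFO in order; a blocked head cell stalls all cells behind it."""
    fifo = deque(cells)
    delivered = []
    while fifo and fifo[0][1] not in blocked_outputs:
        delivered.append(fifo.popleft())
    return delivered


def drain_per_output_queues(cells, blocked_outputs):
    """Serve one queue per output; only cells for blocked outputs are held back."""
    queues = {}
    for cell in cells:
        queues.setdefault(cell[1], deque()).append(cell)
    delivered = []
    for output, queue in queues.items():
        if output not in blocked_outputs:
            delivered.extend(queue)
    return delivered


# Each cell is (role, output); the culprit targets the congested output "hot".
cells = [("culprit", "hot"), ("victim", "idle"), ("victim", "idle")]
print(drain_shared_fifo(cells, {"hot"}))        # victims are stuck behind the culprit
print(drain_per_output_queues(cells, {"hot"}))  # victims proceed
```

With the shared buffer, the victims make no progress although their own output is idle, which is the lack of differentiation noted above.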
In view of the prior art, it is a general object of this invention to provide a method to dynamically counteract saturation trees in a lossless packet-switched multistage interconnection network. It is a further object of the invention to rapidly attenuate dynamic congestion in interconnection networks (SAN, clusters, supercomputers) by providing on-demand resource non-interference. It is a further object of the invention to provide a scheme that counteracts saturation trees, prevents buffer overflows and underflows, and enables more efficient use of the switching capacity of a switching network. Whereas the prior art also performs a selective form of backpressure, it does so with fixed granularity; it is an object of the invention to adapt the granularity of the selection in order to reduce the congestion signalling overhead.
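The signalling-overhead argument can be illustrated by counting notifications at two granularities. The following Python sketch is purely illustrative: the representation of a flow as a (source, destination) pair and the choice of grouping by destination are assumptions, not the mechanism of the invention.

```python
def notifications_fixed(congested_flows):
    """Fixed granularity: one notification per (source, destination) flow."""
    return sorted(set(congested_flows))


def notifications_by_destination(congested_flows):
    """Coarser granularity (assumed grouping): one notification per destination."""
    return sorted({dst for _, dst in congested_flows})


# Eight sources all sending to the same congested destination 5.
flows = [(src, 5) for src in range(8)]
print(len(notifications_fixed(flows)))           # one message per flow
print(len(notifications_by_destination(flows)))  # one message for the whole group
```

In this example a coarser selection replaces eight per-flow messages with a single one. The trade-off is that a coarser selection may also cover flows that a per-flow scheme would not have named individually.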