A communication network typically comprises multiple network elements, such as switches or routers, interconnected with one another. The switches typically buffer incoming packets before sending the packets to a selected next-hop switch, and employ flow control measures to prevent previous-hop switches from causing buffer overflow. A deadlock condition may occur in the network when the buffers of multiple switches that have a cyclic dependency become full. Deadlock conditions are likely to occur in certain network topologies such as mesh and torus topologies.
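By way of illustration only, the cyclic dependency described above can be modeled as a cycle in a directed "waits-on" graph, in which an edge from one switch to another means that the first switch cannot drain its full buffer until the second frees space. The following Python sketch is not taken from any of the cited references; the function name `has_cyclic_dependency`, the dictionary representation, and the switch labels are hypothetical and are chosen only for this example.

```python
def has_cyclic_dependency(waits_on):
    """Return True if the waits-on graph contains a directed cycle.

    waits_on maps a switch label to the labels of the switches whose
    buffer space it is waiting for (a hypothetical representation).
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current DFS path, finished
    color = {}

    def visit(node):
        color[node] = GRAY
        for nxt in waits_on.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GRAY:
                # Reached a switch already on the current path: a cycle.
                return True
            if state == WHITE and visit(nxt):
                return True
        color[node] = BLACK
        return False

    return any(color.get(n, WHITE) == WHITE and visit(n) for n in list(waits_on))


# Four switches in a ring, each waiting on its neighbor: a potential deadlock.
ring = {"A": ["B"], "B": ["C"], "C": ["D"], "D": ["A"]}
# The same switches in a line: no cyclic dependency.
line = {"A": ["B"], "B": ["C"], "C": ["D"], "D": []}

print(has_cyclic_dependency(ring))  # True
print(has_cyclic_dependency(line))  # False
```

A cycle in such a graph indicates only that a deadlock is possible; the deadlock actually occurs when every buffer along the cycle is full at the same time.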
Methods for packet routing that avoid deadlock conditions are known in the art. For example, U.S. Pat. No. 9,009,648, whose disclosure is incorporated herein by reference, describes systems and methods for automatically building a deadlock-free inter-communication network in a multi-core system. A high-level specification is used to capture the internal dependencies of various cores, and is used along with the user-specified system traffic profile to automatically detect protocol-level deadlocks in the system. When all detected deadlocks are resolved or no such deadlocks are present, messages in the traffic profile between various cores of the system may be automatically mapped to the interconnect channels, and network-level deadlocks may be detected. Detected deadlocks may then be avoided by re-allocation of channel resources.
U.S. Pat. No. 6,918,063, whose disclosure is incorporated herein by reference, describes a method and system for promoting fault tolerance in a multi-node computing system that provides deadlock-free message routing in the presence of node and/or link faults using only two rounds and, thus, requiring only two virtual channels to ensure deadlock freedom. A set of nodes for use in message routing is introduced, with each node in the set being used only as a point along message routes, and not for sending or receiving messages.
One routing scheme for preventing deadlocks in Cartesian topologies is known as the Dimension Ordered Routing (DOR) scheme, which is described, for example, in “The architecture and programming of the Ametek series 2010 multicomputer,” published in Proceedings of the Third Conference on Hypercube Concurrent Computers and Applications: Architecture, Software, Computer Systems, and General Issues, Volume 1, ACM, 1988, which is incorporated herein by reference. A DOR variant for torus topology is described, for example, in “Deadlock-free message routing in multiprocessor interconnection networks,” IEEE Transactions on Computers, Volume C-36, pages 547-553, May, 1987, which is incorporated herein by reference.
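By way of illustration only, the general idea of dimension-ordered routing in a mesh (without wraparound links) can be sketched as follows: a packet fully corrects its coordinate in the first dimension, then in the second, and so on, so that no route ever turns from a later dimension back into an earlier one. The Python sketch below is a simplified model and is not reproduced from the cited publications; the function name `dor_route` and the coordinate-tuple representation are hypothetical. The sketch does not cover the torus variant, which must additionally handle the wraparound links.

```python
def dor_route(src, dst):
    """Return the nodes visited from src to dst under dimension-ordered routing.

    src and dst are coordinate tuples of equal length in a mesh
    (hypothetical representation). Dimension 0 is corrected first,
    then dimension 1, and so on.
    """
    if len(src) != len(dst):
        raise ValueError("src and dst must have the same number of dimensions")

    path = [tuple(src)]
    cur = list(src)
    for dim in range(len(cur)):
        # Move one hop at a time along this dimension until it matches dst.
        step = 1 if dst[dim] > cur[dim] else -1
        while cur[dim] != dst[dim]:
            cur[dim] += step
            path.append(tuple(cur))
    return path


# In a 2D mesh, the route travels along X first and only then along Y.
print(dor_route((0, 0), (2, 1)))  # [(0, 0), (1, 0), (2, 0), (2, 1)]
print(dor_route((2, 1), (0, 0)))  # [(2, 1), (1, 1), (0, 1), (0, 0)]
```

Because every route in this model follows the same fixed dimension order, the turns that would be needed to close a cycle of buffer dependencies in a mesh are never taken.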