2. Field of the Invention
This invention relates to data transfer through data-driven switching networks among concurrent computers ("nodes"). More particularly, the invention relates to improved routing algorithms for switching networks that link M-ary n-cube nodes together by communication channels between nodes, which channels terminate in the switching networks.
3. Brief Description of the Prior Art
Concurrent computing systems connected in a cube configuration are disclosed and claimed in applications assigned to the assignee of this application. For example, a concurrent computing system in which individual computers, each with a computational processor and a message-handling processor, are connected as a hypercube is described and claimed in an application entitled "Concurrent Computing Through Asynchronous Communication Channels," filed on July 12, 1985, and assigned to California Institute of Technology. The identified application is referred to herein as the Seitz et al. Cosmic Cube application. The nodes are computers comprising processors and memory, and the nodes communicate with each other by bidirectional communication links only along the edges of the cube.
In an application assigned to the assignee hereof, entitled "Method and Apparatus for Implementing a Maximum-Likelihood Decoder in a Hypercube Network" filed on Sept. 27, 1985, having Ser. No. 781,224 by Fabrizio Pollara-Bozzola, convolutional codes are decoded by the accumulated metric and survivor steps of the Viterbi algorithm. The improved system divides the decoding operation in parallel among all processors, which are assigned unique states of the trellis to compute at different stages on the trellis. In that application and in the Seitz et al. Cosmic Cube application, X, Y, and Z indicate directions of communication in the cube. Each node is assigned a different binary coded identification label, which labels are uniquely ordered in the network. A destination descriptor accompanies each block of data, and that destination descriptor is modulo-two added to the local label of a receiving node in order to control message routing.
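The modulo-two addition described above can be illustrated with a minimal sketch. It assumes that node labels are plain integers, that the modulo-two sum is a bitwise exclusive-OR, and that a message is forwarded along the lowest-numbered dimension in which the labels still differ. The function names `next_hop` and `route` and the lowest-dimension-first ordering are assumptions made for illustration; they are not taken from the referenced applications.

```python
from typing import List, Optional


def next_hop(local_label: int, destination: int) -> Optional[int]:
    """Return the neighbor to forward to, or None if the message has arrived.

    The modulo-two sum (bitwise XOR) of the two labels has a 1 in every
    dimension where the local node and the destination differ.
    """
    diff = local_label ^ destination
    if diff == 0:
        return None  # labels match: this node is the destination
    # Assumed ordering: correct the lowest differing dimension first.
    lowest_dimension_bit = diff & -diff
    # Crossing one cube edge flips exactly one bit of the label.
    return local_label ^ lowest_dimension_bit


def route(source: int, destination: int) -> List[int]:
    """List every node label visited from source to destination."""
    path = [source]
    node = source
    while True:
        hop = next_hop(node, destination)
        if hop is None:
            return path
        node = hop
        path.append(node)


if __name__ == "__main__":
    # In a 3-cube, node 000 to node 101 crosses dimension 0, then dimension 2.
    print([format(n, "03b") for n in route(0b000, 0b101)])
```

Because each hop clears one bit of the modulo-two sum, the number of hops equals the number of bit positions in which the source and destination labels differ. Visiting the dimensions in one fixed order is one way of routing along successive dimensions.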
Message routing is employed in another application assigned to the assignee hereof, entitled "Concurrent Hypercube Computing System with Improved Message Passing" filed on Apr. 1, 1986, naming J. C. Peterson et al as inventors. In the latter application, separate control channels in addition to bidirectional communication channels link each adjacent node together.
As each of the above-identified applications suggests, the communication time and the amount of communication and control channel wiring are significant factors in the application of concurrent computing nodes to cube-connected systems. Deadlock-free routing is achieved in all of these applications by routing along successive dimensions.
In an application entitled "Torus Routing Chip" invented by Charles L. Seitz and William J. Dally, filed on Dec. 18, 1986, and assigned to the same assignee as this application, another deadlock-free routing system invention is disclosed. Instead of reading an entire data packet into an intermediate processing node before starting transmission to the next node, the routing of this latter invention forwards each flow control unit (flit) of the packet to the next node as soon as it arrives. This so-called "wormhole" routing results in a reduced message latency when compared under the same conditions to store-and-forward routing. Another advantage of wormhole routing is that the communication does not use up the memory bandwidth of intermediate nodes, and a packet does not interact with the processor or memory of intermediate nodes along its route. Packets are moved by self-timed routing elements and remain strictly within the routing network until they reach their destination. If a routing network at a node is blocked, the packet pauses in the network and does not advance until that node's network is no longer busy and the packet can advance further.
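The latency advantage of wormhole routing over store-and-forward routing can be illustrated with a simplified model. The sketch below assumes that one flit crosses one channel per cycle, that there is no contention or blocking, and that router delay is ignored. The cycle counts and the function names are illustrative assumptions, not figures from the referenced application.

```python
def store_and_forward_cycles(hops: int, flits: int) -> int:
    """Cycles when every hop receives the whole packet before forwarding it."""
    if hops < 1 or flits < 1:
        raise ValueError("hops and flits must both be at least 1")
    # Each hop takes one cycle per flit, and the hops do not overlap.
    return hops * flits


def wormhole_cycles(hops: int, flits: int) -> int:
    """Cycles when each flit is forwarded as soon as it arrives."""
    if hops < 1 or flits < 1:
        raise ValueError("hops and flits must both be at least 1")
    # The head flit needs one cycle per hop; the remaining flits follow
    # it in a pipeline, one arriving per cycle after the head.
    return hops + flits - 1


if __name__ == "__main__":
    hops, flits = 4, 20
    print("store-and-forward:", store_and_forward_cycles(hops, flits))  # 80
    print("wormhole:", wormhole_cycles(hops, flits))                    # 23
```

Under these assumptions, store-and-forward latency grows with the product of path length and packet length, while wormhole latency grows with their sum, so the gap widens as either quantity increases.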
In the invention of this application, a message header is sent initially to form a completed path through any number of nodes between an originating and a destination node. The completed path is a virtual circuit for pipelining of data between the originating and destination nodes.