More particularly, the present invention relates to a method of adaptive routing for use in systems containing a plurality of parallel and distributed nodes for routing messages from an originating node to a destination node while reducing the time to establish a through-path from the originating node to the destination node or make a decision that no through-path can be then established. The apparatus of the invention comprises a separate node processor disposed in association with each node, each node processor containing computing means and associated logic for execution thereby for making switching decisions from header information associated with messages being routed thereby; and an intelligent channel connected to the node processor to be controlled thereby, each intelligent channel having two separate operating modes. One of the two modes is a path setup mode in which a function of establishing a path from an originating node to a destination node from the header information preceding data associated with the message is performed. The other of the two modes is a data transmission mode in which a function of transferring data associated with the message is performed. Each intelligent channel provides a plurality of input lines for connecting to the intelligent channels of next adjacent nodes to receive inputs therefrom and a plurality of output lines for connecting to the intelligent channels of next adjacent nodes to transmit outputs thereto. The input lines of a node are connected directly to the output lines when the intelligent channel of a node is in the data transmission mode whereby when the intelligent channel is in the data transmission mode the data passes directly through the associated node and has no processing time added thereto by the associated node.
In the preferred embodiment, each node includes an intelligent channel for performing the steps of, when the associated node is an originating node, originating and sending a header with the destination node's address embedded in the header to a selected next neighboring node's intelligent channel and reserving the path through which the header travels. When the associated node is an intermediate node the input lines thereof are connected to the output lines thereof whereby the intermediate node is placed in the data transmission mode. The foregoing process is repeated until the header reaches the destination node and when the associated node is a destination node, an acknowledgment is sent back to the source node through the intermediate nodes to establish a network connection pipeline communication path in the data transmission mode so that messages can start to be transmitted by the originating node.
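By way of illustration only, the path setup sequence described above may be sketched as follows. The names `Mode`, `Channel`, `set_up_path`, and `send_data` are hypothetical and do not appear in the disclosure. The sketch assumes that the sequence of nodes to be tried has already been selected, and it models the reservation of a link simply by recording the next node.

```python
from enum import Enum


class Mode(Enum):
    PATH_SETUP = "path setup"
    DATA_TRANSMISSION = "data transmission"


class Channel:
    """Hypothetical stand-in for one node's intelligent channel."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.mode = Mode.PATH_SETUP
        self.next_node = None  # reserved output link, if any


def set_up_path(channels, path):
    """Send a header along `path` (origin first, destination last).

    Returns the route followed by the acknowledgment, destination first.
    """
    header = {"destination": path[-1]}
    for index, here in enumerate(path):
        if here == header["destination"]:
            break  # the header has arrived
        channel = channels[here]
        channel.next_node = path[index + 1]  # reserve the link just used
        if index > 0:
            # Intermediate node: inputs are connected straight to outputs.
            channel.mode = Mode.DATA_TRANSMISSION
    # The acknowledgment returns through the intermediate nodes, after
    # which both end points also enter the data transmission mode.
    ack_route = list(reversed(path))
    channels[path[0]].mode = Mode.DATA_TRANSMISSION
    channels[path[-1]].mode = Mode.DATA_TRANSMISSION
    return ack_route


def send_data(channels, origin, payload):
    """Follow the reserved links; no node inspects the payload."""
    node = origin
    hops = [node]
    while channels[node].next_node is not None:
        node = channels[node].next_node
        hops.append(node)
    return hops, payload
```

In this sketch the data follows the reserved links without any per-node decision, which corresponds to the statement that an intermediate node in the data transmission mode adds no processing time.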
In another aspect of the preferred embodiment, each intelligent channel of each node that can be an originating node contains connectivity analysis logic which performs a minimum cycle breakdown of the possible paths between the originating node and the destination node to establish a list of nodes disposed along possible paths to be tried before attempting to establish a through-path to a destination node along the possible paths, whereby exhaustive testing of all paths is not undertaken before success or failure is determined.
In still another aspect of the preferred embodiment, each intelligent channel of each node that can be an intermediate node along a through-path contains pruning logic for pruning a non-available tested path and all associated paths depending on the non-available path, from further testing during a process of finding an available path between the originating node and the destination node whereby redundant testing of paths which will result in failure is eliminated. The preferred embodiment also includes backtracking logic for not waiting at the node for a link to a busy next-adjacent further node to free up and for backtracking to a next-adjacent previous node when no next-adjacent further node is immediately non-busy.
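The combined effect of pruning and backtracking may be illustrated by the following minimal sketch. It is not the logic of the preferred embodiment. It assumes a hypercube in which node addresses differ from their neighbors in exactly one bit, it considers only links that move the header closer to the destination, and it keeps the search state in one place rather than distributing it among the intelligent channels. The function name `find_path` and the representation of busy links as a set of node pairs are assumptions made for the example.

```python
def find_path(origin, destination, dimensions, busy_links):
    """Search for a free path without ever waiting on a busy link.

    busy_links is a set of frozenset({a, b}) pairs naming links in use.
    Returns the list of nodes on the path, or None if none is available.
    """
    pruned = set()  # nodes already shown to have no free way forward
    path = [origin]
    while path:
        here = path[-1]
        if here == destination:
            return path
        # Neighbors that are one step closer to the destination.
        candidates = [
            here ^ (1 << bit)
            for bit in range(dimensions)
            if (here ^ destination) & (1 << bit)
        ]
        for nxt in candidates:
            if nxt in pruned or frozenset((here, nxt)) in busy_links:
                continue  # tested once and rejected; never waited upon
            path.append(nxt)
            break
        else:
            # No free link forward: prune this node and backtrack
            # immediately to the previous node.
            pruned.add(here)
            path.pop()
    return None
```

For example, with three dimensions and the links 0-1, 3-7, and 2-6 busy, a search from node 0 to node 7 first advances to node 3, backtracks, prunes nodes 3 and 2, and then succeeds along nodes 0, 4, 5, and 7. Because a pruned node is never revisited, paths depending on it are not tested again.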
In the field of digital computers, particularly in the area of large-scale distributed or parallel computing systems as characterized by the so-called "hypercube", a large number of computing nodes (i.e. processors) are interconnected in a manner which allows any node to transmit a "message" to any other node. Typically, a message routed from an originating node to a destination node must travel through the routing portions of a number of intermediate nodes. To accomplish this, the message is provided by the originating node with a header designating the destination node or a route thereto. In a ring configuration, there is only one path extending from node to node to node in a loop. Like a multiplex scheme, as a message passes each node on the loop the node looks to see if its designation is the destination for the message. If it is, the message is pulled from the loop and processed. If not, it is merely passed on to the next node in line.
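The ring behavior described above may be sketched as follows. The function name `ring_deliver` and the numbering of nodes from 0 in the direction of travel are assumptions made for the example.

```python
def ring_deliver(num_nodes, source, destination):
    """Pass a message around a one-way ring until it is claimed.

    Returns the nodes that examined the message, in order.
    """
    if not (0 <= source < num_nodes and 0 <= destination < num_nodes):
        raise ValueError("node address out of range")
    examined = []
    node = (source + 1) % num_nodes
    while True:
        examined.append(node)
        if node == destination:
            return examined  # message pulled from the loop here
        node = (node + 1) % num_nodes  # otherwise passed on unchanged
```

In a ring of six nodes, a message from node 4 to node 1 is examined by nodes 5, 0, and 1 in turn. No node makes a routing decision, since only one path exists.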
In a distributed or parallel computing system environment, on the other hand, each node has a number of incoming paths and a number of outgoing paths. The switching logic of the system must be sufficient to properly route the message along a path to its destination. To accomplish this in the prior art, a number of approaches have been suggested and, in some cases, implemented. A centralized switching network may be employed. This eliminates any "thinking" at the node level but can result in bottlenecks and much lost time in the aggregate. In one known scheme, the sender provides a routing map to the destination node as part of the message header. This places a high burden on the message transmitting logic at each node. The switching logic at each node, on the other hand, is quite simple. In another approach, the sender simply designates the destination node and each intermediate node makes the decision as to which path to switch the message onto as it moves towards its destination. Thus, the sending logic at each node is simple, but its switching logic is more complex. The main problem with these latter two approaches in particular is delay time.
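The difference between the latter two approaches may be illustrated by the following sketch, which assumes a hypercube in which neighboring node addresses differ in exactly one bit and in which differing address bits are corrected from the lowest bit upward. That ordering is a common convention chosen for the example and is not taken from any particular prior art system. The function names are hypothetical.

```python
def build_routing_map(source, destination, dimensions):
    """Sender-side work: list every node the message must visit."""
    route = []
    node = source
    for bit in range(dimensions):
        if (node ^ destination) & (1 << bit):
            node ^= 1 << bit
            route.append(node)
    return route


def next_hop(current, destination, dimensions):
    """Per-node work: pick the next neighbor from the destination alone."""
    differing = current ^ destination
    for bit in range(dimensions):
        if differing & (1 << bit):
            return current ^ (1 << bit)
    return None  # already at the destination


def route_by_map(source, destination, dimensions):
    # Intermediate nodes only read the next entry of the map.
    return [source] + build_routing_map(source, destination, dimensions)


def route_by_destination(source, destination, dimensions):
    # Each intermediate node repeats the switching decision.
    path = [source]
    hop = next_hop(source, destination, dimensions)
    while hop is not None:
        path.append(hop)
        hop = next_hop(hop, destination, dimensions)
    return path
```

Both functions yield the same path here. What differs is where the work is done: once at the sender in the first case, and once per node in the second.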
The common telephone is a well-known example of message switching in a distributed environment. Virtually all the telephones in the world are interconnected by what can be considered a vast switching network where the switching decision is made at each node (i.e. switching station) along the path between the caller and the receiver. As in our last above-mentioned example, all the caller provides is the destination identifier (i.e. the receiver's telephone number). At each "node", the switching circuit attempts to establish a connection to a next node in the path. If a path cannot be established in a first direction, alternate directions are tried exhaustively until a connection is finally made or, in the alternative, a failure is sensed (at which time the caller receives a pre-recorded "sorry" message).
Such an approach is acceptable in a human calling environment; that is, it is completely acceptable for several seconds to elapse between the time the caller dials the number and the time when the intermediate switching has established a connection to the receiver and the receiver's phone "rings". A comparable approach is also generally acceptable in a small distributed computer network comprising only a few nodes. However, for ensemble multiprocessors, it is desirable to make message passing time closer to local memory access time so that processes do not have to wait for data to arrive. Thus, the latency for these machines should be as small as possible (typically a few to a few hundred microseconds for the new generation of 20 MIPS RISC microprocessors).
An ideal operating environment would be one in which the connect time between two nodes approaches the memory access time of the system. Thus, a node could access data from memory or from another node in substantially the same time. To achieve such an objective, it is imperative that the switching algorithms employed never wait on a path to free up; that is, a path is either available or non-available at the instant it is tested. Messages must never wait in a buffer at an intermediate node waiting for the path to the next node to free up (which it may never do, causing partial or total lock-up). Thus, an alternate approach to the exhaustive interconnecting, switching, and message routing techniques provided by the prior art is absolutely necessary.