The present invention relates to a system for controlling communication between parallel computers in which a plurality of computer nodes is connected through a plurality of channels and more particularly to a system for controlling communication between parallel computers in which a plurality of computer nodes is connected in a 2-divisional lattice and a plurality of nodes exists in two division channels.
One such communication control system is a packet system. This system is used, for example, to transmit a message from a node A to a node C through a node B. Node A transmits a packet to node B. Node B stores this packet and then transmits it to node C. Thus, a relay transmission is performed. In this system, called store and forward routing, the packet is stored in node B once, thus causing a large delay in transmission to node C.
A system for communicating between parallel computers using a "wormhole routing" method is known. In wormhole routing, the message is divided into minimum transfer units called flits, each comprising, for example, several bytes of data. The first flit, namely the header flit, is transferred within the network through a relay route between a transmitting node and a receiving node. When a node receives the header flit of the message, a channel (communication path) forming the relay route is selected according to the transfer destination node designated by the header. The header flit and the following data flits are then transferred to the receiving node through the channel. The message is transferred in a form such that it continuously occupies the relay route from the transmitting node to the receiving node. That is, the message is transmitted over the channels in a chained manner. The header flit sometimes arrives at the receiving node before the last flit of the message is output from the transmitting node.
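The difference in delay between store and forward routing and wormhole routing can be illustrated with a small calculation. The following is a minimal sketch under assumptions that are not stated in the text: each channel carries one flit per clock, there is no contention, and a node needs no extra clocks to select a channel. The function names are hypothetical.

```python
def store_and_forward_clocks(flits: int, hops: int) -> int:
    """Clocks until the whole message reaches the receiving node when every
    relay node stores the complete packet before forwarding it."""
    # Each hop must carry all flits before the next hop can start.
    return flits * hops


def wormhole_clocks(flits: int, hops: int) -> int:
    """Clocks until the last flit reaches the receiving node when flits are
    pipelined behind the header flit."""
    # The header needs `hops` clocks; the remaining flits follow one per clock.
    return hops + flits - 1


def header_arrives_before_last_flit_leaves(flits: int, hops: int) -> bool:
    """True when the header reaches the receiver before the last flit has
    left the transmitting node (the last flit leaves at clock `flits`)."""
    return hops < flits


if __name__ == "__main__":
    # Node A -> node B -> node C is two hops; 16 flits is an assumed length.
    print(store_and_forward_clocks(16, 2))
    print(wormhole_clocks(16, 2))
    print(header_arrives_before_last_flit_leaves(16, 2))
```

Under these assumptions the store and forward delay grows with the product of message length and hop count, whereas the wormhole delay grows with their sum.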
A flit other than the header flit does not contain routing information. The flits of a message are therefore transferred over a continuous series of channels within the network and are not interleaved with flits of other messages. When the header flit of a message is blocked, transfer of all the flits of that message stops, and any other message requiring a channel held by the blocked message is also blocked.
When wormhole routing is used in a network in which a plurality of nodes is connected in a torus manner, it is necessary to avoid a deadlock state in which transfer of all the messages is blocked.
FIG. 1A is an explanatory view of a deadlock in a network connected in a torus manner. In this figure, the network is formed by four nodes 1, 2, 3 and 4, with four uni-directional channels (a), (b), (c) and (d) connecting them.
In FIG. 1A, when all four nodes forming the network start to perform clockwise data transfer by the wormhole routing method, the message from node 1 is transferred using channel (a), the message from node 2 using channel (b), the message from node 3 using channel (c) and the message from node 4 using channel (d). However, when the following flit of each message is to be transferred at the next clock, the message from node 1 cannot use channel (b) and is blocked, because channel (b) is already being used for the transmission of the message from node 2. The following flit of the message transmitted from node 2 similarly cannot use channel (c) and is blocked. The messages transmitted from nodes 3 and 4 are blocked in the same way. Since all the messages are blocked, the result is a deadlock state. This deadlock state is caused, for example, by the length of the data being greater than the length between two channels. In such a case, when data is transmitted from node 1 to node 3, the last portion of the data being transmitted from node 2 to node 4 remains in channel (b). Thus, the data transmitted from node 1 to node 3 cannot enter channel (b). Therefore, transfer of data from node 1 to node 3 and from node 2 to node 4 is blocked.
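The deadlock of FIG. 1A can be reproduced with a small clock-by-clock simulation. This is a simplified sketch, not a model of any particular hardware: it assumes each message is long enough to keep every channel it has acquired until its header reaches the destination, and that all held channels are released together at that point. The message names M1 to M4 and the function `simulate` are hypothetical.

```python
def simulate(routes, max_clocks=100):
    """routes maps a message name to the ordered list of channels its header
    must acquire. Returns ("done", clock) or ("deadlock", clock)."""
    owner = {}                           # channel -> message occupying it
    acquired = {m: 0 for m in routes}    # channels acquired so far per message
    finished = set()
    for clock in range(1, max_clocks + 1):
        moved = False
        for msg, path in routes.items():
            if msg in finished:
                continue
            wanted = path[acquired[msg]]
            if wanted in owner:
                continue                 # blocked: channel is held by another message
            owner[wanted] = msg
            acquired[msg] += 1
            moved = True
            if acquired[msg] == len(path):
                # Header has arrived; the message drains and frees its channels.
                finished.add(msg)
                for ch in path:
                    del owner[ch]
        if len(finished) == len(routes):
            return ("done", clock)
        if not moved:
            return ("deadlock", clock)   # no message could advance this clock
    return ("timeout", max_clocks)


# Each node sends clockwise to the node two hops away, as in FIG. 1A.
ring = {
    "M1": ["a", "b"],  # node 1 -> node 3
    "M2": ["b", "c"],  # node 2 -> node 4
    "M3": ["c", "d"],  # node 3 -> node 1
    "M4": ["d", "a"],  # node 4 -> node 2
}

if __name__ == "__main__":
    print(simulate(ring))
    print(simulate({"M1": ["a", "b"], "M3": ["c", "d"]}))
```

With all four messages present, every message holds one channel and waits for the next one at the second clock, so nothing can advance. With only two non-overlapping messages, both complete.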
A virtual channel method is known as an algorithm for avoiding such a deadlock state. A communication system using this method is explained by referring to FIG. 1B, in which each channel connecting two nodes is uni-directional and all channels are virtually doubled. The channel connecting nodes 1 and 2 is uni-directional from node 1 to node 2 and is formed of a double channel comprising (a) and (a). This double channel is virtual and comprises one channel in terms of hardware, although two channels may of course be provided in hardware. Each of the virtually doubled channels can store a flit of a message, and one of the flits stored in channel (a) or (a) is transferred from node 1 to node 2. The channel through which a flit is to be transferred is determined for every flit.
The doubled channels are used as follows when a message is transmitted from the respective nodes. ##EQU1## By doubling each channel, the deadlock state in which all messages are blocked can be avoided.
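Why doubling the channels can remove the deadlock may be checked by examining which channel each message waits for. The sketch below does not reproduce the usage rule given above; it assumes one commonly used assignment, in which a message starts in class 0 and switches to class 1 after crossing the wrap-around link of the ring. Nodes are numbered 0 to 3 here for convenience, and all names are hypothetical.

```python
from itertools import product

N = 4  # ring size; physical channel i runs from node i to node (i + 1) % N


def route(src, dst, doubled):
    """List of (physical channel, virtual class) occupied on the way to dst."""
    hops, node, vclass = [], src, 0
    while node != dst:
        hops.append((node, vclass))
        if doubled and node == N - 1:
            vclass = 1  # assumed rule: change class after the wrap-around link
        node = (node + 1) % N
    return hops


def dependencies(doubled):
    """Pairs (held channel, next channel wanted) over all source/destination pairs."""
    edges = set()
    for src, dst in product(range(N), repeat=2):
        path = route(src, dst, doubled)
        edges.update(zip(path, path[1:]))
    return edges


def has_cycle(edges):
    """Detect a cycle by repeatedly removing vertices with no incoming edge."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set())
    indegree = {v: 0 for v in graph}
    for a in graph:
        for b in graph[a]:
            indegree[b] += 1
    ready = [v for v, d in indegree.items() if d == 0]
    removed = 0
    while ready:
        v = ready.pop()
        removed += 1
        for w in graph[v]:
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)
    return removed != len(graph)


if __name__ == "__main__":
    print(has_cycle(dependencies(doubled=False)))  # single channels
    print(has_cycle(dependencies(doubled=True)))   # doubled channels
```

With single channels the waiting relations form a closed loop around the ring, which is the condition for deadlock. Under the assumed class-switching rule the loop is broken, because no message ever waits on a class 0 channel while holding a class 1 channel.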
Even in a system using virtual channels, as shown in FIG. 1B, there is a problem in that the transfer capability is lowered when messages are transferred clockwise through all the nodes. At the first clock, node 1 transmits its message through channel (a), node 2 through channel (b), node 3 through channel (c) and node 4 through channel (d). Thus, all four channels are used. However, at the second clock, messages are transferred through only two channels. The message transmitted from node 1 is transmitted onward to node 3 via channel (b), and the message transmitted from node 4 is kept waiting until the transfer of the message from node 1 to node 3 is completed and is then transmitted onward to node 2 via channel (a). Similarly, the message transmitted from node 3 is kept waiting until the transfer of the message from node 4 to node 2 is completed and is then transferred to node 1 via channel (d). The message transmitted from node 2 is kept waiting until the transfer of the message from node 3 to node 1 is completed and is then transferred to node 4 via channel (c). This is because the length of the message is larger than the distance between two nodes. Thus, when the message is transmitted from node 1 to node 3 via node 2, the last part of the message remains in channel (a). Further, the channels that can be used for transmitting a message between the nodes are predetermined. For example, for message M2 it is predetermined that channel (C) is not used but channel (C) is used when the message is transferred from node 3 to node 4. Thus, the start of the transfer of message M2 from node 3 to node 4 is kept waiting until message M3 finishes using channel (C).
All messages are transferred as described above. Messages are transferred over only two channels at every clock except the starting clock. As a result, the transfer capability is lowered even when virtual channels are used.