The present invention relates to an inter-nodal data transfer system and an inter-nodal data transfer apparatus for a multi-node computer system in which a plurality of nodes having inter-nodal control devices are connected via an inter-nodal crossbar switch.
An inter-nodal data transfer system according to the present invention is used in particular for a multi-node computer system (a distributed memory type parallel computer) in which a plurality of nodes, each configured of a plurality of processors and common memories, are connected so as to conduct data transfers between the nodes.
Such a multi-node computer system generally has a plurality of processors in each node to improve the command processing capacity within the node, and the capacity of the node is improved by executing commands in parallel on the processors. However, as the command processing capacity within the nodes has improved in recent years, a high inter-nodal data transfer capacity matching the capacity within the nodes has become necessary to improve the capacity of the system as a whole.
To respond to this requirement, it has been proposed, as disclosed in JP-A267927/2000 for example, to obtain a high data transfer capacity in a data transfer between the nodes by expanding the data width and thereby increasing the amount of data that can be transferred at one time.
In the technique disclosed in this prior art, as shown in FIG. 2 attached to the present application, only one inter-nodal control device (hereinafter referred to as an RCU) for controlling the connection state between the nodes is provided. Accordingly, in an inter-nodal data transfer operation, the amount of data that can be transferred at one time is increased by expanding the data width, the time period required for one transfer command is shortened, and the capacity is thereby improved.
As a result, in conducting a two-distance transfer, it is necessary to provide a circuit for assuring the order of the 8-byte data units within the data defined by the transfer, and there is a defect that the quantity of hardware (HW) increases.
Also, since the transfer data width is fixed to some extent, it is difficult to change the inter-nodal transfer capacity flexibly in accordance with the system configuration. There is thus also a problem that useless HW becomes necessary when the technique is adapted to a system in which the number of processors, and hence the capacity within the nodes, is reduced, or to a system in which the number of processors, and hence the capacity within the nodes, is increased.
Furthermore, since the transfer capacity is improved by expanding the data width, the transfer is not conducted efficiently in a case where data smaller than the data width is transferred or in a case where the data leaves a fraction of the data width, and there is also a problem that the full transfer capacity cannot be exhibited.
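The loss of efficiency described above can be illustrated with a small calculation. The following is only a sketch under stated assumptions: the function name `transfer_efficiency` and the 32-byte width used in the usage note are hypothetical and are not taken from the cited prior art. It computes the ratio of useful data to the capacity actually occupied when a transfer is rounded up to whole units of the data width.

```python
def transfer_efficiency(payload_bytes: int, width_bytes: int) -> float:
    """Return the fraction of the occupied transfer capacity that is payload.

    Assumption (hypothetical model): every transfer occupies a whole number
    of width-sized units, so a payload that is smaller than the width, or
    that leaves a fraction of the width, wastes the unused remainder.
    """
    if payload_bytes <= 0 or width_bytes <= 0:
        raise ValueError("payload_bytes and width_bytes must be positive")
    # Integer ceiling division: number of width-sized units needed.
    units = -(-payload_bytes // width_bytes)
    return payload_bytes / (units * width_bytes)
```

Under this model, with an assumed width of 32 bytes, a payload of exactly 32 bytes uses the full capacity, a payload of 8 bytes uses only one quarter of it, and a payload of 40 bytes occupies two units and uses 62.5 percent of them.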
JP-A-112912/2000 discloses a similar prior art. It also discloses a processing system of a distributed memory type parallel computer in which one RCU is provided for each node and in which test and copy of data or the like to a remote memory are performed at a high speed. According to this prior art, commands and copied data are transferred rapidly and the retry sequence of a test command is improved, and as a result, the test and copy processing of the distributed memory type parallel computer is conducted at a high speed. However, since only one RCU is provided for each node, the achievable processing speed is inevitably limited.
Further, JP-A-51966/2001 discloses a node apparatus and a parallel processing system for dividing one job into a plurality of processes and processing them by means of a plurality of node devices, and a storage medium in which a parallel processing program is recorded. The technical idea disclosed there relates to an apparatus, a system and other means for dividing one job into a plurality of processes and processing them in parallel by means of a plurality of node devices while making a complicated synchronous mechanism and a specific communication mechanism unnecessary. Each node device in this case is configured of a communication control unit, a plurality of CPUs serving as processors, and a main storage device. The communication control unit has two mechanisms. The first is a barrier synchronous mechanism within the node, which detects, based on a synchronizing request from each CPU, that barrier synchronization is established within the node device itself, and which notifies all the node devices executing the parallel processing of information representing this establishment via a communication cable. The second is an inter-nodal barrier synchronous mechanism, which detects that the parallel processing is completed based on the information on the establishment of barrier synchronization within the other node devices, which is notified by all the other node devices executing the parallel processing via the communication cable. The communication control unit in such an arrangement is configured as one system.
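The two-level barrier synchronization described above can be sketched in software as follows. This is only an illustrative analogy under stated assumptions, not the mechanism of the cited reference: threads stand in for CPUs, `threading.Barrier` objects stand in for the barrier synchronous mechanisms, and the names `NodeBarrier` and `run_demo` are hypothetical.

```python
import threading


class NodeBarrier:
    """Hypothetical two-level barrier for the CPUs of one node."""

    def __init__(self, cpus_per_node: int, inter_node: threading.Barrier):
        # Barrier within the node (one party per CPU).
        self._local = threading.Barrier(cpus_per_node)
        # Barrier shared by all nodes (one party per node).
        self._inter = inter_node

    def wait(self) -> None:
        # Step 1: barrier synchronization is established within the node.
        index = self._local.wait()
        # Step 2: exactly one CPU thread per node represents the node in
        # the inter-nodal barrier (index 0 is returned to one thread only).
        if index == 0:
            self._inter.wait()
        # Step 3: the CPUs of the node are released only after every node
        # has reported that its own barrier is established.
        self._local.wait()


def run_demo(num_nodes: int, cpus_per_node: int) -> list:
    """Run all CPU threads through the barrier once and log the events."""
    events = []
    inter = threading.Barrier(num_nodes)
    nodes = [NodeBarrier(cpus_per_node, inter) for _ in range(num_nodes)]

    def cpu(node: NodeBarrier) -> None:
        events.append("arrive")  # work before the synchronization point
        node.wait()
        events.append("leave")   # work after the synchronization point

    threads = [
        threading.Thread(target=cpu, args=(node,))
        for node in nodes
        for _ in range(cpus_per_node)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events
```

Calling `run_demo(2, 3)` runs six CPU threads on two nodes. In the returned event list, every "arrive" entry precedes every "leave" entry, since no CPU proceeds until all CPUs of all nodes have reached the barrier.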