The performance of information processing apparatuses, such as computers, continues to improve every year. However, due to limits in reducing the size of semiconductor circuits and the saturation of the rate at which the operation clock frequency of the semiconductor circuits has increased, there is a limit to improving the performance of processors, such as CPUs (Central Processing Units). For this reason, further improvement in performance has recently relied upon parallel computing by a parallel computer using a plurality of processors.
The performance of the parallel computer depends not only on the computation speed of each processor itself, but also on the communication speed, that is, the time required for the processors to communicate with each other. Because there is a limit to improving the performance of the processor itself according to the existing technology, it is necessary to improve the communication speed between the processors in order to further improve the performance of the parallel computer. The communication speed in the parallel computer may be roughly categorized into two elements, namely, a latency corresponding to a data transfer time, and a bandwidth corresponding to the amount of data that can be transferred in one transfer.
The latency is the time it takes for a data communication to be completed from start to end, and the communication speed improves as the latency becomes shorter. However, when the structure of the parallel computer is made complex in order to improve the performance of the parallel computer, the logic becomes complex and the number of transistors that are used increases considerably, to thereby generate signal delays and deteriorate the latency. In addition, the effects of the latency accumulate as the scale of the parallel computer becomes larger, and make it more difficult to further improve the performance of the parallel computer system as a whole.
On the other hand, the bandwidth is a measure of the amount of data that can be transferred in one transfer. Naturally, it is desirable to transfer a large amount of data in one transfer. However, when the amount of data to be transferred in one transfer is simply increased, the number of bits to be transferred in one transfer increases. As a result, the number of transistors used for transferring and holding the data increases, to thereby increase the area of a semiconductor chip occupied by an LSI (Large Scale Integrated) circuit that forms the parallel computer. In addition, it takes more time to synchronize the data transfer when the number of bits to be transferred in one transfer increases, to thereby deteriorate the latency.
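The tradeoff between the latency and the bandwidth described above may be illustrated by the following minimal sketch. The linear cost model, the function names, and the numerical values are assumptions introduced solely for illustration and are not taken from any actual apparatus. The sketch assumes that the latency of one transfer grows in proportion to the number of bits transferred in one transfer.

```python
import math


def transfer_latency(width_bits, base_latency=10.0, sync_cost_per_bit=0.125):
    """Latency of one transfer, in arbitrary time units.

    Assumption: the synchronization cost grows linearly with the number of
    bits transferred in one transfer. Both parameters are hypothetical.
    """
    return base_latency + sync_cost_per_bit * width_bits


def total_transfer_time(message_bits, width_bits):
    """Time to send a whole message as a sequence of fixed-width transfers."""
    transfers = math.ceil(message_bits / width_bits)
    return transfers * transfer_latency(width_bits)


# Doubling the width halves the number of transfers, but each transfer
# suffers a longer latency under the assumed model.
narrow_latency = transfer_latency(64)
wide_latency = transfer_latency(128)
narrow_total = total_transfer_time(1024, 64)
wide_total = total_transfer_time(1024, 128)
```

Under these assumed values, the wider transfer shortens the total transfer time of a large message, while the latency of each individual transfer becomes longer, which corresponds to the tradeoff relationship described above.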
FIG. 1 is a block diagram for explaining an example of a conventional network system (or network architecture). FIG. 1 illustrates a network system 1 using a two-dimensional mesh topology, such as the two-dimensional mesh torus topology. As illustrated in FIG. 1, the network system 1 includes a crossbar switch 2, and crossbar interfaces (I/Fs) 3-1 through 3-4 that are connected to the crossbar switch 2. All data from each of the crossbar interfaces 3-1 through 3-4 is redistributed to the crossbar interfaces 3-1 through 3-4 via the crossbar switch 2.
A node 5 is connected to each of the crossbar interfaces 3-1 through 3-4. The node 5 is formed by a computing node, such as a processor, or by an I/O (Input and Output) node. At least one of the four nodes 5 in FIG. 1 is a computing node. The network system 1 and the four nodes 5 form an information processing apparatus. A parallel computer is formed when two or more of the nodes 5 are formed by computing nodes.
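The redistribution of data via the crossbar switch 2 may be modeled, for example, by the following minimal sketch. The class name, the method names, and the queue-based delivery are hypothetical and are introduced solely for illustration. The sketch assumes four crossbar interfaces numbered 1 through 4, corresponding to the crossbar interfaces 3-1 through 3-4, and does not model timing or contention.

```python
from collections import deque


class CrossbarSwitch:
    """Toy model of a crossbar switch connecting numbered interfaces."""

    def __init__(self, num_interfaces=4):
        # One receive queue per crossbar interface (1-based numbering).
        self.inboxes = {i: deque() for i in range(1, num_interfaces + 1)}

    def send(self, src, dst, payload):
        """Forward payload from interface src to interface dst."""
        if src not in self.inboxes or dst not in self.inboxes:
            raise ValueError("unknown crossbar interface")
        self.inboxes[dst].append((src, payload))

    def receive(self, dst):
        """Return the oldest pending (src, payload) for dst, or None."""
        if self.inboxes[dst]:
            return self.inboxes[dst].popleft()
        return None


# The node on interface 1 sends data to the node on interface 3.
switch = CrossbarSwitch()
switch.send(1, 3, "data")
```

In this sketch, every transfer between the nodes passes through the single switch object, in the same manner that all data from the crossbar interfaces 3-1 through 3-4 passes through the crossbar switch 2.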
As may be seen from FIG. 1, the latency and the bandwidth have a tradeoff relationship not only in computers, but also in network systems. It is difficult to improve both the latency and the bandwidth.
Therefore, the conventional computers and network systems suffer from a problem in that it is difficult to improve both the latency and the bandwidth simultaneously.
The applicant is aware of Japanese Laid-Open Patent Publications No. 11-212866, No. 2002-328838, and No. 10-215266.