The present invention relates to a method for preventing the deterioration of performance due to data communication between processors in a parallel computer system.
Among the methods of inter-processor data communication in conventional parallel computers, there is a method wherein data sent from other processors are held temporarily in a receive buffer and a receive processor fetches the data when they are needed. An example of an apparatus of this kind is shown in Japanese Patent Laid-Open No. 49464/1985. No problem arises when the receive buffer is constructed as a FIFO and there is only one send processor. In general, however, a plurality of send processors send data into the receive buffer. Therefore, each datum transferred to the receive buffer is held temporarily as a pair of an identification code (ID) and data, and the receive processor takes in the necessary data by checking the ID. Accordingly, the receive buffer is constructed of an associative memory in some cases.
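The ID-matched receive buffer described above can be pictured with the following minimal sketch. It is only an illustration under stated assumptions: the class name ReceiveBuffer and its methods store and fetch are hypothetical, a Python dictionary stands in for the associative memory, and at most one entry per ID is held at a time.

```python
class ReceiveBuffer:
    """Toy model of a receive buffer searched by ID (hypothetical names).

    A dict stands in for the associative memory; this sketch assumes at
    most one entry per ID is held at any time.
    """

    def __init__(self):
        self._entries = {}  # maps ID -> data

    def store(self, ident, data):
        """Called on behalf of a send processor: hold (ID, data) temporarily."""
        self._entries[ident] = data

    def fetch(self, ident):
        """Called by the receive processor: return and remove the data whose
        ID matches, or None if no such data has arrived yet."""
        return self._entries.pop(ident, None)


buf = ReceiveBuffer()
buf.store("B", 7)            # datum B arrives before datum A
missing = buf.fetch("A")     # A has not arrived, so nothing is returned
found = buf.fetch("B")       # B is found by its ID regardless of arrival order
```

Because lookup is by ID rather than by position, the receive processor can pick out a specific datum, but it learns nothing about the order in which the data arrived.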
According to the above-described prior art, the receive processor searches the receive buffer (associative memory) for an ID to take in data. Therefore, when a plurality of data are needed, they must be taken in one at a time, sequentially. Since such data are generally sent from a plurality of send processors, the sequence in which they arrive at the receive buffer (associative memory) cannot be known in advance. Consequently, in the prior art the receive processor is sometimes forced to wait for the arrival of data longer than is intrinsically necessary.
Let it be assumed, for instance, that the receive processor receives four data A, B, C and D, obtained as the results of computation by other processors, and searches them for the one showing the maximum value. If the processing procedure (program) on the receive processor is prepared so that the data are taken in from the receive buffer (associative memory) in the sequence A, B, C and D, the receive processor cannot proceed until datum A arrives at the receive buffer, even when B, C and D have already arrived. If the IDs of these data are identical, it is possible to take in A, B, C and D in the sequence of their arrival at the receive buffer (associative memory). In this case, however, it is impossible to distinguish among them.
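The waiting penalty in the example of A, B, C and D can be made concrete with a small timing model. This is a sketch under assumptions that do not come from the prior art itself: the arrival times are invented, each datum is assumed to cost one time unit to process once taken in, and the function name finish_time is hypothetical. The arrival-order figure is an idealized point of comparison only.

```python
def finish_time(order, arrival, cost=1):
    """Time at which the receive processor has processed every datum when it
    takes them in the given order. It waits if a datum has not yet arrived."""
    t = 0
    for ident in order:
        t = max(t, arrival[ident]) + cost  # wait for arrival, then process
    return t


# Invented arrival times: A happens to arrive last.
arrival = {"A": 9, "B": 2, "C": 3, "D": 5}

# Fixed program order A, B, C, D: nothing proceeds until A arrives.
fixed = finish_time(["A", "B", "C", "D"], arrival)

# Idealized order of arrival (B, C, D, A).
by_arrival = finish_time(sorted(arrival, key=arrival.get), arrival)

print(fixed, by_arrival)  # 13 10
```

With these assumed numbers, the fixed sequence finishes three time units later than the order of arrival, although exactly the same data are processed.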
Moreover, according to the above-stated prior art, one receive check instruction checks the data sent from one processor. Therefore, in a process such as that described above, wherein the arrival of all of a plurality of data sent from a plurality of processors must be checked, it is necessary in some cases to execute as many receive check instructions as there are send processors. Let it be assumed, for instance, that data arranged in 8 rows and 8 columns as shown in FIG. 10 are divided in the direction of columns as shown in FIG. 11, that two columns are assigned to each of four processors and computed in parallel in the direction of columns, and that it then becomes necessary to compute the data in parallel in the direction of rows. In this case, the data are allotted to the four processors as shown in FIG. 12. If, according to the above-stated prior art, one unit of data is sent by the execution of one send instruction, each processor must execute a total of 12 send instructions to the three other processors and at least 12 receive check instructions. One method of preparing a program that checks the arrival of data by using the receive check instruction is, as described above, to fix the sequence of checking, so that each subsequent datum is checked only after the arrival of the preceding datum has been confirmed. When the processors on the sending side differ in the progress of their processing, however, the sequence of checking does not always agree with the sequence of arrival, and in the worst case the datum to be checked first may be the last to arrive.
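The instruction counts in the 8-by-8 example above can be reproduced by direct enumeration. The sketch below assumes a contiguous assignment, in which processor p holds columns 2p and 2p+1 before the redistribution and rows 2p and 2p+1 after it. The actual layouts of FIGS. 11 and 12 are not reproduced here, and the names col_owner, row_owner and sends are hypothetical.

```python
N, P = 8, 4        # 8 x 8 data, four processors
BLOCK = N // P     # two columns (or two rows) per processor


def col_owner(c):
    """Processor holding column c under the column-wise division (assumed)."""
    return c // BLOCK


def row_owner(r):
    """Processor holding row r under the row-wise division (assumed)."""
    return r // BLOCK


# sends[src][dst] = number of data units src must send to dst
sends = [[0] * P for _ in range(P)]
for r in range(N):
    for c in range(N):
        src, dst = col_owner(c), row_owner(r)
        if src != dst:             # data already in place need no transfer
            sends[src][dst] += 1

sent_per_processor = [sum(row) for row in sends]
received_per_processor = [sum(sends[s][d] for s in range(P)) for d in range(P)]

print(sent_per_processor)       # [12, 12, 12, 12]
print(received_per_processor)   # [12, 12, 12, 12]
```

Each processor holds 16 units, of which 4 already lie in its own two rows; the remaining 12 go out as 4 units to each of the three other processors, which matches the 12 send instructions and 12 receive check instructions stated above.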
When the datum checked first is the last to arrive, data that have already arrived cannot be checked until that last datum has been checked. The time required for checking the arrival of all the data is therefore prolonged by the waiting time, and the scale of the hardware also increases, because the receive buffer in which received data are stored temporarily is required to hold all the data awaiting checking. If, to avoid these disadvantages, a program is prepared so that data are checked in the sequence of their arrival, still using the above-mentioned receive check instructions, the program itself must keep track of which of the plurality of arriving data have been checked and which have not. Although this method allows the arrival of data to be checked in a desirable sequence, the processing of the checking instructions becomes complicated, which makes it hard to check reception at high speed.
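The program-managed bookkeeping described above might look roughly like the following sketch, which is a simulation rather than actual machine code. The function name poll_until_all_received, the invented arrival times, and the discrete time steps are all assumptions made for illustration; each test of one ID models one receive check instruction.

```python
def poll_until_all_received(arrival, ids):
    """Check data in order of arrival by polling, with the program itself
    tracking which IDs are still unchecked (hypothetical model).

    Returns the order in which data were accepted and the number of
    receive check instructions executed.
    """
    pending = set(ids)   # bookkeeping kept by the program, not the hardware
    order = []
    checks = 0
    t = 0                # simulated time step
    while pending:
        for ident in sorted(pending):      # one polling pass over unchecked IDs
            checks += 1                    # one receive check instruction
            if arrival[ident] <= t:        # datum is present in the buffer
                order.append(ident)
                pending.discard(ident)
        t += 1
    return order, checks


# Invented arrival times: A arrives last.
arrival_times = {"A": 9, "B": 2, "C": 3, "D": 5}
order, checks = poll_until_all_received(arrival_times, ["A", "B", "C", "D"])
print(order, checks)  # ['B', 'C', 'D', 'A'] 23
```

In this model the data are accepted in their order of arrival, but 23 check instructions are executed for only four data, and the program carries the extra state in pending. This illustrates why such checking is hard to perform at high speed.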
The problems caused by the two programming methods described above become increasingly conspicuous as the number of processors in a parallel computer increases and as the speed of inter-processor data transfer is raised to improve the performance of the parallel computer. As a result, the merits of the parallel computer cannot be fully exploited.