1. Field of the Invention
The present invention relates to a parallel computer constructed of a plurality of processor elements.
2. Description of the Related Art
As one prior art parallel computer constructed of a plurality of processor elements, a first type of parallel computer is known in which each of the processor elements includes a local memory for storing a program to be executed therein and also data, and in which the respective processor elements can access the local memories of other processor elements, if required.
In such a sort of parallel computer, when one processor element transfers data to the local memory of another processor element, the data transferring processor element first writes the data into the local memory of the data receiving processor element and then interrupts the data receiving processor element, in order to assure that the data is referred to only after it has been written.
Such a prior art parallel computer is described in, for instance, "IEEE, PROCEEDINGS OF THE 1985 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING", pages 782 to 788.
On the other hand, as a second type of the conventional parallel computer, there has been known a parallel computer in which a plurality of processor elements are coupled with a common memory. In such a conventional parallel computer, a tag representing whether the data is valid (i.e., has been written) or invalid (i.e., has not yet been written) is attached to each word of the common memory. The data communication is performed between the relevant processor elements by utilizing this tag. In other words, when data is transferred from one processor element to another processor element, the processor element of the data transfer side writes the data into the common memory and then changes the tag for this written data into the condition "the data is valid".
The processor element of the data receiving side checks whether or not the tag for the memory position to be read indicates that "the data is valid", in order to judge whether or not the data to be read is present in the common memory. If the checked tag indicates that "the data is valid", this data is read out and thereafter the above-described tag is changed at the proper timing into "the invalid data condition". As a result, the processor element of the data transfer side can send the data to the processor element of the data reception side without interrupting the latter. Such a sort of parallel computer is described in, for instance, "REAL-TIME SIGNAL PROCESSING IV, VOL 298" August 1981, pages 241 to 248.
In the above-described first type of the conventional parallel computers, since the data calculation is carried out by the respective processor elements by utilizing their own local memories, there is very little restriction on the arrangement of the parallel computers when the number of the processor elements is increased. It is therefore relatively easy to increase the number of the processor elements in constructing such a sort of parallel computer.
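The tag-based exchange described for the second type of parallel computer may be sketched as follows. This is only an illustrative, single-threaded model and not the implementation of the cited reference. The class name `TaggedCommonMemory` and the methods `write` and `try_read` are hypothetical, and the sketch assumes that the tag is returned to the invalid condition immediately after the read, whereas the text only says this happens "at the proper timing".

```python
from typing import Any, List, Tuple


class TaggedCommonMemory:
    """Common memory in which every word carries a valid/invalid tag."""

    def __init__(self, size: int) -> None:
        self.words: List[Any] = [None] * size
        self.valid: List[bool] = [False] * size  # all words start as "not yet written"

    def write(self, addr: int, value: Any) -> None:
        """Data transfer side: write the data first, then mark the word valid."""
        self.words[addr] = value
        self.valid[addr] = True

    def try_read(self, addr: int) -> Tuple[bool, Any]:
        """Data receiving side: check the tag and read only if the data is valid.

        Assumption: the tag is cleared right after the read.
        """
        if not self.valid[addr]:
            return False, None  # data not present yet; the receiver retries later
        value = self.words[addr]
        self.valid[addr] = False
        return True, value


mem = TaggedCommonMemory(4)
first_attempt = mem.try_read(2)   # nothing has been written to word 2 yet
mem.write(2, 99)                  # sender writes without interrupting the receiver
second_attempt = mem.try_read(2)  # receiver now finds valid data
```

In this model the receiver polls the tag instead of being interrupted, which is the point of the scheme; the contention for the single common memory discussed later is not modelled.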
However, when the data is transferred from one processor element to the other processor element, the above-described interruption process operation is required at the data receiving processor element.
Such an overhead operation may considerably lower the overall performance of the parallel computers.
In the second type of the conventional parallel computers, on the other hand, there are the following drawbacks. That is, although the above-described overhead operation such as the interruption process operation is not required, the common memory is accessed by all of the processor elements, so that the access time is greatly delayed because the memory accessing operations of all of the processor elements compete with each other. Due to this delay in the access time, it is difficult to employ a large number of processor elements in such a parallel computer. As a consequence, a parallel computer of the second type having high-speed performance can hardly be realized.
As previously described, there are problems in the above-described first and second types of the conventional parallel computers. To solve these conventional problems, a third type of a parallel computer has been proposed by the Applicants, which is disclosed in the copending Japanese patent applications Nos. 61-182361 (filed on Aug. 1, 1986), and 62056507 (filed on Mar. 13, 1987), or the corresponding U.S. patent application Ser. No. 78656 (filed on Jul. 28, 1987) now abandoned, or the corresponding EPC patent application No. 87111124.1 (filed on Jul. 31, 1987).
In the third type of the parallel computer, the common memory causing the drawback of the second type of the parallel computer is not employed, and local memories the same as those of the first type of the parallel computer are employed for the respective processor elements. However, data reception buffers separate from the local memories are additionally required for the respective processor elements in order to avoid the interruption process operation during the data communication.
In this third type of the parallel computer, when the data is transferred, the processor element of the data transfer side transfers both the data and an identifier for identifying the data together to the processor element of the data reception side. The transferred data is stored in the reception buffer of the processor element of the data reception side. When this data is required to be accessed in the processor element of the data reception side, the reception buffer is associatively searched based upon the identifier which has been previously determined for the data in question, and the data in question is read out therefrom if it is present therein. As a result, the above-described interruption process operation, which is the major drawback of the first type of the conventional parallel computer, is no longer required.
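The identifier-based reception described above may be sketched as follows. This is a minimal software model under stated assumptions and not the circuitry of the copending applications. The class name `ReceptionBuffer` and the methods `receive` and `retrieve` are hypothetical, a Python dictionary stands in for the associative memory, and the sketch assumes that an entry is removed from the buffer once it has been read out.

```python
from typing import Any, Dict, Hashable, Tuple


class ReceptionBuffer:
    """Reception buffer searched by data identifier (stand-in for an associative memory)."""

    def __init__(self) -> None:
        self._entries: Dict[Hashable, Any] = {}

    def receive(self, identifier: Hashable, data: Any) -> None:
        """Called on arrival: store the data together with its identifier.

        The receiving processor is not interrupted; the data simply waits here.
        """
        self._entries[identifier] = data

    def retrieve(self, identifier: Hashable) -> Tuple[bool, Any]:
        """Called by the receiving processor when it needs the data.

        Assumption: a successful read removes the entry from the buffer.
        """
        if identifier in self._entries:
            return True, self._entries.pop(identifier)
        return False, None  # not yet received; the processor may retry later


buf = ReceptionBuffer()
buf.receive("A", 3.5)            # the sender transfers data with identifier "A"
missing = buf.retrieve("B")      # "B" has not arrived yet
found = buf.retrieve("A")        # "A" is present and is read out
```

The receiving program decides when to look for the data, so no interruption process is needed, but it must know in advance the identifier of each piece of data it expects.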
To further improve the performance of the third type of the parallel computer, the Applicants have filed another Japanese patent application No. 62-12359 on Jan. 23, 1987 and the corresponding U.S. patent application Ser. No. 145614 on Jan. 19, 1988, now abandoned, relating to a fourth type of a parallel computer.
This fourth type of the parallel computer addresses the following drawback of the third type. In the above-described third type of the parallel computer, when plural pieces of data are required, the reception processor must be programmed so that each piece of data is read out from the reception associative memory in a predetermined sequence. In general, however, since the plural pieces of data are transferred from a plurality of transmission processors, the order in which these pieces of data arrive at the reception associative memory is not determined. Suppose, for instance, that four pieces of data, i.e., A, B, C, and D, are received as results of calculations by other processors, and that the reception processor retrieves the maximum value among these four pieces of reception data. If the maximum value retrieval program of the reception processor is so designed that the data of A, B, C, and D are successively read out from the reception associative memory in this order, and each newly read data is compared with the previously read data so as to find the maximum value, the reception processor cannot proceed with the maximum value retrieval program until the data of A has been received, even if the data of B, C, or D has been received by the reception buffer prior to the reception of the data of A. The fourth type of the parallel computer has been proposed to solve this drawback.
In accordance with the above-described fourth type of the parallel computer, as the identifiers for one group of data to be processed as a whole, both a main identifier commonly used for the data group and sub-identifiers specific to the respective data are determined. When the data is transferred from one processor element to another processor element, these identifiers are attached to the data, and in the reception processor element, the main identifier is designated and data having this designated main identifier is read out from the reception associative memory. As a consequence, when any one of the data belonging to the same data group has been received, the maximum value retrieval process can be performed on that data without waiting for the reception of the other data.
In the above-described third type of the parallel computer, the associative memory is employed as the reception buffer. Since the associative memory has a specific construction, it is generally difficult to obtain a large memory capacity with this associative memory, and the cost of such an associative memory is high. Consequently, there is a problem in cost if a reception buffer having a large memory capacity is constructed of the associative memory. In addition, when the number of the processor elements is increased, the total cost required for employing the associative memories increases accordingly.
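The retrieval by main identifier and the maximum value example may be sketched as follows. This is an illustrative model only, not the circuit of the cited applications. The names `GroupedReceptionBuffer`, `retrieve_any` and `running_max` are hypothetical, and the sketch assumes that entries are returned in arrival order and removed once read.

```python
from typing import Any, List, Optional, Tuple


class GroupedReceptionBuffer:
    """Reception buffer whose entries carry a main identifier and a sub-identifier."""

    def __init__(self) -> None:
        # Entries are kept in arrival order as (main_id, sub_id, data).
        self._entries: List[Tuple[Any, Any, Any]] = []

    def receive(self, main_id: Any, sub_id: Any, data: Any) -> None:
        self._entries.append((main_id, sub_id, data))

    def retrieve_any(self, main_id: Any) -> Optional[Tuple[Any, Any]]:
        """Read out any one entry of the designated group, whichever arrived first.

        Assumption: the entry is removed from the buffer once read.
        """
        for i, (m, sub_id, data) in enumerate(self._entries):
            if m == main_id:
                del self._entries[i]
                return sub_id, data
        return None  # no data of this group is present yet


def running_max(buf: GroupedReceptionBuffer, main_id: Any, current: Any = None) -> Any:
    """Fold every already-received member of the group into the maximum so far."""
    while (item := buf.retrieve_any(main_id)) is not None:
        _, value = item
        current = value if current is None else max(current, value)
    return current


buf = GroupedReceptionBuffer()
buf.receive("grp", "C", 7)        # C and B arrive before A
buf.receive("grp", "B", 9)
partial = running_max(buf, "grp")           # proceeds without waiting for A
buf.receive("grp", "A", 4)
buf.receive("grp", "D", 12)
final = running_max(buf, "grp", partial)    # remaining data folded in later
```

Because the program designates only the main identifier, it never stalls waiting for a particular member such as A; the sub-identifier is still returned so that a program which needs to know which member it read can do so.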