The present invention relates to a computer for processing vector data.
In a conventional parallel computer constituted by a plurality of vector processors that share a main storage, a semaphore register shared by the vector processors in the configuration is provided. When vector data written onto the main storage by a given vector processor in the configuration are to be read out by another vector processor, the semaphore register is used, and the sequence of main storage references is ensured by effecting exclusive control over the whole region of the main storage where the vector data are stored. Apparatuses of this kind have been disclosed in U.S. Pat. No. 4,636,942 and S. Fernbach, "Supercomputers Class IV Systems, Hardware and Software", Elsevier Science Publishers B.V., North-Holland, 1986, pp. 69-81.
FIG. 5 illustrates how the above-mentioned prior technology is used. Here, a VST instruction stores vector data in the main storage; a POST instruction finishes its execution after the main storage references of all preceding instructions have finished; a WAIT instruction finishes its execution after the execution of the POST instruction has finished; and a VLD instruction loads vector data from the main storage. FIG. 5 is a time chart illustrating the operation in which two vector processors hand vector data over via the main storage, wherein the instruction sequence executed by a vector processor 1 (hereinafter referred to as VP1) is given by
VST
POST
and an instruction sequence executed by a vector processor 2 (hereinafter referred to as VP2) is given by
WAIT
VLD
It is now presumed that the main storage region used by the VST instruction executed by VP1 for storing the vector data is the same as the main storage region used by the VLD instruction executed by VP2 for loading the vector data, and that the arrangements of the elements of the vector data in this region are in agreement with each other. When the above-mentioned prior technology is used as shown in FIG. 5, execution of the VLD instruction is started by VP2 only after the execution of the VST instruction by VP1 has completely finished.
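The hand-over described above can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the disclosed hardware: the two vector processors are modelled as threads, the shared semaphore register as a `threading.Event`, and the names `run_handoff`, `vp1` and `vp2` are hypothetical.

```python
import threading


def run_handoff(vector):
    """Model VP1 (VST, POST) handing vector data to VP2 (WAIT, VLD)."""
    main_storage = [None] * len(vector)  # region shared by VST and VLD
    semaphore = threading.Event()        # stands in for the semaphore register
    loaded = []

    def vp1():
        # VST: store every element of the vector in the main storage.
        for i, value in enumerate(vector):
            main_storage[i] = value
        # POST: reached only after all preceding stores have finished.
        semaphore.set()

    def vp2():
        # WAIT: blocks until VP1 has executed POST.
        semaphore.wait()
        # VLD: load the whole vector from the same region.
        loaded.extend(main_storage)

    threads = [threading.Thread(target=vp2), threading.Thread(target=vp1)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return loaded


if __name__ == "__main__":
    print(run_handoff([10, 20, 30, 40]))
```

In this model the single event guards the whole region, so VP2 cannot observe any element until every element has been stored.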
If attention is given to the individual elements of the vector data, on the other hand, the load of the zero-th element from the main storage executed by the VLD instruction of VP2 need not wait for the execution of the VST instruction of VP1 to finish entirely, but need only wait for the completion of the store of the zero-th element by the VST instruction of VP1. This also holds true for the elements other than the zero-th element.
According to the above-mentioned prior art, however, no attention has been given to this fact, and the start of execution of the VLD instruction by VP2 is delayed by roughly the time required for executing the VST instruction. The delay increases with the length of the vector data to be handled.
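The growth of this delay with vector length can be illustrated by a simple timing model. The model is an assumption made for illustration only: it supposes that the VST instruction stores one element every `cycles_per_element` cycles starting at cycle 0, and it ignores instruction issue and semaphore overhead. The function names are hypothetical.

```python
def vld_start_whole_region(vector_length, cycles_per_element=1):
    """Earliest cycle at which VLD may start when WAIT covers the whole VST."""
    return vector_length * cycles_per_element


def vld_start_per_element(cycles_per_element=1):
    """Earliest cycle at which element 0 could be loaded if only the
    store of element 0 had to be complete."""
    return cycles_per_element


def extra_delay(vector_length, cycles_per_element=1):
    """Delay of the VLD start caused by whole-region exclusive control."""
    return (vld_start_whole_region(vector_length, cycles_per_element)
            - vld_start_per_element(cycles_per_element))


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, extra_delay(n))
```

Under these assumptions the delay is (n - 1) times the per-element store time for a vector of n elements, so it grows linearly with the vector length.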