(1) Field of the Invention
This invention relates to an apparatus for synchronizing parallel processing among a plurality of processors.
(2) Description of the Related Art
Parallel processing has been developed to increase the processing speed attainable with a plurality of processors. A general method, applied in various fields to achieve parallel processing, is to have several processors cooperate and proceed with their processing in synchronization.
For example, in the field of image processing, each processor is assigned part of a frame constituting an image and is supposed to complete the processing for the assigned part within a display time corresponding to one frame. Hence, parallel processing among all the processors must be synchronized to perform the processing per frame.
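As an illustration of the per-frame assignment described above, the following minimal sketch divides the rows of a frame among the processors. Division into horizontal bands is an assumption made only for this example, since the text says no more than "part of a frame"; the function name `split_rows` and the frame height of 480 rows are likewise hypothetical.

```python
def split_rows(frame_height, n_processors):
    """Return one contiguous band of rows per processor (assumed partitioning)."""
    base, extra = divmod(frame_height, n_processors)
    bands = []
    start = 0
    for p in range(n_processors):
        # The first `extra` processors take one additional row each.
        rows = base + (1 if p < extra else 0)
        bands.append(range(start, start + rows))
        start += rows
    return bands


# Hypothetical frame of 480 rows shared by 4 processors.
bands = split_rows(480, 4)
```

Every processor must finish its band before the frame can be displayed, which is why the processors have to be synchronized once per frame.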
Generally, barrier synchronization or event synchronization is used as a method for such synchronization.
In barrier synchronization, every processor that has completed its processing enters the wait state. When the number of waiting processors reaches a predetermined number, all the waiting processors restart their processing at once.
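The behavior of barrier synchronization can be sketched in software as follows. This is only an illustration using the `threading.Barrier` class of the Python standard library, with threads standing in for processors; the processor count `N_PROCESSORS` and the names `processor` and `log` are assumptions introduced for the example.

```python
import threading

N_PROCESSORS = 4                      # hypothetical number of processors
barrier = threading.Barrier(N_PROCESSORS)
log = []
log_lock = threading.Lock()


def processor(pid):
    with log_lock:
        log.append(("done", pid))     # this processor has completed its processing
    barrier.wait()                    # wait state until all N_PROCESSORS have arrived
    with log_lock:
        log.append(("restart", pid))  # all processors restart together


threads = [threading.Thread(target=processor, args=(i,)) for i in range(N_PROCESSORS)]
for t in threads:
    t.start()
for t in threads:
    t.join()

phases = [phase for phase, _ in log]
```

No thread passes `barrier.wait()` until every thread has reached it, so every "done" entry in `log` precedes every "restart" entry.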
In event synchronization, on the other hand, one of the processors is selected as an event-generation processor. When the event-generation processor informs the remaining processors that an event has been generated, these processors restart their processing at once.
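Event synchronization can be sketched in the same spirit. The example below is an illustration only, using `threading.Event` from the Python standard library with threads standing in for processors; the count `N_WAITING` and the function names are assumptions introduced for the example.

```python
import threading

N_WAITING = 3                         # hypothetical number of waiting processors
event = threading.Event()
log = []
log_lock = threading.Lock()


def waiting_processor(pid):
    event.wait()                      # wait state until the event is generated
    with log_lock:
        log.append(("restart", pid))


def event_generation_processor():
    with log_lock:
        log.append(("event", None))   # record that the event is being generated
    event.set()                       # inform all remaining processors at once


waiters = [threading.Thread(target=waiting_processor, args=(i,)) for i in range(N_WAITING)]
for t in waiters:
    t.start()
generator = threading.Thread(target=event_generation_processor)
generator.start()
for t in waiters + [generator]:
    t.join()
```

A `threading.Event` stays set until `event.clear()` is called, so repeated synchronization with this sketch would require clearing the event before each new round.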
FIG. 1 shows a conventional barrier synchronization apparatus. Synchronization of parallel processing among a plurality of processors (or programs) is controlled by software (hereinafter referred to as sync software).
The barrier control block 100 and the queue 110, which are actually installed in a memory or a register of the apparatus, represent data controlled by the sync software and the processors (or programs).
The barrier control block 100 holds the number of the processors and a pointer indicating the top of the queue 110.
The queue 110 consists of pieces of processor information 101-103, each representing a processor placed in the wait state. Each piece of processor information 101-103 has the identification of the respective processor and a pointer indicating the adjacent piece of processor information.
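The data layout of the barrier control block 100 and the queue 110 might be sketched as below. The class and field names are hypothetical, and two details are assumptions not stated in the text: each entry is given a single pointer to the next entry, and a new entry is linked in at the top of the queue.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ProcessorInfo:
    """One queue entry: a processor placed in the wait state."""
    processor_id: int
    next: Optional["ProcessorInfo"] = None   # pointer to the adjacent entry


@dataclass
class BarrierControlBlock:
    """Holds the number of processors and the pointer to the top of the queue."""
    num_processors: int
    head: Optional[ProcessorInfo] = None


def enqueue(block, processor_id):
    # Assumption: a new entry becomes the new top of the queue.
    block.head = ProcessorInfo(processor_id, block.head)


def queue_length(block):
    # Walk the pointers to count the waiting processors.
    count = 0
    node = block.head
    while node is not None:
        count += 1
        node = node.next
    return count


block = BarrierControlBlock(num_processors=4)
for pid in (0, 1, 2):
    enqueue(block, pid)
```

Counting the waiting processors means walking the whole chain of pointers, which hints at why checking the queue in software takes time.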
The following is a series of operations to be performed by the conventional barrier synchronization apparatus shown in FIG. 1.
First of all, the sync software informs the barrier control block 100 of the number of processors whose processing is to be synchronized, while clearing both the waiting processors from the queue 110 and the pointer indicating the top of the queue 110. At this moment, all the processors are released from the wait state.
Every processor that has completed its processing refers to the number (q) of pieces of processor information in the queue 110 and the number (n) of processors in operation. If q is less than n-1, the queue 110 still has room, and the processor produces its own processor information therein. When q equals n-1, the queue 110 is full, and this last processor informs the sync software of the completion of synchronization.
Being informed of the completion, the sync software executes the above-mentioned initializing operation, and allows all the processors to restart their processing. Thus, synchronous parallel processing among a plurality of processors is repeated.
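The sequence of operations above can be traced with a small single-threaded simulation. This is a sketch only: the class name `SoftwareBarrier` and its methods are hypothetical, an ordinary Python list stands in for the queue 110, and releasing the processors is modeled simply by returning their identifications.

```python
class SoftwareBarrier:
    """Simulates the software-controlled barrier (names are hypothetical)."""

    def __init__(self, n):
        self.n = n           # number of processors to be synchronized
        self.queue = []      # stands in for the queue 110
        self.rounds = 0      # completed synchronizations

    def initialize(self):
        # Sync software: clear the queue, releasing every waiting processor.
        released = self.queue
        self.queue = []
        return released

    def arrive(self, pid):
        # Called by a processor that has completed its processing.
        q = len(self.queue)
        if q < self.n - 1:
            self.queue.append(pid)   # produce own processor information, then wait
            return None
        # q == n - 1: this is the last processor, so synchronization is complete.
        self.rounds += 1
        return self.initialize() + [pid]


barrier = SoftwareBarrier(3)
results = [barrier.arrive(pid) for pid in (0, 1, 2)]
```

In this sketch the first two calls return `None`, meaning the caller waits, and the third call returns the identifications of all three processors, which may then restart. The queue is empty again afterwards, so the same object can serve the next synchronization.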
The above-mentioned construction and method for software-controlled barrier synchronization can be applied to software-controlled event synchronization as well.
FIG. 2 shows the construction of another conventional barrier synchronization apparatus, which has a logical OR unit 152, a control unit (not shown) serving as sync software, and a plurality of processor nodes 154, each composed of a processor unit 153 and a sync circuit 151. Each sync circuit 151 is set to "one" as its initial value, and this value is changed to "zero" when the respective processor unit 153 enters the wait state.
The logical OR unit 152 ORs the outputs of all the sync circuits 151 and sends the value to all the processor units 153.
The following is a series of operations to be performed by the above-mentioned barrier synchronization apparatus.
(1) Each processor unit 153 writes a "one" as the initial value to the respective sync circuit 151.
(2) The control unit checks whether all the sync circuits 151 have been set to "one". If they have, the control unit informs all the processor units 153 of the completion of the initialization. Each processor unit 153 then restarts its processing.
(3) Every processor unit 153 that has entered the wait state writes a "zero" to the respective sync circuit 151, and waits for the logical OR unit 152 to output a "zero".
(4) If the logical OR unit 152 outputs a "zero", this means that parallel processing among all the processor units 153 is synchronized, so that each processor unit 153 restarts its processing.
Thus, the operations (1) and (2) for initialization are controlled by the control unit while the operations (3) and (4) for synchronization are carried out by each processor unit 153. These operations (1) through (4) are repeated for every synchronization.
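The operations (1) through (4) can be modeled with a short simulation. This is a software sketch of the logic only, not of the circuitry; the class name `OrBarrier` and its method names are hypothetical, and each sync circuit 151 is modeled as one entry in a list of bits.

```python
class OrBarrier:
    """Models the sync circuits 151 and the logical OR unit 152 (hypothetical names)."""

    def __init__(self, n_units):
        self.sync = [0] * n_units          # one bit per sync circuit

    def write_one(self, unit):
        self.sync[unit] = 1                # operation (1): processor unit writes "one"

    def initialization_complete(self):
        # Operation (2): the control unit checks that every circuit holds "one".
        return all(bit == 1 for bit in self.sync)

    def enter_wait(self, unit):
        self.sync[unit] = 0                # operation (3): waiting unit writes "zero"

    def or_output(self):
        # Logical OR of all sync circuit outputs, sent to every processor unit.
        return int(any(self.sync))


barrier = OrBarrier(3)
for unit in range(3):
    barrier.write_one(unit)
initialized = barrier.initialization_complete()

outputs = []
for unit in range(3):
    barrier.enter_wait(unit)
    outputs.append(barrier.or_output())    # operation (4): "zero" means synchronized
```

The OR output stays at "one" while any processor unit is still processing and drops to "zero" only when the last unit has entered the wait state, so no unit has to count the waiting units itself. The initialization in operations (1) and (2) must still be repeated before each synchronization.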
However, the first-mentioned barrier synchronization apparatus has the drawback that the software provided for each processor requires a long time to check the number of waiting processors and to produce the processor information for these processors.
On the other hand, in the second-mentioned barrier synchronization apparatus, the provision of the logical OR unit 152 frees each processor node from checking the number of waiting processor nodes. However, the apparatus still requires time for the initializing operations (1) and (2), which are controlled by software.