Signal processing handles a large amount of continuous data (stream data) as operation target data. In many cases, the signal processing repeatedly executes the same processing (an operation realized using a plurality of commands) for the large amount of data.
An array processor is one processor architecture for efficiently processing a large amount of data.
FIG. 7 is a block diagram illustrating an example of a configuration of a general array processor 19.
The array processor 19 illustrated in FIG. 7 includes an array operation unit 14 in which a plurality of operation processing units 11a to 11d (hereinafter, collectively referred to as an operation processing unit 11) and a plurality of operation processing units 12a to 12d (hereinafter, collectively referred to as an operation processing unit 12) are disposed in an arrayed manner (in the following description, the respective operation processing units, that is, the operation processing unit 11 and the operation processing unit 12, included in the array operation unit 14, will be collectively referred to as an operation processing unit 15).
Further, the array processor 19 includes a data memory 17 including memory banks 10a to 10d (hereinafter, collectively referred to as a memory bank 10) and memory banks 13a to 13d (hereinafter, collectively referred to as a memory bank 13) connected with the array operation unit 14 (in the following description, the memory bank 10 and the memory bank 13 will be collectively referred to as a multi-bank 16).
The operation processing unit 15 disposed in an arrayed manner is connected with a neighboring operation processing unit 15. Between the operation processing units 15, wiring is connected in a meshed manner. Each connection is controlled by a switch disposed on an input stage of the operation processing unit 15. Operation target data is stored in the memory bank 10 and the memory bank 13 of the multi-bank 16 connected with the array operation unit 14.
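The meshed connection described above can be sketched as follows. This is only an illustrative model, not the configuration of FIG. 7 itself: it assumes that the operation processing units 11a to 11d form one row and the operation processing units 12a to 12d form the adjacent row, and the names `ROWS`, `COLS`, `neighbors`, and `select_input` are hypothetical.

```python
# Assumed layout: units 11a-11d in one row, units 12a-12d in the next row.
ROWS = ["11", "12"]
COLS = ["a", "b", "c", "d"]


def neighbors(name):
    """Return the mesh neighbors (up, down, left, right) of a unit such as '12b'."""
    r, c = ROWS.index(name[:2]), COLS.index(name[2])
    result = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(ROWS) and 0 <= nc < len(COLS):
            result.append(ROWS[nr] + COLS[nc])
    return result


def select_input(name, source, values):
    """Model the input-stage switch: accept a value only from a wired neighbor."""
    if source not in neighbors(name):
        raise ValueError(f"{source} is not wired to {name}")
    return values[source]


print(neighbors("12b"))                           # ['11b', '12a', '12c']
print(select_input("12b", "11b", {"11b": 5}))     # 5
```

Under this assumed layout, the operation processing unit 12b can receive data only from the operation processing units 11b, 12a, and 12c, and the switch selects which of those connections is active.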
The parallelism of the array operation unit 14 and its degree of flexibility in operation processing depend on the internal connection configuration. In the array processor 19 illustrated in FIG. 7, each operation processing unit 15 is connected with its neighboring operation processing units 15.
The array processor 19 executes predetermined processing by causing a plurality of operation processing units 15 to cooperate. Therefore, when input data reaches an operation processing unit 15 at different timings, it is difficult in some cases for the array processor 19 to execute an operation appropriately.
For example, consider the operation processing unit 12b in FIG. 7. It is assumed that an operation of the operation processing unit 12b needs operation results of the operation processing unit 11b and the operation processing unit 12a. In this case, when the execution timings of the operations of the operation processing unit 11b and the operation processing unit 12a are different, the timings at which the respective operation results are input to the operation processing unit 12b differ accordingly. In the same manner, when the delays of the operation outputs of the operation processing unit 11b and the operation processing unit 12a are different, the input timings of the respective operation results to the operation processing unit 12b differ by the delay difference. The time equivalent to this timing difference is wasted.
Operating as many operation processing units 15 as possible concurrently during operation processing is a key to enhancing operation efficiency of the array processor 19. Therefore, how synchronization control between the operation processing units 15 is configured is important for enhancing operation efficiency of the array processor 19.
For this reason, a synchronization mechanism for data is used (refer to, for example, PTL 1).
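The wasted time can be illustrated with a small calculation. The cycle numbers below are hypothetical and the function name `wasted_cycles` is an assumption introduced only for this sketch; the point is that a unit needing all of its operands can start no earlier than the latest arrival.

```python
def wasted_cycles(arrival_cycles):
    """Return how long the earliest operand waits for the latest one.

    arrival_cycles: the cycle at which each operand reaches the consumer.
    The consumer needs every operand, so it starts at the latest arrival.
    """
    return max(arrival_cycles) - min(arrival_cycles)


# Hypothetical arrival cycles at unit 12b for the results of 11b and 12a.
arrivals = {"11b": 3, "12a": 7}
start_cycle = max(arrivals.values())          # earliest cycle at which 12b can start
idle = wasted_cycles(list(arrivals.values()))
print(start_cycle, idle)                      # 7 4
```

With these assumed numbers, the result of the operation processing unit 11b waits four cycles before the operation processing unit 12b can use it.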
FIG. 8 is a block diagram illustrating an example of a configuration of an array processor 29 using a FIFO (First In First Out) memory as the synchronization mechanism.
As illustrated in FIG. 8, the array processor 29 includes operation processing units 21a to 21d (hereinafter, collectively referred to as an operation processing unit 21) and operation processing units 22a to 22d (hereinafter, collectively referred to as an operation processing unit 22). Further, the array processor 29 includes memory banks 20a to 20d (hereinafter, collectively referred to as a memory bank 20) and memory banks 23a to 23d (hereinafter, collectively referred to as a memory bank 23).
Furthermore, the array processor 29 connects inputs/outputs of the operation processing unit 21, the operation processing unit 22, the memory bank 20, and the memory bank 23 via FIFOs 24a to 24g, FIFOs 25a to 25g, and FIFOs 26a to 26d. Hereinafter, the FIFOs 24a to 24g will be collectively referred to as a FIFO 24. In the same manner, the FIFOs 25a to 25g will be collectively referred to as a FIFO 25, and the FIFOs 26a to 26d will be collectively referred to as a FIFO 26.
The array processor 29 includes synchronization control units 27a to 27d (hereinafter, collectively referred to as a synchronization control unit 27) and synchronization control units 28a to 28d (hereinafter, collectively referred to as a synchronization control unit 28), for synchronizing input/output data. The synchronization control unit 27 and the synchronization control unit 28 control data reaching the memory bank 20, the memory bank 23, the operation processing unit 21, and the operation processing unit 22, by using the FIFO 24, the FIFO 25, and the FIFO 26. By using the synchronization control unit 27 and the synchronization control unit 28, the operation processing unit 21 and the operation processing unit 22 enable synchronization control even when the arrival timing of input data differs from one input port to another.
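The role of the FIFOs in absorbing differences in arrival timing can be sketched as follows. This is a simplified software model under stated assumptions, not the circuit of FIG. 8 or of PTL 1: the class `SyncedUnit` and its methods are hypothetical, each input port is given one unbounded FIFO, and the synchronization control is reduced to the rule that the unit operates only when every input FIFO holds data.

```python
from collections import deque


class SyncedUnit:
    """Hypothetical model of an operation processing unit with one FIFO per input port."""

    def __init__(self, op, num_ports=2):
        self.op = op
        self.fifos = [deque() for _ in range(num_ports)]

    def push(self, port, value):
        """Data arriving at a port is buffered in that port's FIFO."""
        self.fifos[port].append(value)

    def step(self):
        """Operate once if every input FIFO holds data; otherwise return None."""
        if all(len(f) > 0 for f in self.fifos):
            operands = [f.popleft() for f in self.fifos]
            return self.op(*operands)
        return None


unit = SyncedUnit(lambda a, b: a + b)
unit.push(0, 1)
unit.push(0, 2)          # port 0 runs ahead of port 1
print(unit.step())       # None: port 1 has no data yet
unit.push(1, 10)
print(unit.step())       # 11: first entries of both FIFOs are paired
unit.push(1, 20)
print(unit.step())       # 22: buffered value 2 is paired with the late value 20
```

In this model, data that arrives early is held in order, so operands are paired correctly even though the two ports receive data at different times.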
Further, as another solution, there is a technique that introduces an asynchronization control unit for connecting an operation processing unit with a neighboring operation processing unit (refer to, for example, PTL 2). In the technique described in PTL 2, when input data for predetermined operation processing is insufficient, each operation processing unit waits before executing the operation.