1. Field of the Invention
The present invention relates to a parallel processing circuit. More specifically, the present invention relates to a data processing apparatus that can realize an efficient transfer of data between processing modules. The present invention further relates to a data processing method and a relevant software program.
2. Description of the Related Art
There is a conventional method applicable to a plurality of processing modules linearly connected via a ring bus. The method includes allocating a part of pipeline processing to each processing module and causing respective processing modules to perform parallel processing, thereby speedily accomplishing the pipeline processing.
According to a conventional method discussed in Japanese Patent No. 2522952, a ring bus is provided with a first-in first-out (FIFO) memory whose number of stages can be changed. If a conflict occurs between data to be transmitted by a processing module and data being transferred via the ring bus, a vacant slot is generated by increasing the number of stages of the FIFO memory, thereby preventing the deterioration in system performance that the conflict might otherwise cause.
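For illustration only, the variable-stage FIFO idea described above may be sketched as follows. This is a simplified software model and not the circuit of the cited patent; the class name ElasticRingNode, the parameter max_stages, and the one-packet-per-cycle timing are all assumptions introduced for this example.

```python
from collections import deque


class ElasticRingNode:
    """Hypothetical model of one ring-bus node with a variable-depth FIFO."""

    def __init__(self, max_stages=4):
        self.fifo = deque()           # ring packets waiting to be forwarded
        self.max_stages = max_stages  # assumed upper bound on FIFO depth

    def step(self, incoming=None, local=None):
        """Advance one bus cycle.

        incoming: packet arriving from upstream, or None for a vacant slot.
        local:    packet the attached module wants to transmit, or None.
        Returns (outgoing_packet, local_sent).
        """
        if incoming is not None:
            self.fifo.append(incoming)
        # On a conflict, the ring packet is held one extra stage in the FIFO,
        # which frees the output slot for the local packet.
        if local is not None and len(self.fifo) < self.max_stages:
            return local, True
        # Otherwise forward the oldest buffered ring packet, if any.
        outgoing = self.fifo.popleft() if self.fifo else None
        return outgoing, False
```

In this model, a node that receives a ring packet in the same cycle in which its module transmits simply holds the ring packet for one additional stage, and the held packet is forwarded in a later cycle once the module is idle or the FIFO has reached its assumed limit.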
Further, according to a conventional method discussed in Japanese Patent No. 3083582, a plurality of layers is provided, each layer having a structure in which a plurality of processing elements (PEs) is connected in a ring shape to perform pipeline processing. Packets are transferred between two ring buses by connecting PEs of different layers, without using any packet control apparatus.
In such a pipeline type processing module system, if the amount of data transferred from a preceding processing module to a post-stage processing module exceeds the processing capacity of the post-stage processing module, the unprocessed transferred data places a load on the communication band between the preceding processing module and the post-stage processing module. The load placed on the communication band reduces the data transfer efficiency and, as a result, reduces the overall processing efficiency of the parallel processing circuit.
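The accumulation of unprocessed data described above may be illustrated with a short numerical sketch. The function name backlog_over_time and its parameters are hypothetical, and the model assumes constant per-cycle rates, which an actual circuit need not exhibit.

```python
def backlog_over_time(send_rate, capacity, cycles):
    """Return the amount of unprocessed data pending after each cycle.

    send_rate: data units the preceding module transfers per cycle (assumed constant).
    capacity:  data units the post-stage module can process per cycle (assumed constant).
    """
    backlog = 0
    history = []
    for _ in range(cycles):
        # Data not processed in this cycle remains pending and keeps
        # occupying the communication band between the two modules.
        backlog = max(0, backlog + send_rate - capacity)
        history.append(backlog)
    return history
```

Under these assumptions, a preceding module that sends 3 units per cycle to a post-stage module that processes 2 units per cycle leaves a backlog that grows by one unit every cycle, whereas no backlog forms when the send rate does not exceed the capacity.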
In a configuration in which a plurality of processing modules is serially connected in a ring shape, the processing modules can be controlled to process data in an order different from their physical connection order. Such a configuration is useful in reducing the circuit scale because the processing modules can process data in an arbitrary order. (This is because, in a case where the same processing is performed at two portions of the pipeline processing, a linear pipeline circuit would require a plurality of circuits that perform the same processing.)
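The difference between the physical connection order and the processing order may be sketched as follows. The function ring_hops and the module names are hypothetical, and a unidirectional ring is assumed for this example.

```python
def ring_hops(physical_order, logical_order):
    """Count the ring hops needed to visit modules in logical_order.

    physical_order: modules in the order they are wired on the ring.
    logical_order:  modules in the order the data must be processed;
                    a module may appear more than once.
    Assumes data travels in one direction only.
    """
    position = {module: i for i, module in enumerate(physical_order)}
    n = len(physical_order)
    hops = 0
    for src, dst in zip(logical_order, logical_order[1:]):
        step = (position[dst] - position[src]) % n
        # Returning to the same module requires one full lap of the ring.
        hops += step if step else n
    return hops


# Module "C" performs the same processing at two portions of the pipeline,
# so a single circuit is reused instead of being duplicated.
physical = ["A", "B", "C", "D"]
reuse_hops = ring_hops(physical, ["A", "C", "B", "C"])
```

In this sketch the ring reuses module C for both portions of the pipeline at the cost of additional hops on the bus, whereas a linear pipeline would need two copies of the circuit that C represents.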
However, according to the method discussed in Japanese Patent No. 2522952 or the method discussed in Japanese Patent No. 3083582, in a case where a plurality of pieces of pipeline processing, each to be processed in a different order, is allocated to a plurality of processing modules, the transfer efficiency may decrease when the amount of data transferred from one of the processing modules to a post-stage module exceeds the processing capacity of the post-stage module.