1. Technical Field
The present invention relates to a VLSI circuit for parallel computing, a parallel computing system that utilizes this VLSI circuit, and a computer system that utilizes this parallel computing system.
2. Related Art
A parallel computing system that aims to increase processing speed by performing processing in parallel has been developed through decades of research. The parallel computing system uses a plurality of processing elements (PEs), each corresponding to a CPU, and each PE performs processing independently and transfers data concerning the processing results to other PEs. A transfer bus is necessary for this data transfer, and therefore it is not easy to construct hardware for processing that requires a large amount of data to be transferred.
For example, when calculating the numerical solution to a differential equation representing the diffusion of a substance, i.e. performing radiosity processing, each PE need only perform processing for one spatial coordinate and transfer data to the PEs performing processing for adjacent coordinates (four PEs when working in two dimensions and six PEs when working in three dimensions), and therefore a configuration can be used that does not increase the number of buses. However, in the radiosity processing used for image processing in the mobile terminal devices of recent years, where each PE performs processing for a small plane, data may need to be transferred between all of the PEs. Therefore, sufficiently high-speed processing cannot be realized with a conventional parallel computing system.
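The difference between the two transfer patterns above can be illustrated with a minimal sketch. It assumes a rectangular grid with zero-based coordinates and one PE per coordinate; the function names `adjacent_coords` and `all_to_all_pairs` are hypothetical and are introduced here only for illustration.

```python
def adjacent_coords(coord, shape):
    """Return the axis-aligned neighbours of coord that lie inside the grid.

    coord and shape are tuples of equal length (2 for a plane, 3 for a
    volume). Zero-based indexing is an assumption of this sketch.
    """
    result = []
    for axis in range(len(shape)):
        for step in (-1, 1):
            candidate = list(coord)
            candidate[axis] += step
            # Keep only neighbours that stay within the grid bounds.
            if 0 <= candidate[axis] < shape[axis]:
                result.append(tuple(candidate))
    return result


def all_to_all_pairs(num_pes):
    """Number of distinct PE pairs when every PE may exchange data with every other PE."""
    return num_pes * (num_pes - 1) // 2


# An interior PE has 4 neighbours in two dimensions and 6 in three dimensions.
print(len(adjacent_coords((1, 1), (3, 3))))        # 4
print(len(adjacent_coords((1, 1, 1), (3, 3, 3))))  # 6
# With all-to-all transfer, 9 PEs already form 36 distinct pairs.
print(all_to_all_pairs(9))                         # 36
```

The neighbour count per PE stays fixed as the grid grows, whereas the number of PE pairs in the all-to-all case grows roughly with the square of the number of PEs, which is the data transfer problem addressed in the following paragraphs.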
One method for solving this problem of data transfer between the PEs is to use a parallel computing system that has a dedicated communication network for the data transfer, such as disclosed in Patent Document 1, for example. However, construction of such a communication network incurs a significant cost.
Furthermore, Patent Document 2 discloses a parallel computing system in which the memories corresponding respectively to two-dimensionally arranged PEs are adapted for a bus with three or more ports, and broadband data transfer is performed through the third port, which goes beyond the two dimensions. However, the specific method for realizing data transfer with the third port must be designed specifically for each case.
A computing system using HXNet, such as described in Non-Patent Document 1, is known as a parallel computing system in which it is possible to arrange PEs two-dimensionally and perform data transfer between desired PEs. This data transfer between desired PEs is realized by, in a case where there are m² PEs represented as PE (i, j) (1≤i≤m and 1≤j≤m), performing data transfer in the order of PE (i, j)→PE (j, k)→PE (k, l). HXNet is a useful system with guaranteed implementation.
On the other hand, the number of PEs in HXNet is limited to m², and it is impossible to form a larger HXNet by combining a plurality of HXNets. Therefore, HXNet lacks the scalability that would allow a small configuration to be enlarged at a later point.
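The HXNet transfer order described above, PE (i, j)→PE (j, k)→PE (k, l), can be sketched as follows. This is a minimal illustration, not the implementation of Non-Patent Document 1: it assumes that PE (a, b) can transfer data directly to PE (b, c) for any c, which is inferred only from the transfer order stated above, and the names `is_direct_hop` and `hxnet_route` are hypothetical.

```python
def is_direct_hop(src, dst):
    """Return True if src can transfer directly to dst.

    Assumption of this sketch: PE (a, b) is linked to PE (b, c) for any c,
    i.e. the second index of the sender equals the first index of the receiver.
    """
    return src[1] == dst[0]


def hxnet_route(src, dst, m):
    """Return the transfer sequence from PE (i, j) to PE (k, l).

    Indices are one-based and must satisfy 1 <= index <= m, matching the
    m-by-m arrangement of PEs described in the text.
    """
    i, j = src
    k, l = dst
    for index in (i, j, k, l):
        if not 1 <= index <= m:
            raise ValueError(f"index {index} is outside 1..{m}")
    # The intermediate PE (j, k) bridges the source and the destination.
    return [(i, j), (j, k), (k, l)]


route = hxnet_route((1, 2), (3, 4), m=4)
print(route)  # [(1, 2), (2, 3), (3, 4)]
# Every consecutive pair in the route is a direct hop under the assumption above.
print(all(is_direct_hop(a, b) for a, b in zip(route, route[1:])))  # True
```

Under this assumption, any pair of PEs is connected through the single intermediate PE (j, k), while the range check reflects that the arrangement is fixed at m × m PEs.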
Patent Document 1: Japanese Patent Application Publication No. H06-052125
Patent Document 2: Japanese Patent Application Publication No. H06-075930
Non-Patent Document 1: Kadota Hiroshi, “Massively Parallel VLSI Computers”, Kogyo Chosakai Publishing