1. Technical Field
The present invention relates to parallel processing devices having a plurality of parallel processing units and, more particularly, to parallel processing devices that can facilitate connection among the plurality of parallel processing units.
2. Related Art
Neural networks are known as one of the applications in which parallel processing can exhibit its full performance. A neural network is a mathematical model that aims to express some of the characteristics observed in brain functions. Because processing by a neural network can reduce the amount of information in input data, favorable solutions can often be obtained with a relatively small amount of computation for multidimensional data and linearly inseparable problems, such as image data and statistical data. For this reason, neural networks are applied in a variety of fields, such as pattern recognition, data mining, and the like.
When a large-scale neural network is to be realized, the amount of computation becomes enormous, and processing within a practical time span becomes difficult. Methods for solving this problem include: (1) a method to increase the computing power of each single processor, (2) a method to use a parallel computing technique with a plurality of processors, (3) a method to implement the functions in hardware such as an LSI, and the like. The aforementioned methods (1) and (2) are intended to cope with enormous amounts of computation by improving processor power, and can handle a variety of neural network algorithms by changing programs.
According to the aforementioned method (1), the computing power of single-unit processors has conventionally been increased by raising their clock frequency. In recent years, however, Moore's law has been faltering: higher clock frequencies increase heat generation, and the feature sizes obtainable by microfabrication are approaching physical limits, so it is becoming more difficult to increase the computing power of single-unit processors per se. For this reason, development efforts aimed at increasing processor computing power have shifted from method (1) toward method (2), focusing on larger caches and multiple computation cores in order to achieve higher speeds through increased computing power while suppressing heat generation. Processors are also being developed to operate at lower clock frequencies and thus with lower power consumption.
However, method (2) entails essential problems: it is difficult to operate an enormous number of processors effectively, in other words, to increase the degree of parallelism, and it is difficult to construct networks that enable communication of enormous amounts of data among multiple processors. Therefore, it is difficult to increase the efficiency of parallel processing of a large-scale neural network using method (2), namely, a parallel computing technique that uses multiple processors.
On the other hand, although there are limitations on the neural network algorithms that can be realized in hardware, method (3), hardware implementation, can achieve incomparably higher processing speeds in specific applications, even at lower clock frequencies, compared to methods (1) and (2). Examples of related art pertaining to technologies that implement parallel processing in hardware include Japanese Laid-open Patent Applications 06-195454 (Document 1) and 2006-039790 (Document 2).
However, the technologies described in Document 1 and Document 2 concerning hardware implementation according to method (3) entail wiring problems, whereby the circuit scale becomes enormous due to wiring, or wiring between circuits cannot be made at all. For example, in a multilayer neural network, the output of each node in one layer must be supplied as an input to every node in the following layer, such that, when the number of nodes in each layer increases, the amount of wiring increases drastically.
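As an illustrative sketch (not drawn from Document 1 or Document 2), the physical connection count between fully connected layers grows as the product of adjacent layer sizes, which is why wiring dominates the circuit scale; the function name below is hypothetical:

```python
# Hypothetical illustration: count the point-to-point wires required when
# every node in one layer feeds every node in the following layer.
def fully_connected_wires(layer_sizes):
    """Total wire count for a feed-forward network whose consecutive
    layers are fully connected (sum of products of adjacent sizes)."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

# Doubling the width of every layer roughly quadruples the wiring.
print(fully_connected_wires([100, 100, 100]))  # 2 * 100*100 = 20000
print(fully_connected_wires([200, 200, 200]))  # 2 * 200*200 = 80000
```

The quadratic growth shown here is what makes direct wiring between layers impractical in an LSI once layers become wide.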
Also, for example, while the neurons of an actual brain are disposed in three-dimensional space and interconnected there, components in a hardware neural network structure such as an LSI are basically disposed two-dimensionally, and therefore the wiring problem cannot essentially be solved. The problem remains even when components are disposed three-dimensionally by a laminated structure or the like, and such structures are therefore limited to uses in which restricted wiring (connections only among adjacent components) suffices.
Also, the problem of wiring among multiple processing units is not limited to neural networks; it arises whenever a computation requires the output of one processing unit to be input to multiple processing units at once. Besides general neural networks such as hierarchical neural networks and self-organizing maps, the wiring problem affects, for example, gravitational many-body problems, simulations of charged many-particle systems, filtering of signals, and the like.
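To make the all-to-all dependence concrete, the following minimal direct-summation sketch (an illustration assumed here, not part of the cited documents; `accelerations` and its parameters are hypothetical) shows why, in a gravitational many-body step, the state held by each processing unit must reach every other unit:

```python
# Hypothetical direct-summation step for a 2-D gravitational many-body
# problem: each body's acceleration reads the position of EVERY other
# body, so each unit's output must be routed to all other units.
def accelerations(positions, masses, g=1.0, eps=1e-9):
    n = len(positions)
    acc = [[0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # a body exerts no force on itself
            dx = positions[j][0] - positions[i][0]
            dy = positions[j][1] - positions[i][1]
            r3 = (dx * dx + dy * dy + eps) ** 1.5  # softened |r|^3
            acc[i][0] += g * masses[j] * dx / r3
            acc[i][1] += g * masses[j] * dy / r3
    return acc

# Two unit masses a unit distance apart attract each other symmetrically.
a = accelerations([[0.0, 0.0], [1.0, 0.0]], [1.0, 1.0])
```

The nested loop evaluates n*(n-1) interactions per step; the same communication pattern, every unit's output feeding all units, is exactly the wiring burden described above.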