The present invention generally relates to parallel processor systems, and more particularly to a parallel processor system which is suited for constructing a mutual connection network which carries out encoding of audio and image data and signal processing on computer graphics and the like.
With respect to the processor system, there are demands to carry out the signal processing at a high speed and economically. One method of satisfying such demands is to improve the processor performance. However, when this method is employed and the processor performance exceeds a certain threshold value, the costs of hardware and development greatly increase as compared to the improvement of the processor performance. For this reason, in order to realize an extremely high-speed processor, it is necessary to balance various tradeoffs. For example, the improvement of the silicon technology reduces the micron rule to enable reduction of the chip size and reduced power consumption, but the restrictions on signal synchronization and propagation delay become severe such that the time and effort required for the design become extremely large.
On the other hand, as another method of improving the performance of the processor system, there is a method which employs a parallel processor system structure. In the parallel processor system, two or more processors are connected in parallel. In this case, by using a plurality of processors which at least have certain speeds, it is possible to distribute the load among the plurality of usable processors. For example, when a single processor system which is formed by a single processor and a parallel processor system which is formed by a plurality of similar processors are compared, the parallel processor system cannot simply carry out twice the number of processes that can be processed by the single processor system, but the parallel processor system can process more processes in parallel per unit time as compared to the single processor system. In other words, when carrying out the same amount of processes per unit time, the parallel processor system can effectively utilize a more economical processor technology compared to the single processor system. In addition, the parallel processor system has an advantage in that the system scale can be modified by adjusting the number of processors depending on the application environment.
However, in the parallel processor system, it is necessary to realize a cooperative process among the processors by constructing a mutual connection network among the processors. The structure of this mutual connection network is important, because the performance of the entire parallel processor system becomes close to “(processing performance of a unit processor)×(number of processors)” or, in a worst case, “less than or equal to the processing performance of a unit processor”, depending on this structure.
Conventionally, there are various connection formats for the mutual connection network, including a total connection type shown in FIG. 1, a parallel bus type shown in FIG. 2, a ring type shown in FIG. 3, a mesh type shown in FIG. 4, and an n-cube type shown in FIG. 5, for example. When these connection formats for the mutual connection network are categorized generally by function, each connection format is formed by processor nodes PN each having a processor and a communication link, communication paths CP for making communication, and cluster switches CS each connecting three or more communication paths CP.
Next, the total connection type, the parallel bus type and the ring type connection formats shown in FIGS. 1 through 3 will be compared under the condition that the bandwidth between each two processor nodes PN is equal to W, that is, the condition that the bandwidth of the communication path CP between the processor node PN and the cluster switch CS is equal to (P−1)×W, where P denotes the number of processor nodes PN.
First, the performances of the communication paths CP of these connection formats will be compared. In the case of the total connection type connection format shown in FIG. 1, since an independent communication path CP connects each two processor nodes PN, the bandwidth of each communication path CP becomes equal to W. In the case of the parallel bus type connection format shown in FIG. 2, because each two processor nodes PN are connected via a common communication path CP, the bandwidth of the common communication path CP becomes equal to P×W. Further, in the case of the ring type connection format shown in FIG. 3, when the bandwidths of the communication paths CP between non-adjacent processor nodes PN are averaged, the bandwidth of each communication path CP becomes equal to (P−1)×W. Accordingly, the total connection type connection format is more economical than the other two, in that a low-performance communication path CP can be used.
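The per-path bandwidth figures above can be tabulated with a minimal sketch such as the following. The function name `path_bandwidth` and the topology labels `"total"`, `"bus"` and `"ring"` are illustrative assumptions and do not appear in the specification; the formulas are taken directly from the comparison above.

```python
def path_bandwidth(topology: str, p: int, w: float) -> float:
    """Bandwidth required of one communication path CP.

    p is the number of processor nodes PN and w is the bandwidth W
    required between each two processor nodes. The topology labels
    are hypothetical names chosen for this sketch.
    """
    if p < 2:
        raise ValueError("at least two processor nodes are required")
    if topology == "total":
        return w            # independent path per pair of nodes: W
    if topology == "bus":
        return p * w        # one common path shared by all nodes: P x W
    if topology == "ring":
        return (p - 1) * w  # averaged over non-adjacent traffic: (P-1) x W
    raise ValueError(f"unknown topology: {topology}")
```

For example, with P = 8 and W = 1, the sketch gives 1 for the total connection type, 8 for the parallel bus type and 7 for the ring type, which illustrates why the total connection type can use a lower-performance communication path CP.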
On the other hand, when the structure of the cluster switch CS is observed, only the bandwidth (P−1)×W needs to be controlled in the case of the total connection type connection format, but a bandwidth larger than (P−1)×W needs to be controlled in the case of the parallel bus type and the ring type connection formats, because communication between non-adjacent processor nodes PN which passes through the cluster switch CS is also generated. However, when the number of communication paths CP connected to the cluster switch CS is observed, the number is three in the case of the parallel bus type and the ring type connection formats regardless of the number of processor nodes PN, while the number is P in the case of the total connection type connection format. For this reason, in the case of the total connection type connection format, the structure of the cluster switch CS becomes more complex as the number of processor nodes PN of the parallel processor system increases, and it becomes difficult to realize the structure for a large number of processor nodes PN, both costwise and technically.
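The growth in cluster switch complexity described above may be illustrated by a short sketch. The function name `switch_path_count` and the topology labels are assumptions made for illustration only; the counts themselves are those stated above.

```python
def switch_path_count(topology: str, p: int) -> int:
    """Number of communication paths CP connected to one cluster switch CS.

    p is the number of processor nodes PN. The topology labels are
    hypothetical names chosen for this sketch.
    """
    if p < 2:
        raise ValueError("at least two processor nodes are required")
    if topology == "total":
        return p   # grows with the number of processor nodes
    if topology in ("bus", "ring"):
        return 3   # fixed, regardless of the number of processor nodes
    raise ValueError(f"unknown topology: {topology}")
```

With 64 processor nodes, for instance, each cluster switch of the total connection type would terminate 64 communication paths, while each cluster switch of the parallel bus type or the ring type would still terminate three.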
Therefore, the total connection type connection format is superior when the number of processor nodes PN of the parallel processor system is small, but the structures of the parallel bus type and the ring type connection formats become more advantageous technically as the number of processor nodes PN increases.
In the case of the parallel bus type and the ring type connection formats, however, the distance between the processor nodes PN, that is, the number of passing cluster switches CS, becomes a problem as the number of processor nodes PN of the parallel processor system increases. In the case of the total connection type connection format, the number of passing cluster switches CS is two regardless of the number of processor nodes PN. But in the case of the parallel bus type connection format, the number of passing cluster switches CS is equal to the number P of processor nodes PN for a maximum path. Further, in the case of the ring type connection format, the number of passing cluster switches CS is equal to INT(P/2+1) for a maximum path, where INT denotes an integer value obtained by ignoring fractions. Moreover, although a communication delay may be estimated to be a fixed delay in the case of the total connection type connection format, the communication delay is not fixed in the case of the parallel bus type and the ring type connection formats, and this communication delay may greatly affect the performance of the entire parallel processor system.
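The maximum-path figures above can be evaluated with a minimal sketch such as the following. The function name `max_passing_switches` and the topology labels are illustrative assumptions; INT(P/2+1) is implemented with integer division, which gives the same result as ignoring fractions for positive P.

```python
def max_passing_switches(topology: str, p: int) -> int:
    """Number of cluster switches CS passed on a maximum path.

    p is the number of processor nodes PN. The topology labels are
    hypothetical names chosen for this sketch.
    """
    if p < 2:
        raise ValueError("at least two processor nodes are required")
    if topology == "total":
        return 2           # fixed, regardless of the number of nodes
    if topology == "bus":
        return p           # equal to the number P of processor nodes
    if topology == "ring":
        return p // 2 + 1  # INT(P/2 + 1), fractions ignored
    raise ValueError(f"unknown topology: {topology}")
```

For P = 8, the sketch gives 2 for the total connection type, 8 for the parallel bus type and 5 for the ring type; for P = 7, the ring type gives 4.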
Hence, as the number of processors of the parallel processor system increases, it is not always possible to construct an efficient mutual connection network using the total connection type, the parallel bus type or the ring type connection format.
The mesh type connection format shown in FIG. 4 and the n-cube type connection format shown in FIG. 5 have been proposed to solve the above-described problems. In the case of the mesh type and the n-cube type connection formats, the connections respectively are two-dimensional and three-dimensional, and the increase in the distance between the processor nodes PN as the number of processor nodes PN of the parallel processor system increases is small compared to the parallel bus type and the ring type connection formats described above. However, the distance between the processor nodes PN still increases as the number of processor nodes PN of the parallel processor system increases, and for a large number of processor nodes PN, there is a problem in that it is impossible to realize an optimum structure for the entire parallel processor system, both costwise and technically.
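The slower but still unbounded growth in distance can be illustrated by a rough sketch. The specification gives no distance formulas for the mesh type and the n-cube type connection formats, so the following relies on assumptions that are not taken from it: a square two-dimensional mesh without wraparound links, a binary n-cube having P = 2^n nodes, and distance measured as the number of links on the longest shortest path rather than as the number of passing cluster switches CS. The function names are hypothetical.

```python
import math


def mesh_max_distance(p: int) -> int:
    """Longest shortest path, in links, of a square 2-D mesh of p nodes.

    Assumes a side x side mesh without wraparound (an assumption of
    this sketch, not stated in the specification).
    """
    side = math.isqrt(p)
    if p < 1 or side * side != p:
        raise ValueError("p must be a perfect square")
    return 2 * (side - 1)  # corner-to-corner distance


def ncube_max_distance(p: int) -> int:
    """Longest shortest path, in links, of a binary n-cube of p = 2**n nodes.

    Assumes a binary hypercube (an assumption of this sketch).
    """
    if p < 1 or p & (p - 1):
        raise ValueError("p must be a power of two")
    return p.bit_length() - 1  # n = log2(p)
```

Under these assumptions, increasing P from 16 to 64 raises the maximum distance from 6 to 14 links for the mesh and from 4 to 6 links for the n-cube, so the distance grows more slowly than for the parallel bus type or the ring type but still grows with P.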
Accordingly, it is a general object of the present invention to provide a novel and useful parallel processor system in which the problems described above are eliminated.
Another and more specific object of the present invention is to provide a parallel processor system having a mutual connection network with an optimum connection format, both costwise and technically.
Still another object of the present invention is to provide a parallel processor system comprising a pair of parallel buses, pipeline buses, a plurality of processor nodes having functions of carrying out an operation process in response to an instruction and transferring data, cluster switches having a plurality of connection modes and controlling connections of the parallel buses, the pipeline buses and the processor nodes, and a switch controller controlling the connection mode of the cluster switches and coupling the processor nodes in series and/or in parallel. According to the parallel processor system of the present invention, it is possible to provide a parallel processor system having a mutual connection network with an optimum connection format, both costwise and technically.
Other objects and further features of the present invention will be apparent from the following detailed description when read in conjunction with the accompanying drawings.