This invention relates to a parallel arithmetic-logic processing device which may be conveniently employed as a computer for executing arithmetic operations in the fields of science and technology. More particularly, because scientific and technical arithmetic/logic operations frequently employ large volumes of matrix data, it relates to a parallel arithmetic-logic processing device in which a memory device for continuous data writing and reading is provided as storage means for the matrix data, thereby reducing the processing time and production costs.
It is no exaggeration to say that recent developments in technology owe much to the ability to carry out complex, large-scale scientific and technological calculations. In the course of such calculations, matrix operations, such as the solution of extremely large systems of first-order simultaneous equations or the calculation of the eigenvalues of matrices, occur frequently. For a matrix of size (rows by columns) n×n, for example, the volume of processing operations is O(n³) for each of the following: solution of the first-order simultaneous equations having the matrix as coefficients, calculation of the inverse matrix, and calculation of the eigenvalues or the eigenvectors of the matrix. Thus, as the number of rows and columns, that is the value of n, increases, the volume of processing operations grows roughly as the cube of n.
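The O(n³) growth noted above can be illustrated with a minimal sketch. It assumes Gaussian forward elimination as the solution method, which the text does not prescribe, and counts only the multiply-subtract updates applied to the coefficient matrix. The function name `elimination_update_count` is hypothetical and introduced here for illustration only.

```python
def elimination_update_count(n):
    """Count multiply-subtract updates in forward elimination of an n x n matrix.

    Assumption: plain Gaussian elimination without pivoting; only updates to
    the coefficient matrix are counted, not the right-hand side.
    """
    count = 0
    for k in range(n - 1):                 # pivot row/column
        for i in range(k + 1, n):          # each row below the pivot
            for j in range(k + 1, n):      # each remaining column entry
                count += 1                 # one update: a[i][j] -= m * a[k][j]
    return count


if __name__ == "__main__":
    for n in (50, 100):
        print(n, elimination_update_count(n))
```

Under these assumptions the count equals (n-1)n(2n-1)/6, so doubling n multiplies the work by a factor that approaches eight as n grows.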
Today's computers are expected to execute these voluminous operations quickly.
Although the processing speed of the central processing unit (CPU) of present-day computers has increased tremendously, access to the memory remains comparatively slow, which lowers the overall processing speed of the computer.
To reduce the time necessary for accessing the memory, a computer has been developed in which a subsidiary memory known as a cache memory 102, which has a smaller storage capacity than a main memory 103 but is capable of high-speed access, is provided between the CPU 101 and the main memory 103, as shown in FIG. 14.
With such a computer, the necessary data is read out in advance from the main memory 103 to the cache memory 102 over a bus line 105, so that the necessary data can then be read from the cache memory 102. Since the cache memory 102 permits high-speed access, the processing operations may be executed more quickly.
On the other hand, there is known a so-called supercomputer having plural vector registers 117 for executing the same processing operations collectively on batches of data, as in matrix operations, as shown in FIG. 15. For directly processing data contained in the vector registers 117, plural registers capable of directly accessing an adder 110 for floating-point addition, a multiplier 111 for floating-point multiplication and a divider 112 for floating-point division are arrayed in a one-dimensional pattern.
Before the above-described supercomputer proceeds to the processing operations, data is pre-loaded from the main memory 121 to the vector registers 117 via data lines 118, 120 by the operation of a vector input/output circuit 119. The data loaded into the vector registers 117 is supplied over an input bus line 113 to the arithmetic-logic units 110 to 112, which execute the processing operations. The processed data is supplied over an output bus line 114 to the vector registers 117 so as to be re-written therein. The processed data, thus re-written in the vector registers 117, is read out by the vector input/output circuit 119 so as to be stored in the main memory 121.
With the supercomputer, arithmetic-logic operations may be executed collectively on the data held in the vector registers 117. In addition, high-speed processing is possible because the operations up to storage of the processed data in the main memory 121 are pipelined.
However, with a computer employing the cache memory 102 as shown in FIG. 14, only part of the entire data is stored in the cache memory 102 while the processing operations are carried out, the remaining data being stored in the main memory 103. Consequently, if data necessary for the processing operations is not found in the cache memory 102, that is, on a so-called mis-hit, the data must be read out from the main memory 103 while the CPU 101 remains in the stand-by state. As a result, large-scale calculating operations cannot be executed quickly.
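The effect of mis-hits described above can be illustrated with a minimal simulation. It is a sketch under stated assumptions: the text does not specify how the cache memory 102 is organized, so a fully associative cache with least-recently-used replacement is assumed, and the matrix is assumed to be stored row by row. The names `count_misses`, `row_order_addresses` and `column_order_addresses`, and the sizes used in the example, are hypothetical.

```python
from collections import OrderedDict


def count_misses(addresses, num_lines, line_size):
    """Return the number of misses for an address trace.

    Assumption: fully associative cache, LRU replacement, num_lines lines
    of line_size words each. Every miss stands for a main-memory read
    during which the CPU would wait.
    """
    cache = OrderedDict()                  # line tags, least recently used first
    misses = 0
    for addr in addresses:
        tag = addr // line_size            # cache line containing this word
        if tag in cache:
            cache.move_to_end(tag)         # hit: mark as most recently used
        else:
            misses += 1
            cache[tag] = True
            if len(cache) > num_lines:
                cache.popitem(last=False)  # evict the least recently used line
    return misses


def row_order_addresses(n):
    """Addresses touched when an n x n row-stored matrix is read row by row."""
    return [i * n + j for i in range(n) for j in range(n)]


def column_order_addresses(n):
    """Addresses touched when the same matrix is read column by column."""
    return [i * n + j for j in range(n) for i in range(n)]


if __name__ == "__main__":
    n, lines, words = 64, 16, 8            # illustrative sizes only
    print("row order misses:   ", count_misses(row_order_addresses(n), lines, words))
    print("column order misses:", count_misses(column_order_addresses(n), lines, words))
```

With these illustrative sizes the matrix is far larger than the cache, and the column-by-column traversal, which is common in matrix operations, misses on every access, whereas the row-by-row traversal misses only once per cache line.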
Likewise, with the supercomputer shown in FIG. 15, the vector registers 117 are generally limited in capacity, so that, when large-scale processing operations are to be executed, it is not possible to store the entire data in the vector registers 117. The data that cannot be held in the vector registers 117 must therefore be read from the main memory 121 and stored in the vector registers 117 as the processing proceeds. Since considerable time is involved in reading out and storing this excess data, large-scale calculating operations cannot be executed quickly.
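The repeated loading made necessary by the limited register capacity can be sketched as follows. This is an illustration only, not a model of the circuit of FIG. 15: Python lists stand in for the main memory 121 and the vector registers 117, the operation is assumed to be an element-wise addition, and the names `chunked_vector_add` and `register_length` are hypothetical.

```python
def chunked_vector_add(a, b, register_length):
    """Add two equal-length vectors in pieces that fit a vector register.

    Returns the result and the number of register loads from main memory.
    Assumption: each register holds at most register_length elements.
    """
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    if register_length <= 0:
        raise ValueError("register_length must be positive")

    result = []
    loads = 0
    for start in range(0, len(a), register_length):
        va = a[start:start + register_length]   # load a piece of a into a register
        vb = b[start:start + register_length]   # load a piece of b into a register
        loads += 2
        vc = [x + y for x, y in zip(va, vb)]    # element-wise operation on the registers
        result.extend(vc)                       # store the piece back to main memory
    return result, loads


if __name__ == "__main__":
    total, loads = chunked_vector_add(list(range(10)), [1] * 10, 4)
    print(total, loads)
```

The number of loads grows in proportion to the data length divided by the register length, which is the overhead that becomes significant for large matrices.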
The above problems could be overcome by providing, as the cache memory 102 and the vector registers 117, a subsidiary memory of large storage capacity accessible at high speed. However, such a large-capacity, high-speed subsidiary memory requires a large mounting area and is expensive, thus increasing the size and production costs of the computer.