Field of the Invention
The present invention relates to a computer and a method for solving simultaneous linear equations using a memory-distributed parallel processor, and more specifically to quickly solving simultaneous linear equations in a multiprocessor system which performs its processes through communications among a plurality of processors.
A blocked LU decomposition method using outer products is employed as an algorithm for solving simultaneous linear equations in a parallel process. FIG. 1 shows the outline of the blocked LU decomposition method using the outer products.
Here, d indicates the width of a block. The following processes are performed in this method.
The k-th process updates the update portion $A^{(k)}$ by the following equation:

$$A^{(k)} = A^{(k)} - L2^{(k)} \cdot U2^{(k)} \qquad (1)$$
In the (k+1)-th process, $A^{(k)}$ is divided using the block width d, and the resulting matrix, which is smaller by d, is updated by the same equation.
$L2^{(k)}$ and $U2^{(k)}$ should be calculated by the following equations.
When equation (1) is used for the update, the data is decomposed as follows:

$$B^{(k)} = \left( (L1^{(k)})^{T},\ (L2^{(k)})^{T} \right)^{T} U1^{(k)}$$

Then, the data is updated as follows:

$$U2^{(k)} = (L1^{(k)})^{-1}\, U2^{(k)}$$

where $L1^{(k)}$ indicates a lower triangular matrix after the LU decomposition, while $U1^{(k)}$ indicates an upper triangular matrix.
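The three steps described above (decomposition of the block B, computation of U2 with the inverse of L1, and the outer-product update of equation (1)) may be illustrated by the following minimal sketch. It is an illustration only and rests on several assumptions not stated in this description: a single processor, no pivoting, a matrix whose pivots are nonzero, and a unit diagonal for L1. The function names `blocked_lu` and `reconstruct` are hypothetical.

```python
def blocked_lu(a, d):
    """In-place blocked LU without pivoting (illustrative sketch only).

    a is a square matrix given as a list of row lists; d is the block width.
    On return the strict lower triangle holds L (unit diagonal implied)
    and the upper triangle holds U.
    """
    n = len(a)
    for k in range(0, n, d):
        e = min(k + d, n)  # first column after the current block
        # Step 1: decompose the column block B = (L1; L2) * U1.
        for j in range(k, e):
            for i in range(j + 1, n):
                a[i][j] /= a[j][j]
                for c in range(j + 1, e):
                    a[i][c] -= a[i][j] * a[j][c]
        # Step 2: U2 = L1^{-1} * U2 by forward substitution.
        for j in range(k, e):
            for i in range(j + 1, e):
                for c in range(e, n):
                    a[i][c] -= a[i][j] * a[j][c]
        # Step 3: outer-product update A = A - L2 * U2 (equation (1)).
        for i in range(e, n):
            for c in range(e, n):
                a[i][c] -= sum(a[i][j] * a[j][c] for j in range(k, e))
    return a


def reconstruct(lu):
    """Multiply the packed factors L and U back together as a check."""
    n = len(lu)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for c in range(n):
            s = 0.0
            for j in range(min(i, c) + 1):
                l_ij = 1.0 if i == j else lu[i][j]
                s += l_ij * lu[j][c]
            out[i][c] = s
    return out
```

Applying `reconstruct` to the result of `blocked_lu` should reproduce the original matrix up to rounding error, for any block width d from 1 to the matrix order.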
When the blocked LU decomposition method using outer products is followed in a memory-distributed parallel processor, data should be efficiently distributed to the memory of each processor and the data to be processed should be efficiently transferred among the processors. Conventionally, blocked data is sequentially positioned in each processor to simplify the user interface. Therefore, the LU decomposition load is not necessarily assigned equally to the processors. Furthermore, the data communications among the processors are not efficiently parallelized, thereby undesirably increasing the communications cost.
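The unequal load mentioned above may be illustrated by the following rough sketch, which counts block updates per processor. The model is an assumption introduced for illustration and is not taken from this description: the matrix is split into block columns, each processor holds a contiguous run of block columns, and only the updates of equation (1) are counted. The function name `update_work` is hypothetical.

```python
def update_work(num_blocks, num_procs):
    """Count outer-product block updates performed by each processor.

    Assumes block columns are assigned contiguously: processor p owns
    block columns p*m .. (p+1)*m - 1, where m = num_blocks // num_procs,
    and that num_blocks is a multiple of num_procs.
    """
    m = num_blocks // num_procs
    work = [0] * num_procs
    for k in range(num_blocks):              # k-th process
        rows_left = num_blocks - k - 1       # block rows still to be updated
        for j in range(k + 1, num_blocks):   # block columns still to be updated
            work[j // m] += rows_left
    return work
```

Under this model, `update_work(8, 4)` returns `[7, 31, 47, 55]`: the processor holding the leftmost block columns becomes idle early, while the processor holding the rightmost block columns performs most of the updates.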
Quickly solving simultaneous linear equations is an essential application of computers. Solving the equations more efficiently on a massively parallel processor requires not only an efficient method but also efficient parallelism suited to the characteristics of the massively parallel processor.
A high-performance CPU and a large-scale memory system are required to solve large systems of simultaneous linear equations. To quickly solve the simultaneous linear equations using a memory-distributed multiprocessor, the data should be efficiently assigned and transmitted to the memory of each processor.
Additionally, the user interface (application interface of the host computer) should be implemented without a complicated configuration to solve various problems.