The present invention relates to an apparatus for computing numerical solutions of simultaneous linear equations, and in particular, to such an apparatus for implementing pre-processing or pre-conditioning of a coefficient matrix of the linear equations on a vector computer conducting a large-scale numerical simulation, a computer accomplishing concurrent or parallel processing, or a workstation.
A structure for conducting a matrix operation and a physical system represented by linear equations have been described in U.S. Pat. Nos. 4,787,057 and 4,697,247.
A conjugate gradient (CG) method, pre-conditioning by an incomplete LU decomposition or factorization, and a method of solving simultaneous linear equations according to a conjugate gradient method have been described in such articles as I. Gustafsson, "A Class of First Order Factorization Methods", BIT 18 (1978), pp. 142-156; H. A. van der Vorst, "ICCG and Related Methods for 3D Problems on Vector Computers", Comp. Phys. Communications, 53 (1989), pp. 223-235; H. A. van der Vorst, "BI-CGSTAB: A Fast and Smoothly Converging Variant of BI-CG for the Solution of Nonsymmetric Linear Systems", pp. 1-16; and Sangback Ma, et al., "2 Iterative Methods . . . 3 Vectorization and Parallelization", Vol. 4, Winter 1990, pp. 12-24.
The method of solving linear equations in which the incomplete LU factorization is used in the preconditioning of the equations cannot be easily applied to a computer having a plurality of vector processing units or to a massively parallel computer performing an extremely large number of computations in parallel.
In a triangular factorization of a matrix A, the matrix is decomposed into a lower triangular matrix L and an upper triangular matrix U, thereby expressing the matrix as a product LU (=A). When a quantity or an area representing a phenomenon expressed by partial differential equations is discretized according to the finite element method, there are obtained simultaneous linear equations having a sparse coefficient matrix, namely, a matrix in which the ratio of nonzero elements to all elements is low. In contrast thereto, the ratio of nonzero elements in the L and U matrices obtained by the decomposition is increased. The generation of such new nonzero elements is ordinarily called "fill-in". In the incomplete LU factorization, the process is carried out while ignoring the fill-in (i.e., the fill-in elements are approximated by zeros). Consequently, the L and U matrices obtained by achieving the incomplete LU decomposition on all nonzero elements of the coefficient matrix have the same nonzero structures as the lower and upper triangular portions of the matrix A, respectively.
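The difference between the complete and the incomplete factorization can be illustrated by the following minimal sketch. It is not the apparatus of the invention; it assumes dense list-of-lists storage, no pivoting, and a five-point difference matrix on a small grid as a test case. The names `ilu0`, `laplacian_2d`, `nnz`, and the flag `drop_fill` are hypothetical and are introduced only for illustration.

```python
def ilu0(A, drop_fill=True):
    """LU factorization without pivoting (dense storage, for clarity only).

    With drop_fill=True, updates are applied only at positions where A is
    nonzero, so fill-in is ignored (incomplete LU with zero fill-in).
    With drop_fill=False, the ordinary complete LU factorization results.
    """
    n = len(A)
    keep = [[(A[i][j] != 0) or not drop_fill for j in range(n)] for i in range(n)]
    W = [[float(v) for v in row] for row in A]   # working copy holding L and U
    for k in range(n - 1):
        for i in range(k + 1, n):
            if not keep[i][k]:
                continue
            W[i][k] /= W[k][k]                   # multiplier l_ik
            for j in range(k + 1, n):
                if keep[i][j]:                   # positions outside the pattern stay zero
                    W[i][j] -= W[i][k] * W[k][j]
    L = [[W[i][j] if j < i else (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    U = [[W[i][j] if j >= i else 0.0 for j in range(n)] for i in range(n)]
    return L, U


def laplacian_2d(mx, my):
    """Five-point difference matrix on an mx-by-my grid (assumed test matrix)."""
    n = mx * my
    A = [[0.0] * n for _ in range(n)]
    for j in range(my):
        for i in range(mx):
            p = j * mx + i
            A[p][p] = 4.0
            if i > 0:
                A[p][p - 1] = -1.0
            if i < mx - 1:
                A[p][p + 1] = -1.0
            if j > 0:
                A[p][p - mx] = -1.0
            if j < my - 1:
                A[p][p + mx] = -1.0
    return A


def nnz(M):
    """Number of nonzero entries of a list-of-lists matrix."""
    return sum(1 for row in M for v in row if v != 0)


A = laplacian_2d(4, 4)
n = len(A)
L, U = ilu0(A, drop_fill=True)
Lc, Uc = ilu0(A, drop_fill=False)

nnz_A = nnz(A)
nnz_ilu = nnz(L) - n + nnz(U)     # strictly lower part of L plus U
nnz_full = nnz(Lc) - n + nnz(Uc)  # same count for the complete factors
```

In this sketch `nnz_ilu` does not exceed `nnz_A`, whereas `nnz_full` is larger because of the fill-in inside the band of the matrix.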
Even according to an advanced method developed to apply the method above to a computer having a plurality of vector processing units, the incomplete LU factorization cannot be easily applied to the plural vector processors. Namely, in an iteration over the two-dimensional lattice or grid points of a region subdivided into m_x by m_y, the vector length is limited to m_x.
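The origin of such a limit on the vector length can be sketched as follows. The sketch assumes a five-point difference stencil and a hyperplane (wavefront) ordering, neither of which is specified above; under these assumptions, the unknown at grid point (i, j) in the forward substitution depends on the points (i-1, j) and (i, j-1), so that only the points on one anti-diagonal i + j = const can be processed together. The function names and the constant coefficients `diag`, `west`, and `south` are hypothetical.

```python
def wavefronts(mx, my):
    """Group the grid points (i, j) by anti-diagonal i + j = const."""
    fronts = [[] for _ in range(mx + my - 1)]
    for j in range(my):
        for i in range(mx):
            fronts[i + j].append((i, j))
    return fronts


def forward_substitute(mx, my, diag, west, south, b):
    """Solve L y = b, where the row for grid point (i, j) reads
    west*y[i-1][j] + south*y[i][j-1] + diag*y[i][j] = b[i][j]
    (constant coefficients are assumed for brevity)."""
    y = [[0.0] * my for _ in range(mx)]
    for front in wavefronts(mx, my):
        # All points of one front are mutually independent; on a vector
        # processor this inner loop would be a single vector operation.
        for i, j in front:
            s = b[i][j]
            if i > 0:
                s -= west * y[i - 1][j]
            if j > 0:
                s -= south * y[i][j - 1]
            y[i][j] = s / diag
    return y


mx, my = 8, 5
fronts = wavefronts(mx, my)
max_vector_length = max(len(f) for f in fronts)   # equals min(mx, my)
b = [[1.0] * my for _ in range(mx)]
y = forward_substitute(mx, my, diag=4.0, west=-1.0, south=-1.0, b=b)
```

Under these assumptions the longest wavefront contains min(m_x, m_y) points, which is no larger than m_x and is far smaller than the m_x by m_y unknowns of the whole grid.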
According to the method in which the incomplete LU factorization is employed for conjugate gradient (CG) type methods to conduct iterative computations to solve linear equations, the correction (1+δ) of the Ivar Gustafsson type is used in the incomplete LU decomposition.
In the incomplete LU decomposition above, the degree of parallelization n_x·n_y in the three-dimensional processing is considerably smaller than the order or dimensionality n = n_x·n_y·n_z of the linear system. Moreover, the degree n_x in the two-dimensional processing is remarkably smaller than the order n = n_x·n_y. Heretofore, there have been proposed methods in which, without using the incomplete LU factorization, a plurality of matrices are multiplied by each other in the preconditioning. However, according to these methods, the convergence speed of numerical solutions of the linear equations is lowered when compared with the solution adopting the incomplete LU factorization, which leads to a disadvantage that when the property of the coefficient matrix is deteriorated (i.e., when the matrix is ill-conditioned), the convergence of solutions becomes unstable.
Moreover, in a case where a method of the conjugate gradient type is utilized in the iterative computation for solution of linear equations and the conventional incomplete LU decomposition is employed in the preconditioning step, when a matrix obtained by discretizing the quantity represented by a convection-diffusion equation in accordance with the finite difference method has a deteriorated property (e.g., the cell Péclet number is large), the convergence speed is lowered. In consequence, with only a slight error in setting the parameters, the convergence of solutions cannot be attained in many cases.
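The deterioration associated with a large cell Péclet number can be illustrated by a minimal one-dimensional sketch. It assumes the steady equation -D u'' + v u' = f discretized by central differences with a mesh width h, in which case the cell Péclet number is |v|·h/D; the equation, the discretization, and all names below are assumptions made for illustration and are not taken from the description above.

```python
def central_stencil(D, v, h):
    """Central-difference coefficients of -D u'' + v u' at an interior node,
    returned as (west, center, east)."""
    west = -D / h**2 - v / (2.0 * h)
    center = 2.0 * D / h**2
    east = -D / h**2 + v / (2.0 * h)
    return west, center, east


def cell_peclet(D, v, h):
    """Cell Peclet number |v| h / D."""
    return abs(v) * h / D


def is_diagonally_dominant(west, center, east, rel_tol=1e-12):
    """Weak row diagonal dominance, with a small tolerance for rounding."""
    return abs(center) * (1.0 + rel_tol) >= abs(west) + abs(east)


D, h = 1.0, 0.125                      # assumed diffusion coefficient and mesh width
results = {}
for v in (8.0, 80.0):                  # assumed convection velocities
    w, c, e = central_stencil(D, v, h)
    results[v] = (cell_peclet(D, v, h), e, is_diagonally_dominant(w, c, e))
```

In this sketch the row remains diagonally dominant for the cell Péclet number 1, whereas for the cell Péclet number 10 the east coefficient becomes positive and the diagonal dominance is lost; in general this occurs for central differences once the cell Péclet number exceeds 2.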