One method of solving the eigenvalue problem for a real symmetric matrix is to tridiagonalize the matrix, thereby converting the problem into an eigenvalue problem for a tridiagonal matrix. In this method, the tridiagonalization of the real symmetric matrix accounts for a large share of the calculation cost. Shared memory scalar parallel computers have lower memory access performance than vector computers; for such computers, algorithms have been improved so as to increase the amount of computation performed per memory access, which makes tridiagonalization more efficient.
Patent Document 1 describes fundamental processing for block tridiagonalization.
In this method, however, the calculation of matrix vector products accounts for the greatest part of the calculation cost, and the performance of matrix vector products is strongly affected by memory access speed. Thus, this part of the calculation needs to be improved in order to improve overall performance.
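To illustrate where the matrix vector product arises, the following is a minimal sketch of the textbook, unblocked Householder tridiagonalization. It is not the block method of Patent Document 1; the function names `matvec` and `tridiagonalize` are hypothetical, and the code assumes a small dense symmetric matrix stored as a list of rows.

```python
import math


def matvec(a, x):
    """Dense matrix vector product: one pass over all of `a` per call."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def tridiagonalize(a):
    """Reduce a real symmetric matrix to tridiagonal form (illustrative sketch).

    Each step applies a Householder reflection H = I - 2*u*u^T from both
    sides, written as the rank-2 update A <- A - u*w^T - w*u^T.
    """
    n = len(a)
    a = [[float(v) for v in row] for row in a]  # work on a copy
    for k in range(n - 2):
        # Part of column k below the diagonal.
        x = [a[i][k] for i in range(k + 1, n)]
        norm_x = math.sqrt(sum(xi * xi for xi in x))
        if norm_x == 0.0:
            continue  # nothing to eliminate in this column
        # Choose the sign of alpha to avoid cancellation in v[0].
        alpha = -math.copysign(norm_x, x[0])
        v = x[:]
        v[0] -= alpha
        norm_v = math.sqrt(sum(vi * vi for vi in v))
        # Unit Householder vector, zero-padded to full length.
        u = [0.0] * (k + 1) + [vi / norm_v for vi in v]
        # The matrix vector product: the memory-bound part of each step.
        p = [2.0 * pi for pi in matvec(a, u)]
        kappa = sum(ui * pi for ui, pi in zip(u, p))
        w = [pi - kappa * ui for pi, ui in zip(p, u)]
        # Rank-2 update of the whole matrix.
        for i in range(n):
            for j in range(n):
                a[i][j] -= u[i] * w[j] + w[i] * u[j]
    return a
```

In this sketch, every elimination step reads the whole matrix once in `matvec` while performing only about two floating-point operations per element read, which is why the step is limited by memory access speed rather than by arithmetic.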
In a blocking method that increases the amount of computation per memory access, it is important to calculate matrix vector products with consecutive memory accesses and to distribute those calculations evenly across the CPUs so that they run in parallel. For this purpose, the entire matrix is updated by updating the lower triangular portion and then copying it to the upper triangular portion. In order to distribute the updating load evenly, the matrix is divided in the column direction into twice as many blocks as there are CPUs, and the blocks are assigned to the CPUs in pairs, so that the i-th CPU updates the i-th block and the (2×#CPU−(i−1))-th block. However, this method is disadvantageous in that the memory areas referred to in the calculation of matrix vector products are far apart, so that data from these areas interfere with other data in the cache and are difficult to keep in the cache.
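The pairing described above can be checked with a short sketch. It assumes that the matrix order `n` is divisible by twice the number of CPUs, that blocks are numbered from 1, and that the updating work of a column block is proportional to the number of lower triangular elements (diagonal included) it contains. The function names are hypothetical and chosen only for illustration.

```python
def block_pairs(num_cpus):
    """Pair column block i with block 2*num_cpus - (i - 1), for i = 1..num_cpus."""
    return [(i, 2 * num_cpus - (i - 1)) for i in range(1, num_cpus + 1)]


def lower_triangle_work(n, num_cpus, block):
    """Count lower triangular elements in one column block (1-based index)."""
    width = n // (2 * num_cpus)  # assumes n is divisible by 2 * num_cpus
    first = (block - 1) * width  # first column of the block, 0-based
    # Column j of an n x n matrix holds n - j elements on or below the diagonal.
    return sum(n - j for j in range(first, first + width))


def work_per_cpu(n, num_cpus):
    """Total lower triangular update work assigned to each CPU."""
    return [
        lower_triangle_work(n, num_cpus, a) + lower_triangle_work(n, num_cpus, b)
        for a, b in block_pairs(num_cpus)
    ]
```

For example, `block_pairs(4)` gives `[(1, 8), (2, 7), (3, 6), (4, 5)]`, and `work_per_cpu(16, 4)` gives `[34, 34, 34, 34]`: the load is even, but the first CPU works on blocks 1 and 8 at opposite ends of the matrix, which corresponds to the distant memory areas noted above.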
A method that reconciles updating based on matrix vector products and matrix products with even load distribution, and that also enables high-speed calculation of matrix vector products, is expected to improve performance greatly.
Non Patent Document 1 and Non Patent Document 2 disclose a fundamental algorithm for tridiagonalization and parallel processing of tridiagonalization, respectively.
Patent Document 1: Japanese Laid-open Patent Publication No. 2004-5528
Non Patent Document 1: G. H. Golub, C. F. Van Loan, Matrix Computations, Second Edition, Johns Hopkins University Press, 1989
Non Patent Document 2: J. Choi, J. J. Dongarra, and D. W. Walker, "The Design of a Parallel Dense Linear Algebra Software Library: Reduction to Hessenberg, Tridiagonal, and Bidiagonal Form", Engineering Physics and Mathematics Division, Mathematical Sciences Section, prepared by the Oak Ridge National Laboratory, managed by Martin Marietta Energy Systems, Inc., for the U.S. Department of Energy under Contract No. DE-AC05-84OR21400, ORNL/TM-12472