Conventionally, there has been a technology that uses multiple threads to execute, in parallel, a matrix operation that generates the values of the elements of a predetermined matrix. For example, instead of executing a single matrix operation that generates the values of all elements of the matrix, multiple matrix operations that are together equivalent to the single matrix operation may be executed in parallel by multiple threads. Each of these operations generates the values of the elements of one of multiple submatrices, which are obtained by partitioning the matrix so that the submatrices have nearly the same dimension in the row direction or in the column direction.
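As an illustration only, the row-direction partitioning described above might be sketched as follows. The sketch assumes that the matrix operation is a matrix product and that matrices are held as lists of rows. The function names `split_ranges`, `matmul_rows`, and `parallel_matmul` are hypothetical and do not come from any cited document.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(n, parts):
    """Split range(n) into `parts` contiguous ranges whose sizes differ by at most 1."""
    base, extra = divmod(n, parts)
    ranges, start = [], 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def matmul_rows(a, b, row_range):
    """Generate rows [lo, hi) of the product of a and b (one submatrix)."""
    lo, hi = row_range
    inner, cols = len(b), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(lo, hi)
    ]


def parallel_matmul(a, b, num_threads=2):
    """Hypothetical helper: one thread per row block of the result matrix."""
    ranges = split_ranges(len(a), num_threads)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        # map() yields the blocks in the same order as `ranges`.
        blocks = pool.map(lambda r: matmul_rows(a, b, r), ranges)
        return [row for block in blocks for row in block]
```

Each thread generates one row block of the result, and the blocks are concatenated in order. In standard CPython the global interpreter lock prevents pure-Python threads from running this arithmetic simultaneously, so the sketch shows the partitioning scheme rather than an actual speedup.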
As prior art, for example, there has been a technology of matrix multiplication that multiplies, in parallel, partial row vectors obtained by partitioning the rows of one matrix by partial column vectors obtained by partitioning the columns of the other matrix, and adds the multiplication results, thereby outputting the product by partial sum-of-product operations. Also, for example, there has been a technology of LU decomposition that adjusts the number of block stages based on the size of the LU decomposition so that the remaining LU decompositions are executed at a high speed by vector operations.
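The partial sum-of-product multiplication mentioned above might be sketched as follows. This is a minimal illustration, not the method of any cited document. It assumes that the rows of the first matrix and the columns of the second are cut at the same positions along the inner dimension. The names `partial_product` and `matmul_partial_sums` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_product(a, b, k_range):
    """Multiply partial row vectors a[i][lo:hi] by partial column vectors b[lo:hi][j]."""
    lo, hi = k_range
    rows, cols = len(a), len(b[0])
    return [
        [sum(a[i][k] * b[k][j] for k in range(lo, hi)) for j in range(cols)]
        for i in range(rows)
    ]


def matmul_partial_sums(a, b, k_ranges):
    """Hypothetical helper: compute the partial products in parallel, then add them."""
    with ThreadPoolExecutor(max_workers=len(k_ranges)) as pool:
        partials = list(pool.map(lambda r: partial_product(a, b, r), k_ranges))
    rows, cols = len(a), len(b[0])
    # Element-wise sum of the partial results gives the full product.
    return [[sum(p[i][j] for p in partials) for j in range(cols)] for i in range(rows)]
```

For a 1x3 matrix multiplied by a 3x1 matrix with the inner dimension cut into `(0, 2)` and `(2, 3)`, the two partial results are 14 and 18, and their sum, 32, equals the full inner product.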
[Related-Art Documents]