With advances in technology, multi-core processors (for example, central processing units (CPUs)) that contain multiple cores, each of which is a processing unit, have become widespread in recent years. Dual-core processors having two cores and quad-core processors having four cores are examples of multi-core processors. Further, in a trend referred to as multi-core implementation, processors containing several tens of cores in a single large-scale integrated circuit (LSI) have appeared, and it is predicted that LSIs with multiple cores will be increasingly developed in the future.
Furthermore, multi-node (process), multi-core systems formed of several thousand to several tens of thousands of computing nodes having such LSIs are now becoming the mainstream of supercomputers in the high performance computing (HPC) field. Such supercomputers have long been used extensively for large-scale simulations in fields such as weather forecasting, biology (for example, gene analysis), and nanotechnology, and have thereby contributed to the development of various types of science and technology.
In addition, in parallel with the development of multi-node, multi-core supercomputers, various large-scale simulation techniques in the scientific computing field have also been developed. For example, one proposed technique ensures that the bases (collections of discrete points whose positions are used to calculate physical quantities) differ among individual simulation processes when those simulation processes are executed in parallel with each other.
In multi-node systems, the time spent on communication among the interconnected processors sometimes adversely affects the processing time of the application program being executed. It is therefore important to reduce communication processing in order to speed up the application program. One technique for such speed-up is, for example, to prevent an increase in the communication load of a master unit by managing, among slave units, the boundary value data obtained from an analysis operation. In this way, the computational algorithms of applications and parallelization techniques have been improved for multi-node systems having multiple computing nodes, thereby improving processing efficiency. See, for example, Japanese Laid-open Patent Publications Nos. 10-154136 and 2002-123497.
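The effect of keeping boundary value data among the slave units can be illustrated with a toy model. The following Python sketch is not the method of the cited publications; it assumes a one-dimensional domain split into equal subdomains, assumes that each worker needs one boundary value from each neighbor, and counts a relayed value as two master messages (one receive, one send). The names `split_domain` and `exchange_boundaries` are hypothetical.

```python
def split_domain(values, n_workers):
    """Split a 1-D domain into equal contiguous subdomains.

    Assumption: len(values) is divisible by n_workers.
    """
    size = len(values) // n_workers
    return [values[i * size:(i + 1) * size] for i in range(n_workers)]


def exchange_boundaries(subdomains, via_master):
    """Deliver each worker's edge values to its neighbors.

    Returns the received boundary values per worker and the number of
    messages the master had to handle. If via_master is True, every
    boundary value is relayed by the master (one receive plus one send);
    otherwise the workers exchange the values directly.
    """
    halos = [{"left": None, "right": None} for _ in subdomains]
    master_messages = 0
    last = len(subdomains) - 1
    for i, sub in enumerate(subdomains):
        if i > 0:
            # First value of worker i becomes the right boundary of worker i-1.
            halos[i - 1]["right"] = sub[0]
            master_messages += 2 if via_master else 0
        if i < last:
            # Last value of worker i becomes the left boundary of worker i+1.
            halos[i + 1]["left"] = sub[-1]
            master_messages += 2 if via_master else 0
    return halos, master_messages


subdomains = split_domain(list(range(8)), 4)
halos_relayed, load_relayed = exchange_boundaries(subdomains, via_master=True)
halos_direct, load_direct = exchange_boundaries(subdomains, via_master=False)
```

Under these assumptions both variants deliver the same boundary values, but the master's message count grows with the number of workers only in the relayed variant.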
However, conventional improvements to parallelization techniques have focused on speeding up the parallel processing of processes executed by the individual nodes of a multi-node system, and insufficient study has been devoted to speeding up the parallel processing of threads executed by the individual cores of a multi-core processor. For example, multi-node parallel processing technology is applied to multi-core systems simply by distributing processing equally across the individual cores (threads). No attention has been paid to speeding up parallel processing in a way that takes the characteristics of multi-core systems into account.
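The equal-distribution approach described above can be sketched as follows. This is a minimal illustration only: the function names `equal_chunks` and `parallel_sum_of_squares` and the sum-of-squares workload are hypothetical and are not taken from the cited publications.

```python
from concurrent.futures import ThreadPoolExecutor


def equal_chunks(n_items, n_threads):
    """Split range(n_items) into n_threads contiguous, near-equal ranges."""
    base, extra = divmod(n_items, n_threads)
    chunks = []
    start = 0
    for t in range(n_threads):
        # The first `extra` threads take one additional item each.
        size = base + (1 if t < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def parallel_sum_of_squares(values, n_threads):
    """Give each thread one chunk and combine the partial results."""
    chunks = equal_chunks(len(values), n_threads)

    def work(chunk):
        return sum(values[i] * values[i] for i in chunk)

    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return sum(pool.map(work, chunks))
```

The partition depends only on the item count and the thread count, which is the sense in which such a scheme takes no account of the hardware it runs on. The sketch shows the partitioning scheme only; in standard CPython builds the global interpreter lock prevents these threads from executing Python bytecode on several cores at once.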