Parallel computer systems that distribute a calculation process across multiple computational nodes or multiple processor cores, and execute the process in parallel, have conventionally been used.
Techniques for evaluating the number of such parallel calculations and their efficiency are known. One known technique, directed to a parallel computer that includes a plurality of processors and operates by message communication, runs a simulator based on an execution result (trace) of a user program, determines a calculation time and a communication time, and evaluates the number of processors. Another known technique executes a process on a per-instruction-cycle basis by processor emulation, measures an execution status at the hardware level and an accessing-state time for each communication path, and evaluates system efficiency. Yet another known technique obtains information about the processing speeds of processors.
Related-art examples are discussed in Japanese Laid-open Patent Publication No. 06-59939, Japanese Laid-open Patent Publication No. 05-88912, and Japanese Laid-open Patent Publication No. 2006-53835.
Some parallel computers attempt to speed up processing by executing calculation and communication in parallel, that is, by causing calculation and communication to overlap. When calculation and communication overlap in this manner, the memory accesses issued for both may become a bottleneck for processing speed. In other words, there is a problem in that performance is degraded by the conflict that results from simultaneous execution of calculation and communication.
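The effect of such a conflict can be illustrated with a minimal sketch. The model below is an assumption made only for illustration and is not taken from the related art: the function names and the `slowdown` parameter are hypothetical, and the conflict is represented simply as a single factor by which both calculation and communication are stretched while they run at the same time.

```python
def sequential_time(calc: float, comm: float) -> float:
    """Elapsed time when communication starts only after calculation ends."""
    return calc + comm


def overlapped_time(calc: float, comm: float, slowdown: float = 1.0) -> float:
    """Elapsed time when calculation and communication are overlapped.

    Hypothetical model: while both are active, each proceeds at 1/slowdown
    of its standalone speed because they compete for memory access.
    Once the shorter one finishes, the remainder of the longer one runs
    at full speed. slowdown == 1.0 means no conflict at all.
    """
    if slowdown < 1.0:
        raise ValueError("slowdown must be >= 1.0")
    shorter, longer = min(calc, comm), max(calc, comm)
    contended_phase = shorter * slowdown   # both active, both slowed
    remaining_phase = longer - shorter     # only the longer one is left
    return contended_phase + remaining_phase


if __name__ == "__main__":
    calc, comm = 6.0, 2.0
    print("sequential:         ", sequential_time(calc, comm))
    print("overlap, no conflict:", overlapped_time(calc, comm, 1.0))
    print("overlap, slowdown 1.5:", overlapped_time(calc, comm, 1.5))
```

Under this assumed model, overlap without conflict reduces the elapsed time to that of the longer of the two activities, whereas a sufficiently large slowdown removes the benefit: with equal calculation and communication times and a slowdown factor of 2, the overlapped time equals the sequential time.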