In recent years, research has been carried out on increasing the operating speed of a program (hereinafter, may be referred to as an application program) that runs in a large-scale parallel computer system (hereinafter, may be referred to as a high performance computing (HPC) system).
Specifically, effective use of the cache of a central processing unit (CPU), for example, is studied as a way to speed up an application program in the HPC system. In this case, a researcher or the like (hereinafter, may be simply referred to as a researcher) of the HPC system, for example, runs the application program in the HPC system and acquires profile data that includes the status of cache use by the running application program. The researcher then, for example, interprets the acquired profile data to find a way to use the cache effectively.
Japanese Laid-open Patent Publication No. 8-241208 is an example of the related art. Siddhartha Chatterjee, Erin Parker, Philip J. Hanlon, and Alvin R. Lebeck, “Exact analysis of the cache behavior of nested loops,” in Cindy Norris and James B. Fenwick, Jr., editors, Programming Language Design and Implementation (PLDI-01), volume 36.5 of ACM SIGPLAN Notices, pages 286-297, N.Y., Jun. 20-22, 2001, ACM Press, is another example of the related art.
However, creating such profile data may require the researcher to use the HPC system (hereinafter, may be referred to as a real machine) for a long period of time. Thus, depending on constraints such as the time during which the real machine is available, the researcher may be unable to create sufficient profile data with the real machine.
Meanwhile, there exists a method of creating profile data by using a simulator of the HPC system. In this case, the researcher executes an application program in the simulator and collects information to be used for creation of the profile data. Accordingly, the researcher may acquire the profile data without using the HPC system for a long period of time.
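As a minimal sketch of the kind of information such a simulator may collect, the following Python example counts cache hits and misses for a list of memory addresses. It is an illustration only and not the simulator of any particular HPC system: the direct-mapped organization, the default line size and line count, and the names `DirectMappedCache` and `profile_trace` are all assumptions introduced here. Note that every access is processed one at a time, which is typical of this style of simulation.

```python
from dataclasses import dataclass, field


@dataclass
class DirectMappedCache:
    """Toy direct-mapped cache model (hypothetical; parameters are assumed)."""
    line_size: int = 64    # bytes per cache line (assumed value)
    num_lines: int = 256   # number of cache lines (assumed value)
    tags: dict = field(default_factory=dict)  # line index -> stored tag
    hits: int = 0
    misses: int = 0

    def access(self, address: int) -> bool:
        """Simulate one memory access; return True on a hit, False on a miss."""
        block = address // self.line_size   # memory block containing the address
        index = block % self.num_lines      # cache line the block maps to
        tag = block // self.num_lines       # identifies which block occupies the line
        if self.tags.get(index) == tag:
            self.hits += 1
            return True
        self.tags[index] = tag              # on a miss, the block replaces the old one
        self.misses += 1
        return False


def profile_trace(addresses, cache=None):
    """Feed an address trace through the cache and return simple profile data."""
    if cache is None:
        cache = DirectMappedCache()
    for address in addresses:
        cache.access(address)
    return {
        "accesses": cache.hits + cache.misses,
        "hits": cache.hits,
        "misses": cache.misses,
    }


if __name__ == "__main__":
    # Sequential 8-byte accesses over 1 KiB: one miss per 64-byte line.
    print(profile_trace(range(0, 1024, 8)))
```

In this sketch, a sequential sweep misses once per cache line, whereas two addresses that map to the same line and are accessed alternately miss on every access. Profile data of this kind is what the researcher would interpret to find a more effective use of the cache.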
However, the execution speed of the simulator may be significantly lower than that of the real machine. Thus, the researcher may be unable to create profile data efficiently when using the simulator.
Therefore, according to an aspect, it is desired to provide a storage medium storing a cache miss estimation program, a cache miss estimation method, and an information processing apparatus that may efficiently acquire information related to a cache miss occurring during execution of a program.