1. Field of the Invention
The present invention relates to a technology for analyzing the performance of a program that runs on a processor.
2. Description of the Related Art
Most current processors have a hardware counter that monitors performance by counting events inside the processor and events arising from interactions with components outside the processor.
For example, a Pentium (registered trademark) processor of Intel Corporation has a plurality of counters and is configured so that an event can be selected from various events, such as the number of clock cycles, the number of executed instructions, and the number of cache misses, and the selected event can be counted. A PowerPC (Performance Optimization With Enhanced RISC - Performance Computing) processor from International Business Machines Corporation is configured similarly, and a counter can be selected from plural counters to count an event.
Therefore, program running state information, such as the number of execution cycles, and bottleneck information, such as the number of cache miss cycles, can be obtained. Thus, a mechanism that provides information useful for improving software programs and their performance is incorporated as hardware in a processor.
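The selectable-counter arrangement described above can be pictured with a small software model. This is only an illustrative sketch: the names `Event`, `CounterBank`, `select`, `on_event`, and `read` are hypothetical and do not correspond to any real processor interface or to the registers of the processors named above.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical event types a counter could be programmed to count."""
    CLOCK_CYCLES = "clock_cycles"
    INSTRUCTIONS_EXECUTED = "instructions_executed"
    CACHE_MISSES = "cache_misses"


class CounterBank:
    """Toy model of a small bank of counters with selectable events."""

    def __init__(self, num_counters):
        self.selected = [None] * num_counters  # event assigned to each counter
        self.values = [0] * num_counters       # accumulated count per counter

    def select(self, index, event):
        """Program counter `index` to count `event`, resetting its value."""
        self.selected[index] = event
        self.values[index] = 0

    def on_event(self, event, count=1):
        """Deliver an event signal; only counters selecting it are incremented."""
        for i, chosen in enumerate(self.selected):
            if chosen is event:
                self.values[i] += count

    def read(self, index):
        """Return the accumulated value of counter `index`."""
        return self.values[index]


# Two counters: one counts cache misses, the other clock cycles.
bank = CounterBank(2)
bank.select(0, Event.CACHE_MISSES)
bank.select(1, Event.CLOCK_CYCLES)

bank.on_event(Event.CLOCK_CYCLES, 100)
bank.on_event(Event.CACHE_MISSES, 3)
bank.on_event(Event.INSTRUCTIONS_EXECUTED, 80)  # no counter selected this event

cache_misses = bank.read(0)
clock_cycles = bank.read(1)
```

In this model, events that no counter has selected are simply not recorded, which is why the choice of events must be made before the program section of interest is run.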
A technique utilizing a hardware counter in a processor has been disclosed (for example, Japanese Patent Laid-Open Publication No. 2004-318538). Techniques that generate and trace software events and display them in order to detect a bottleneck in software or in a system have also been disclosed (for example, Japanese Patent Laid-Open Publication Nos. H9-34850, H6-83608, and H5-35549). The techniques disclosed in these three publications handle software events (task names, function names, etc.).
Generally, a hardware counter is incorporated in a processor as a dedicated circuit, and to save area in the processor this circuit has a simple configuration that only counts hardware event signals cumulatively. Information acquired from the hardware counter (architecture information such as pipeline stalls, memory traffic, bus load information, etc.) is therefore output as an accumulated value for a specific section. Consequently, although information on the specific section as a whole can be acquired, information at micro intervals within that section cannot be acquired.
In other words, the acquired information cannot be narrowed down to a “point”; it is section information, that is, a “plane”. As a result, tuning and feedback to the system design based on the acquired information lack concreteness and tend to be ambiguous and abstract, which makes it difficult to locate the point that should be tuned.
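The limitation described above can be illustrated with a small simulation. The sketch below is not a real hardware interface: `AccumulatingCounter`, `measure_section`, and the per-interval event counts are hypothetical values chosen only to show that a counter which merely accumulates reports the same total for two sections whose behavior at micro intervals is very different.

```python
class AccumulatingCounter:
    """Toy model of a counter that can only add up event signals."""

    def __init__(self):
        self.value = 0

    def signal(self, events):
        """Accumulate the events signalled during one micro interval."""
        self.value += events


def measure_section(events_per_interval):
    """Read the counter before and after a section and return the difference."""
    counter = AccumulatingCounter()
    start = counter.value
    for events in events_per_interval:
        counter.signal(events)
    return counter.value - start


# Hypothetical cache-miss counts per micro interval for two sections.
steady = [5, 5, 5, 5, 5, 5, 5, 5]   # misses spread evenly over the section
spiky = [0, 0, 0, 40, 0, 0, 0, 0]   # all misses concentrated in one interval

steady_total = measure_section(steady)
spiky_total = measure_section(spiky)

# Both sections report the same accumulated value, so the interval
# containing the spike cannot be identified from the totals alone.
print(steady_total, spiky_total)
```

Because only the start and end readings are available, the concentrated burst in the second section is indistinguishable from the evenly distributed load in the first.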