Collecting performance data in an operating computer system is a frequent and extremely important task performed by hardware and software engineers. Hardware engineers need performance data to determine how new computer hardware operates with existing operating systems and application programs.
Specific designs of hardware structures, such as processors, memories, and caches, can have drastically different, and sometimes unpredictable, utilizations for the same set of programs. It is important that flaws in the hardware be identified so that they can be corrected in future designs. Performance data can identify how efficiently software uses hardware, and can be helpful in designing improved systems.
Software engineers need to identify critical portions of programs. For example, compiler writers would like to find out how the compiler schedules instructions for execution, or how well the execution of conditional branches is predicted, to provide input for software optimization. Similarly, it is important to understand the performance of the operating system, kernel, device drivers, and application software programs.
It is difficult to accurately monitor the performance of hardware and software systems without disturbing the operating environment of the computer system, particularly if the performance data is collected over extended periods of time, such as many days or weeks. In many cases, performance monitoring systems are hand crafted. Costly hardware and software modifications may need to be implemented to ensure that operations of the system are not affected by the monitoring systems.
One way that the performance of a computer system can be monitored is by using performance counters. Performance counters "count" occurrences of significant events in the system. Significant events can include, for example, cache misses, instructions executed, I/O data transfer requests, and so forth. By periodically sampling the performance counters, the performance of the system can be deduced.
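By way of illustration only, periodic sampling of a performance counter can be sketched as follows. The sketch assumes a free-running counter of a fixed width that wraps to zero on overflow; the 16-bit width, the sample values, the sampling interval, and the function names are hypothetical and are not taken from any particular system.

```python
COUNTER_BITS = 16  # assumed counter width, for illustration only


def counter_deltas(samples, bits=COUNTER_BITS):
    """Return the number of events counted between consecutive samples.

    The subtraction is taken modulo 2**bits so that a single wraparound
    of the counter between two samples is still counted correctly.
    """
    modulus = 1 << bits
    return [(curr - prev) % modulus for prev, curr in zip(samples, samples[1:])]


def event_rates(samples, interval_seconds, bits=COUNTER_BITS):
    """Convert per-interval event counts into events per second."""
    return [delta / interval_seconds for delta in counter_deltas(samples, bits)]


# Hypothetical readings of a cache-miss counter taken every 0.5 seconds.
# The counter wraps past 65535 between the second and third readings.
samples = [65000, 65400, 264, 1264]
deltas = counter_deltas(samples)
rates = event_rates(samples, 0.5)
```

In this sketch only the differences between successive readings are used, so the absolute counter value at any one sample is not significant. The sketch yields event counts and rates per interval, but not which instruction or data access caused any individual event.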
In addition to sampling the actual events, it would also be useful to know the exact instruction, or the data accessed, that is associated with each event. However, with most performance counters, the program counter (pc) value that is available on the interrupt is the pc of the next instruction to be executed (the interrupt return address) after the interrupt which samples the counter completes processing. In most cases, this next instruction is not the instruction that caused the event that triggered the interrupt, but rather some later instruction.
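The mismatch between the instruction that causes an event and the pc reported on the interrupt can be illustrated with a toy model. The sketch below is a simplified simulation under stated assumptions, not a description of any actual processor: the instruction trace, the addresses, the `period` parameter (how many events elapse between interrupts), and the `skid` parameter (how many further instructions execute before the interrupt is taken) are all hypothetical.

```python
def sample_with_skid(trace, period, skid):
    """Pair each sampled event's causing pc with the pc the interrupt reports.

    trace  -- list of (pc, caused_event) tuples in execution order
    period -- an interrupt is raised on every `period`-th event
    skid   -- assumed number of additional instructions that execute
              before the interrupt is actually taken

    The reported pc is modeled as the interrupt return address, that is,
    the pc of the next instruction to be executed once the interrupt
    has been taken.
    """
    pairs = []
    events_seen = 0
    for index, (pc, caused_event) in enumerate(trace):
        if not caused_event:
            continue
        events_seen += 1
        if events_seen % period == 0:
            return_index = index + 1 + skid
            if return_index < len(trace):  # ignore samples past the end of the trace
                pairs.append((pc, trace[return_index][0]))
    return pairs


# Hypothetical trace: the instructions at 0x1004 and 0x1010 each cause an event.
trace = [
    (0x1000, False), (0x1004, True), (0x1008, False), (0x100C, False),
    (0x1010, True), (0x1014, False), (0x1018, False), (0x101C, False),
]
pairs = sample_with_skid(trace, period=1, skid=1)
for causing_pc, reported_pc in pairs:
    print(f"event caused at {causing_pc:#x}, interrupt reports {reported_pc:#x}")
```

In this model, every reported pc lies past the instruction that caused the event, so attributing events to the reported pc would charge them to the wrong instructions.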
Therefore, it is desired to directly determine the locations of instructions or data that, when accessed, are indicative of significant processor events.