Systems for monitoring computer system performance are extremely important to hardware and software engineers. Hardware engineers need such systems to determine how new computer hardware architectures perform with existing operating systems and application programs. Specific designs of hardware structures, such as memory and cache, can have drastically different, and sometimes unpredictable, utilizations for the same set of programs. It is important that any flaws in the hardware architecture be identified before the hardware design is finalized.
Software engineers need to identify critical portions of programs. For example, compiler writers would like to find out how the compiler schedules instructions for execution, or how well the execution of conditional branches is predicted, to provide input for code optimization.
Accurately monitoring the performance of hardware and software systems is difficult. Known systems typically are handcrafted. Costly hardware and software modifications may need to be implemented to ensure that system operations are not affected by the monitoring systems.
Many monitoring systems are known for different hardware and software environments. One class of systems simply counts the number of times each basic block of machine executable instructions is executed. A basic block is a group of instructions where all the instructions of the group are executed if the first instruction of the group is executed. The counts can be studied to identify critical portions of the program.
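As an illustration only, basic block counting can be sketched in a few lines of Python. The sketch assumes that the executed instruction addresses are available as a sequence and that the first address of each basic block (its "leader") is already known. The function name, the addresses, and the three-block program are hypothetical and are not taken from any particular monitoring system.

```python
from collections import Counter

def count_basic_blocks(trace, leaders):
    """Count how many times each basic block is executed.

    trace   -- iterable of executed instruction addresses, in order
    leaders -- set holding the address of the first instruction of each block
    """
    # Start every block at zero so blocks that never run still appear.
    counts = Counter({leader: 0 for leader in leaders})
    for addr in trace:
        # By definition, executing the first instruction of a block
        # means the whole block executes, so only leaders are counted.
        if addr in leaders:
            counts[addr] += 1
    return counts

# Hypothetical program: entry block at 0x00, loop body at 0x08, exit at 0x10.
leaders = {0x00, 0x08, 0x10}
trace = [0x00, 0x04, 0x08, 0x0C, 0x08, 0x0C, 0x08, 0x0C, 0x10]
counts = count_basic_blocks(trace, leaders)

for leader in sorted(counts):
    print(f"block at {leader:#04x}: executed {counts[leader]} time(s)")
```

In this sketch the loop body at 0x08 receives the highest count, which is the kind of result used to single out a critical portion of a program.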
Monitoring of references to instruction and data addresses is usually performed by tracing systems. Data address traces can be used to improve the design of caches and to increase the efficiency of in-memory data structures. Instruction address traces can identify unanticipated execution paths.
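One way a data address trace can inform cache design is by replaying the trace through a simple cache model and comparing hit rates across candidate geometries. The following is a minimal sketch that assumes a direct-mapped cache. The function name, the parameters num_lines and line_size, and the sample trace are hypothetical choices made for illustration.

```python
def direct_mapped_hit_rate(addresses, num_lines=4, line_size=16):
    """Replay a data address trace through a direct-mapped cache model.

    addresses -- iterable of byte addresses referenced by the program
    num_lines -- number of cache lines (assumed geometry)
    line_size -- bytes per cache line (assumed geometry)
    Returns the fraction of references that hit in the cache.
    """
    tags = [None] * num_lines  # tag currently stored in each cache line
    hits = 0
    total = 0
    for addr in addresses:
        block = addr // line_size   # memory block holding this address
        index = block % num_lines   # cache line the block maps to
        tag = block // num_lines    # distinguishes blocks sharing a line
        if tags[index] == tag:
            hits += 1
        else:
            tags[index] = tag       # miss: the new block replaces the old one
        total += 1
    return hits / total if total else 0.0

# Hypothetical trace: addresses 0 and 64 map to the same line and evict each other.
sample_trace = [0, 4, 8, 64, 0]
small_cache = direct_mapped_hit_rate(sample_trace, num_lines=4, line_size=16)
larger_cache = direct_mapped_hit_rate(sample_trace, num_lines=8, line_size=16)
print(f"4 lines: {small_cache:.2f}, 8 lines: {larger_cache:.2f}")
```

Replaying the same trace with different values of num_lines or line_size gives a rough comparison of designs, under the stated assumption that the trace is representative of the workload.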
In another class of systems, the simulated operation of the computer system is monitored. Simulators attempt to mimic the behavior of computer systems without actually executing software in real time.
There are problems with traditional monitoring systems. Most systems monitor a limited number of specific system characteristics, for example, executed instructions or referenced data. It is difficult for users to modify such systems for other purposes. Building specialized systems is not a viable solution since the number of system characteristics to be monitored is large and variable. If the performance data supplied by the monitoring system is less than what is desired, the system is of limited use. If the system supplies too much performance data, the system is inefficient.
Most monitoring systems that count basic blocks accumulate counts for all the blocks of the program. Other than by tedious modifications, it usually is not possible to monitor only selected blocks of interest.
Most known tracing systems gather detailed address data inefficiently. A typical trace for a small program can include gigabytes of trace data. A user interested in monitoring just the branch behavior of a program has to sift through entire traces just to find, for example, conditional branch instructions.
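To make the cost of such sifting concrete, the following sketch scans a complete trace and keeps only the conditional branch records. The record layout (pc, opcode, taken) and the opcode mnemonics are assumptions made for illustration and do not correspond to any specific trace format or instruction set.

```python
from typing import Iterable, Iterator, NamedTuple

class TraceRecord(NamedTuple):
    """Hypothetical layout of one record in an instruction trace."""
    pc: int        # address of the executed instruction
    opcode: str    # instruction mnemonic
    taken: bool    # for branches, whether the branch was taken

# Assumed mnemonics for conditional branches; a real set depends on the ISA.
CONDITIONAL_BRANCHES = frozenset({"beq", "bne", "blt", "bge"})

def conditional_branches(trace: Iterable[TraceRecord]) -> Iterator[TraceRecord]:
    """Yield only the conditional branch records of a trace.

    Every record must still be read and tested, which is why filtering
    after the fact is costly for multi-gigabyte traces.
    """
    for rec in trace:
        if rec.opcode in CONDITIONAL_BRANCHES:
            yield rec

def taken_ratio(trace: Iterable[TraceRecord]) -> float:
    """Fraction of conditional branches in the trace that were taken."""
    taken = total = 0
    for rec in conditional_branches(trace):
        total += 1
        taken += rec.taken
    return taken / total if total else 0.0

# Hypothetical five-record trace containing two conditional branches.
sample = [
    TraceRecord(0x00, "add", False),
    TraceRecord(0x04, "beq", True),
    TraceRecord(0x08, "load", False),
    TraceRecord(0x0C, "bne", False),
    TraceRecord(0x10, "store", False),
]
branches = list(conditional_branches(sample))
ratio = taken_ratio(sample)
print(len(branches), ratio)
```

Even in this small example, most of the records read are discarded, which reflects the inefficiency of gathering a full trace when only branch behavior is of interest.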
Simulating the execution of programs at the instruction level can consume enormous quantities of system resources. In addition, it is extremely difficult to accurately simulate the hardware and software behavior of a complex computer system. Simulated performance data does not always reliably reflect real run-time performance.
There also are problems with the means used to communicate performance data. Most systems use expensive inter-processor data communications channels to communicate performance data. Inter-processor communication channels are generally inefficient and may disturb the processing environment being monitored. Some systems require difficult modifications to the operating system to improve the efficiency of monitoring. Furthermore, unfiltered performance data can consume large quantities of disk storage space.
There is a need for a flexible and efficient monitoring system that can easily be adapted to a diverse set of monitoring tasks, ranging from basic block counting to measuring cache utilization. The performance data should be precise and should reflect the actual operation of the computer system.