1. Field of the Invention
The present invention relates to information processing systems and, more particularly, to software tools and methods for monitoring, modeling, and enhancing system performance.
2. Description of Related Art
To enhance system performance, it is helpful to know which modules within a system are most frequently executed. These most frequently executed modules are referred to as "hot" modules. Within these hot modules, it is also useful to know which lines of code are the most frequently executed. When there is a point in the code where one of two or more branches may be taken, it is useful to know which branch is the mainline path, or the branch most frequently taken, and which branch or branches are the exception branches.
A programmer hoping to improve system performance should focus his or her efforts on improving the performance of the hot modules. Improving the performance of the most frequently executed modules will have the greatest effect on overall system performance. It does not make sense to spend much time improving the performance of modules that are rarely executed, as this will have little, if any, effect on overall system performance.
A programmer hoping to improve the performance of a module will group the instructions in the mainline branches of the module closely together. Keeping the mainline code packed closely together increases the likelihood of cache hits, since the mainline code is the code that will most likely be loaded into the instruction cache.
Performance tools are used to examine program code to determine the most frequently executed modules and instructions in a system. Performance tools may be implemented in hardware or software. Hardware performance tools are usually built into the system. Software performance tools may be built into the system or added at a later point in time. Performance tools implemented in software are especially useful in systems, such as personal computer systems, that do not contain many, if any, built-in hardware performance tools.
Some prior art software performance tools use an interrupt-driven method to monitor performance. Typically, the system is interrupted at set time intervals. At each interrupt, the performance tool samples the code that is running and adds data into a log.
There are several problems with this prior art approach. Because the code is sampled once per interrupt, the programmer never sees any data pertaining to code that is "disabled for interrupts" (i.e., code in which interrupts are masked). The interrupt that stops the system and allows the performance monitoring to take place can never occur while such code is executing.
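By way of illustration only, the blind spot described above can be modeled with a short simulation. The timeline, the module names, and the `sample` function are hypothetical and are not taken from any prior art tool. The sketch further assumes that a timer interrupt arriving while interrupts are masked is held pending and delivered as soon as unmasked code runs.

```python
from collections import Counter

# Hypothetical timeline: (module name, duration in ticks, interrupts masked?)
TIMELINE = [
    ("scheduler", 3, True),   # runs with interrupts masked
    ("parser", 5, False),
    ("scheduler", 3, True),
    ("io_wait", 4, False),
]

def sample(timeline, interval, repeats=1):
    """Count the modules observed by a timer interrupt raised every `interval` ticks."""
    log = Counter()
    tick = 0
    pending = False  # assumption: an interrupt raised while masked is held pending
    for _pass in range(repeats):
        for module, duration, masked in timeline:
            for _step in range(duration):
                if tick % interval == 0:
                    pending = True          # the timer fires
                if pending and not masked:
                    log[module] += 1        # the sample is charged to unmasked code
                    pending = False
                tick += 1
    return log

profile = sample(TIMELINE, interval=2, repeats=100)
print(sorted(profile.items()))
```

In this model the hypothetical `scheduler` module accounts for 6 of every 15 ticks, yet it never appears in the log, because every sample is charged to whichever unmasked code runs next.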
Another problem with this prior art approach is that the storage area quickly fills with data. This means that the performance tool can be run for only a very short period of time, or the tool must stop the system to unload the data to another storage area, such as a disk. Stopping the system is very intrusive, as this type of stop would not occur during normal operations. Thus, stopping the system to unload performance data itself alters the performance of the system that is being monitored.
Furthermore, sampling once per interrupt gives a ragged view of the performance data. It is difficult to understand accurately what is happening in the system because the performance data is collected at points that bear no relation to the structure of the code. No data is collected pertaining to a sequence of instructions running in the consecutive order in which they were intended to execute.
Another type of prior art software performance tool keeps track of sequences of instructions by logging every instruction as it executes. However, there are two problems associated with this type of prior art approach. First, the storage area fills with data even more quickly than with the interrupt-driven performance tool. Therefore, this type of tool can be run for only very short periods of time, or the data must be unloaded to another storage area so often that the tool becomes prohibitively intrusive. Second, there is a danger that the data collected will not be representative of the system as a whole. For example, a branch instruction that can take either path one or path two may happen to take only path one during the time the performance tool is monitoring the system. Thus, to the programmer, it will appear that path one is the most frequently executed path. This may or may not be true. It could be that path two is much more frequently executed, but because the performance tool is run for such a short period of time, and because the branch happened to follow path one during that period, the programmer will be misled.
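By way of illustration only, the following sketch shows how a short trace window can misidentify the mainline path. The branch history, the window size, and the `mainline` helper are hypothetical and chosen solely to make the effect visible. They do not represent measurements of any actual system.

```python
from collections import Counter

# Hypothetical branch history: path "one" during a brief initial phase,
# then path "two" for the remainder of the run.
history = ["one"] * 50 + ["two"] * 950

def mainline(outcomes):
    """Return the path taken most often in the observed outcomes."""
    return Counter(outcomes).most_common(1)[0][0]

# Assumption: the trace buffer fills after only 40 branch outcomes.
short_window = history[:40]

print("short trace reports:", mainline(short_window))
print("full run reports:   ", mainline(history))
```

The short trace sees only the initial phase and therefore reports path one as the mainline, even though path two dominates the full run.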
The recent growth in multi-processor systems has created another problem for both types of prior art performance tools. Prior art performance tools usually use one storage area to collect performance data, regardless of the number of processors in the system. Thus, in a multi-processor system, some type of serialization must be used to ensure that more than one processor is not writing to the storage area at the same time. In addition, this serialization may lead to inaccuracies in the performance data. While one processor is writing its performance data, a second processor may be waiting to write its performance data, and this waiting time will be reflected in the second processor's performance data. This wait time is not part of normal processing.
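By way of illustration only, the wait time introduced by a single shared storage area can be modeled as follows. The `simulate_shared_log` function and its parameters are hypothetical. The sketch assumes that every processor finishes the same amount of work at the same moment and then writes one record, with writes strictly serialized.

```python
def simulate_shared_log(processors, work_ticks, write_ticks):
    """Return the ticks each processor spends waiting for the shared log."""
    log_free_at = 0   # tick at which the shared storage area next becomes free
    waits = []
    for _cpu in range(processors):
        ready = work_ticks                 # assumption: all finish work together
        start = max(ready, log_free_at)    # serialization: one writer at a time
        waits.append(start - ready)        # time that is not normal processing
        log_free_at = start + write_ticks
    return waits

waits = simulate_shared_log(processors=4, work_ticks=10, write_ticks=2)
print(waits)
```

Under these assumptions only the first processor writes without delay. Each later processor accumulates additional wait time, which a tool would then record as though it were part of that processor's normal execution.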
Another problem with many prior art performance tools implemented in software is that they require changes to be made to both operating system and application code. All code must be recompiled before any performance monitoring takes place. Thus, the changed and recompiled code is the code that is actually monitored, rather than the actual code that eventually becomes part of the final software product.
Consequently, it would be desirable to have a minimally-intrusive system performance tool that would not require a large memory area to store data and that would accurately depict performance for both single and multi-processor systems. It would be desirable to have such a performance tool monitor system performance with minimal changes to the operating system and no changes to application code that is being monitored. In addition, it would be desirable to have such a tool both identify the hot modules within a system and identify the mainline code paths through the system.