1. Field of the Invention
The present invention relates generally to general-purpose, stored-program digital computers and more particularly to an efficient means for monitoring the performance of various portions of a computer system.
2. Description of the Prior Art
The term "performance monitoring" refers to the process of monitoring the performance of various system components within a computer system while the computer system is operating under normal operating conditions. Performance monitoring is a key factor in the operation and maintenance of many of today's complex computer systems.
In the past several decades, the demand on computer systems has steadily increased. Today's software packages require much more processing power and storage capacity than those produced just a few years ago. In addition, many more people are using computers to do tasks that were traditionally done using other means. Because computer systems remain relatively expensive to purchase and maintain, many end users are operating their computer systems at a much higher capacity than in the past. This increased demand results in a higher probability that performance problems will occur in a given system.
Many factors may degrade the performance of a computer system. First, there may be a bottleneck at the input/output interface, causing the CPU to idle for a substantial portion of the time while waiting for data. Second, the users of a system may routinely execute a particular computer program. If the system is not configured properly, the system may be required to load the computer program from an external disk into internal memory each time the program is executed, thereby unnecessarily degrading system performance. In this example, system performance could be increased by recognizing that this is occurring, preferably by using performance monitoring techniques, and changing the system's configuration to keep the particular computer program in the computer's internal memory during peak usage periods. Finally, there may not be enough internal memory within the computer system to store all of the computer programs that are to be simultaneously executed by the users. This can result in "disk swapping". Disk swapping occurs when internal memory limitations require a computer program, or the data resulting from the computer program, to be loaded from and unloaded to an external storage disk each time a process becomes active. Disk swapping can also occur when a single process is executing. Disk swapping can be a particular problem in multi-user systems and in systems that utilize re-entrant computer programs.
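The effect of limited internal memory on disk loads can be illustrated with a small simulation. The following is a minimal sketch only, under stated assumptions: memory is modeled as a fixed number of program-sized slots, eviction is least-recently-used, and the function name `count_disk_loads`, its parameters, and the program names are hypothetical and chosen purely for illustration. It does not describe any particular operating system or the monitoring means of the invention.

```python
from collections import OrderedDict


def count_disk_loads(activations, memory_slots, pinned=()):
    """Count how many times a program must be loaded from disk.

    activations  -- program names, in the order the programs become active
    memory_slots -- number of programs that fit in internal memory at once
    pinned       -- programs configured to stay resident (never evicted)

    Simplified model (an assumption): every program occupies one slot and
    the least recently used unpinned program is evicted when memory is full.
    """
    pinned = set(pinned)
    if len(pinned) >= memory_slots:
        raise ValueError("at least one memory slot must remain unpinned")

    # Pinned programs are loaded once up front and stay resident.
    loads = len(pinned)
    # Unpinned resident programs, ordered from least to most recently used.
    resident = OrderedDict()
    free_slots = memory_slots - len(pinned)

    for name in activations:
        if name in pinned:
            continue  # already in memory, no disk access needed
        if name in resident:
            resident.move_to_end(name)  # mark as most recently used
            continue
        if len(resident) >= free_slots:
            resident.popitem(last=False)  # evict the least recently used
        resident[name] = True
        loads += 1  # the program had to be fetched from disk
    return loads


# Hypothetical workload: three programs activated in rotation, four times.
workload = ["editor", "compiler", "report"] * 4

print(count_disk_loads(workload, memory_slots=2))                      # 12
print(count_disk_loads(workload, memory_slots=2, pinned=["editor"]))   # 9
print(count_disk_loads(workload, memory_slots=3))                      # 3
```

In this toy model, two memory slots shared by three rotating programs force a disk load on every activation. Keeping one frequently used program resident removes some of those loads, and adding a third slot removes all but the initial ones. Real systems differ in many respects, but the counts show the kind of pattern that performance monitoring data may reveal.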
The above examples are given only to illustrate the necessity for performance monitoring techniques within a computer system and are not intended as an exhaustive list. It is recognized that many other performance inhibitors exist in modern computer systems and that many of them may be detected by using performance monitoring techniques.
Another, less obvious, motivation for monitoring the performance of a computer system is to debug a particular system during system development or to debug a particular software program during software development. Often it is unknown where the bottlenecks are likely to occur within a computer system or software program that is under development. Performance monitoring techniques can be used to produce data that can be statistically analyzed to provide computer designers and software developers insight into where in the computer system future bottlenecks or problems are likely to occur.
Performance monitoring of today's computer systems is typically provided by off-the-shelf software packages. Examples of such off-the-shelf performance monitoring software packages include CMF baseline, the Torch program available from Datametrics, the SIP Database written by Structural Metals Inc. and available through the USE Program Library Interchange (UPLI), the ALICE module of the SYSTAR products, and the Online Activity Monitor (OSAM) available from TeamQuest. These software packages are executed on a particular computer or computer network and generate performance data based on a number of preselected factors. One such method is discussed in "Getting Started in 1100/2200 Performance Monitoring", by George Gray, UNISPHERE Magazine, November 1993.
These off-the-shelf software packages may prove useful for some users, but they are not an ideal solution for others. Problems that exist with these software packages include: (1) only the performance parameters selected by the software developer are available to the user; (2) the software packages are typically available only for standard computer systems and therefore cannot be used during the development stage of a computer system or on lesser-known computer systems without independent development of the performance monitoring software; (3) the software packages are typically run concurrently with, and on the same CPU as, the user software and therefore may slow down system performance while the performance monitoring software is executed; and (4) only hardware that is accessible to the software package, such as CPU activity and I/O requests, can be monitored by these software packages.
Problems (1) and (2) listed above may be minimized by having the user write a customized performance monitoring software package for the user's system. However, developing such a program requires a significant investment of resources. Problems (3) and (4) listed above typically cannot be eliminated by having the user write a customized software package, for several reasons. First, only the nodes within the computer system that are accessible to the performance monitoring software can be monitored. This limitation is a result of having the performance monitoring strategy determined after the computer hardware is designed. Many nodes within a computer system are neither controllable nor observable via software. Second, the performance monitoring software is run on the same CPU as the user programs and therefore may decrease overall system performance. Finally, since the performance monitoring software may affect the performance of the system that the software is attempting to measure, the overall accuracy of the results obtained by the performance monitoring software packages may be limited.