I. Technical Field
The present invention relates to methods and systems for monitoring the use of a hardware component in a computing system.
II. Background Information
In a computing system, the utilization of a piece of hardware is a measurement of the working rate or usage of the hardware. It is typically expressed as a percentage of the maximum working rate or maximum usage of the piece of hardware. For example, if a processor is said to be at sixty percent utilization, then it is working at sixty percent of its maximum working rate. At this rate, forty percent of the clock cycles of the processor are idle clock cycles and serve no useful purpose in the operation of the computing system. In other words, the processor could take on additional processing equal to forty percent of its maximum capacity. Similarly, if a computer memory is at five percent utilization, then ninety-five percent of the memory is unused and is available to store additional information.
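The percentage definition above can be illustrated with a minimal sketch. The function names `utilization_percent` and `headroom_percent`, and the notion of passing a "busy" amount and a "capacity" in the same units, are assumptions made for illustration only and do not appear in the source.

```python
def utilization_percent(busy: float, capacity: float) -> float:
    """Return usage as a percentage of the maximum working rate or usage.

    `busy` and `capacity` are hypothetical inputs in the same units,
    e.g. busy clock cycles vs. total clock cycles, or bytes used vs. bytes installed.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * busy / capacity


def headroom_percent(busy: float, capacity: float) -> float:
    """Return the unused share of the maximum, as a percentage of that maximum."""
    return 100.0 - utilization_percent(busy, capacity)


# Processor example: 60 busy cycles out of every 100.
cpu_utilization = utilization_percent(60, 100)
cpu_headroom = headroom_percent(60, 100)

# Memory example: 5 units used out of 100.
memory_headroom = headroom_percent(5, 100)
```

Note that the headroom is measured against the maximum capacity, not against the current load.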
In a computing system, utilization of hardware components such as a processor, a memory, or a communications link may be monitored to provide information about the operation of the computing system. This information is useful to administrators and users of computing systems. For example, utilization information may be used to determine whether there is a bottleneck in the computing system due to a piece of hardware. If a bottleneck is identified, then action can be taken to improve the overall operation of the system. Other reasons for obtaining utilization information include determining whether an existing computing system can cope with a forecast increase in load, load balancing a networked computing system to use existing hardware efficiently, and planning hardware upgrades.
An operating system loaded on the computing system typically provides functionality that allows for a determination of the utilization of various hardware components. One may collect utilization information for a piece of hardware by regularly sampling and storing a value representing its utilization. For example, its level of use may be sampled every six seconds. However, this method results in a large amount of stored data. At a six-second sampling interval, six hundred samples are taken and stored every hour for just one piece of hardware. Over the period of one year this amounts to over five million sampled values. Accordingly, such a method is not suitable for monitoring numerous components simultaneously, or for use over an extended period of time, because the volume of stored data becomes excessively large.
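The storage figures above follow from simple arithmetic, sketched below. The helper names and the 365-day year are assumptions made for illustration; only the six-second interval comes from the text.

```python
SECONDS_PER_HOUR = 60 * 60
HOURS_PER_YEAR = 24 * 365  # assumes a non-leap year


def samples_per_hour(interval_seconds: int) -> int:
    """Number of samples stored per hour for one piece of hardware."""
    return SECONDS_PER_HOUR // interval_seconds


def samples_per_year(interval_seconds: int, components: int = 1) -> int:
    """Number of samples stored per year across `components` pieces of hardware."""
    return samples_per_hour(interval_seconds) * HOURS_PER_YEAR * components


hourly = samples_per_hour(6)         # 600
yearly = samples_per_year(6)         # 5,256,000 for a single component
yearly_ten = samples_per_year(6, 10) # ten components monitored simultaneously
```

The count grows linearly with both the number of monitored components and the length of the monitoring period, which is why storing every raw sample scales poorly.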
One approach for reducing the amount of stored information is to calculate an average of the samples taken over a given period of time, such as one hour, and store only the average value. However, this method has a major drawback. When monitoring a processor, if over a given hour fifty percent of the samples taken have a value of one hundred percent and the remaining samples have a value of thirty percent, then the average sampled value for that hour will be sixty-five percent. Thus, it would not be apparent to a person examining the averaged sample data that the processor was operating at one hundred percent capacity for half of that hour and was likely to have been delaying users or causing an overall delay in a larger computing system. In a busy business environment, a performance degradation lasting thirty minutes can have a serious impact on the business. Accordingly, such an approach does not provide a satisfactory solution.
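The drawback of storing only an hourly average can be reproduced with a short sketch. The function `hourly_summary` and the saturation threshold are assumptions made for illustration. The sketch only shows what information the average discards and is not presented as the method of the invention.

```python
from statistics import mean


def hourly_summary(samples, saturation=100.0):
    """Return (average, fraction of samples at or above `saturation`).

    `saturation` is a hypothetical threshold marking full utilization.
    """
    average = mean(samples)
    saturated = sum(1 for value in samples if value >= saturation)
    return average, saturated / len(samples)


# One hour at a six-second interval: 600 samples,
# half at one hundred percent and half at thirty percent.
samples = [100.0] * 300 + [30.0] * 300

average, saturated_fraction = hourly_summary(samples)
saturated_minutes = saturated_fraction * 60
```

Here the stored average is sixty-five percent, while the raw samples show the processor at full capacity for thirty minutes of the hour. The average alone cannot distinguish this hour from one spent steadily at sixty-five percent.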