In the quest for continued improvements in the efficiency and utilization of data processing systems, various types of data monitors have been developed to aid a user in understanding what is happening "under the covers" of these systems. Data processing system resources of interest include random access memory (RAM) usage, peripheral device usage, and central processing unit (CPU) busy/idle time. Information on these resources can help an operator of a data processing system fine-tune the various system parameters to achieve higher overall system throughput.
Users of operating systems need information on how much memory is being used. Information on memory utilization, especially the memory Working Set, is useful for showing whether the physical memory in the computer is sufficient for the currently active applications. Insufficient memory can cause inefficient system operation due to the excessive swapping or paging that results. Prior products that analyze system memory generally take on the order of 15-45 seconds to execute, depending on the amount of memory that exists in the system. The information presented is useful for determining the RAM consumed by an individual application, but only when the intrusiveness of the tool on the system being monitored is not a factor. These prior products use a text screen. Other previously reported techniques and tools have relied on specialized hardware assistance for measuring or calculating RAM usage.
Other techniques have been used to measure other types of data processing system resources. Direct internal monitoring by the system itself is one such technique. It typically consumes a large percentage of the data processing system's own resources in capturing data, and writes the captured data to some type of mass storage device. A subsequent procedure is then used to read and analyze this data (i.e., the analysis is not performed in real time).
Device utilization for peripheral devices has historically been measured directly by precisely measuring the start time and completion time of each I/O (input/output) operation. This allows the individual I/O times to be calculated. By summing these I/O times over a given period, it is possible to calculate the total busy time. Device utilization is then calculated by dividing the total busy time by the total elapsed time. This approach creates two problems. First, it requires that the entity directly in control of the I/O (usually the device hardware and/or the operating system) measure and record the I/O start and stop times. Second, it requires a hardware timer with sufficient resolution to time these I/O events accurately. On some systems, for example personal computers, neither of these criteria is met. In other words, the hardware or operating system does not measure I/O time. Further, the hardware timer in many of today's personal computers is of such poor resolution (32 milliseconds) that accurate I/O timings cannot be made. Thus, for existing personal computer systems, device utilization is not obtainable using these conventional methods.
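The conventional calculation described above can be illustrated with a minimal sketch. The record layout (start and completion times in integer milliseconds), the function names, and the sample values are all hypothetical, and the sketch assumes that only one I/O is outstanding at a time so that the individual I/O times can simply be summed. The `quantize` helper assumes a coarse timer that rounds each timestamp down to its most recent tick; actual timer behavior may differ.

```python
from typing import Iterable, List, Tuple

# Hypothetical record layout: (start_ms, complete_ms) for one I/O operation.
IoEvent = Tuple[int, int]


def device_utilization(events: Iterable[IoEvent],
                       period_start_ms: int,
                       period_end_ms: int) -> float:
    """Return total busy time divided by total elapsed time for the period."""
    elapsed = period_end_ms - period_start_ms
    if elapsed <= 0:
        raise ValueError("period must have positive length")
    busy = 0
    for start, complete in events:
        # Clip each I/O to the measurement period before summing.
        start = max(start, period_start_ms)
        complete = min(complete, period_end_ms)
        if complete > start:
            busy += complete - start
    return busy / elapsed


def quantize(events: Iterable[IoEvent], resolution_ms: int = 32) -> List[IoEvent]:
    """Round timestamps down to the timer tick (assumed coarse-clock behavior)."""
    return [((s // resolution_ms) * resolution_ms,
             (c // resolution_ms) * resolution_ms) for s, c in events]


# Three illustrative I/Os of 10 ms, 20 ms, and 20 ms within a 200 ms period.
events = [(0, 10), (50, 70), (100, 120)]
exact = device_utilization(events, 0, 200)             # 50 ms busy / 200 ms
coarse = device_utilization(quantize(events), 0, 200)  # 32 ms busy / 200 ms
```

With these illustrative values the precisely timed events yield a utilization of 0.25, while the same events seen through a 32 millisecond timer yield 0.16, which suggests why a timer of this resolution cannot support the conventional method.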
CPU idle time in a data processing system is the amount of time the computer's Central Processing Unit (CPU) is not being utilized by any task. Previous methods for measuring CPU idle time used a thread to perform a series of tasks. The number of tasks the thread performed was then compared with a hypothetical number of tasks that could have been performed if the thread were allowed all available CPU time. This procedure is lacking in that the hypothetical number of tasks differs from one data processing system to another. A system-specific calibration algorithm is required to determine the minimum time the task(s) require to execute. This calibration method can be unreliable and presents many practical problems when moving between systems.
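The counting approach described above can be sketched as follows. This is only an illustration of the arithmetic under stated assumptions, not a reconstruction of any particular prior product: the function names are hypothetical, the "task" is reduced to a single loop iteration, and the counting loop is not run at low priority as a real idle-measurement thread would be.

```python
import time


def count_work_units(interval_s: float) -> int:
    """Repeat a trivial unit of work for interval_s seconds and return the count."""
    deadline = time.perf_counter() + interval_s
    count = 0
    while time.perf_counter() < deadline:
        count += 1  # one hypothetical "task"
    return count


def estimate_idle_fraction(observed_count: int, calibrated_count: int) -> float:
    """Compare the observed count with the count expected on an idle system.

    calibrated_count is the system-specific calibration value: the number of
    tasks the thread could complete in the same interval if it received all
    available CPU time. It must be re-measured on every system.
    """
    if calibrated_count <= 0:
        raise ValueError("calibrated_count must be positive")
    # Clamp to 1.0 because a noisy calibration can be exceeded in practice.
    return min(observed_count / calibrated_count, 1.0)


# Illustrative numbers only: the thread completed 250 tasks in an interval
# for which calibration predicted 1000 tasks on an otherwise idle system.
idle = estimate_idle_fraction(250, 1000)
```

The result depends entirely on `calibrated_count`, which must itself be obtained by running something like `count_work_units` on a system assumed to be idle. A calibration taken on one system, or under background load, does not carry over to another, which is the weakness noted above.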
In general, the above types of systems are further lacking in that, as performance data is gathered, it is written by the gathering system to a relatively slow mass storage device for further analysis. This is because the methods for capturing the data operate much faster than the methods used to analyze the data. Thus, the mass storage device is used as a buffer to allow the two methods to operate at different speeds. Furthermore, the data generated by the data gathering system is so voluminous that the analysis method is unable to manage or maintain it. This constraint additionally requires storage to an intermediate mass storage device.
As a result of this intermediate buffering, the analysis cannot be performed in real time, but rather is delayed. As such, any reports or other types of feedback on system performance and operation are chronically out of date with respect to the actual performance. As today's data processing systems support more complex operating environments, including multitasking and multiple users, this delay in performance data may allow critical system bottlenecks to go unreported, since the cause of a problem may have come and gone before being detected.
Other methods used to analyze the data require a significant amount of the gathering system's resources, such as the CPU. As a result, the analysis cannot be done in real time, because it would consume such a large percentage of the resources that it would bias the data, making the data no longer representative of the underlying system operation.
Some systems have attempted to overcome the above limitations, but in doing so have failed to maintain or capture information at the process level of a multi-processing system. Rather, only overall system usage can be monitored, with no ability to focus on a particular process that may be causing the system to perform poorly. This lack of process resolution shows that an overall system may be performing poorly, but gives no meaningful indication of which process in the system is the culprit.