This invention relates to system metrics analysis. More particularly, this invention relates to data mining of computer performance data.
Companies that own and operate computers for data processing encounter a need for capacity planning of computing resources, so that they can efficiently and accurately plan the purchasing of new computing resources. Computing resources include CPUs, memory, disk storage, tape storage, access devices, operating systems, file systems, and many others. Capacity planning relies on the accurate forecasting of resource utilization. Forecasting, in turn, requires analysis of current and historical system performance metrics data. These metrics include CPU utilization, disk storage utilization, memory utilization, memory allocation, file system access, and many others.
There are several issues of concern with regard to capacity planning. It is important for companies to be able to determine points at which new hardware will become necessary to meet system requirements. It is also important for companies to be able to project scenarios for potential configuration changes including both hardware and software. Another issue of concern is the monitoring and analysis of performance problems.
To address these and other needs, commercially available data analysis and reporting tools analyze, report, and graph system performance metrics for the purposes of capacity forecasting and planning. However, because of the enormous number of performance metrics (over 500 for UNIX-based systems), these tools consume large amounts of computer system resources. Large databases are required to store the performance metrics, and as a result the data analysis tools are inefficient.
Accordingly, there is a need for a tool that discriminates among performance metrics to select the most significant performance metrics for performing capacity planning and performance management.
A programmed digital processing apparatus is disclosed which executes a program of instructions to perform method steps for selecting forecasting performance metrics from among a plurality of performance metrics. The executed method steps include collecting performance metrics that relate to a node, e.g., a UNIX-based midrange computer. The method also includes performing principal components analysis and factor analysis on the performance metrics to identify one or more factors such that each performance metric is associated with a corresponding one of the factors. Further, the method includes, for each factor, selecting the performance metrics whose weight exceeds a threshold value in absolute value, and storing the selected performance metrics in a database.
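By way of illustration only, the selection steps summarized above might be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: it uses unrotated principal-component loadings computed with NumPy as the "weights", the function names select_metrics and make_demo_data, the metric names, the number of factors, and the threshold of 0.7 are all hypothetical, and the database storage step is omitted.

```python
import numpy as np


def select_metrics(data, names, n_factors=2, threshold=0.7):
    """Group metrics by factor and keep those with a large absolute loading.

    data: 2-D array, one row per sample and one column per metric.
    names: metric names, one per column (hypothetical labels).
    Returns {factor_index: [(metric_name, loading), ...]}.
    """
    data = np.asarray(data, dtype=float)
    # Correlation matrix of the metrics (columns are variables).
    corr = np.corrcoef(data, rowvar=False)
    # Principal components via eigen-decomposition of the symmetric matrix.
    eigvals, eigvecs = np.linalg.eigh(corr)
    # eigh returns ascending eigenvalues; keep the n_factors largest.
    order = np.argsort(eigvals)[::-1][:n_factors]
    # Loadings: eigenvectors scaled by the square root of their eigenvalues.
    loadings = eigvecs[:, order] * np.sqrt(np.clip(eigvals[order], 0.0, None))

    selected = {k: [] for k in range(n_factors)}
    for i, name in enumerate(names):
        # Associate each metric with the factor on which it loads most heavily.
        k = int(np.argmax(np.abs(loadings[i])))
        weight = float(loadings[i, k])
        # Keep the metric only if its weight exceeds the threshold in absolute value.
        if abs(weight) > threshold:
            selected[k].append((name, weight))
    return selected


def make_demo_data(seed=0, n_samples=500):
    """Synthetic metrics (assumed for illustration): two latent drivers plus noise."""
    rng = np.random.default_rng(seed)
    cpu = rng.normal(size=n_samples)
    disk = rng.normal(size=n_samples)

    def noise():
        return 0.1 * rng.normal(size=n_samples)

    columns = {
        "cpu_user": cpu + noise(),
        "cpu_sys": cpu + noise(),
        "cpu_wait": cpu + noise(),
        "disk_reads": disk + noise(),
        "disk_writes": disk + noise(),
        "unrelated": rng.normal(size=n_samples),
    }
    names = list(columns)
    return np.column_stack([columns[n] for n in names]), names


if __name__ == "__main__":
    data, names = make_demo_data()
    for factor, metrics in select_metrics(data, names).items():
        print(factor, [name for name, _ in metrics])
```

On the synthetic data, the three CPU-like columns should group under one factor and the two disk-like columns under the other, while the unrelated column falls below the threshold and is dropped. The sign of a loading is arbitrary, which is why the comparison is made on the absolute value. A factor rotation step, if one is used, would be applied to the loadings before thresholding.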
In accordance with different aspects of the invention, the factors may include system factors, disk factors, CPU factors, swap factors, memory factors, and process factors.
An advantage of the present invention is that it reduces system storage requirements by reducing the volume of performance metrics required to be archived in order to perform capacity planning.
Another advantage of the invention is that it codifies the major predictors of CPU consumption on UNIX-based computers.
Still another advantage of the invention is that it provides a unique tool to allow analysts, performance engineers and/or system administrators to precisely identify the causes of various system conditions such as CPU consumption.
The details of the present invention, both as to its structure and operation, can best be understood with reference to the accompanying drawings, in which like reference numerals refer to like parts.