1. Field of the Invention
The present invention relates to the collection, analysis, and management of system resource data in distributed or enterprise computer systems, and particularly to a system and method for reducing file sizes in an intelligent way.
2. Description of the Related Art
The data processing resources of business organizations are increasingly taking the form of a distributed computing environment in which data and processing are dispersed over a network comprising many interconnected, heterogeneous, geographically remote computers. Such a computing environment is commonly referred to as an enterprise computing environment, or simply an enterprise. Managers of the enterprise often employ software packages known as enterprise management systems to monitor, analyze, and manage the resources of the enterprise. Enterprise management systems may provide for the collection of measurements, or metrics, concerning the resources of individual systems. For example, an enterprise management system might include a software agent on an individual computer system for the monitoring of particular resources such as CPU usage or disk access. The enterprise management agent might periodically collect metric data and write to a “data spill” containing historical metric data, i.e., metric data previously collected over a period of time. U.S. Pat. No. 5,655,081 discloses one example of an enterprise management system.
Historical data spills can be useful in a number of circumstances. First, even where an enterprise management system permits real-time monitoring of metric data, the enterprise is not always monitored for twenty-four hours a day, seven days a week. Thus, historical data spills provide a way to review metric data that was not monitored in real time. Second, regardless of whether metrics are monitored in real time, an enterprise manager may desire to review the history of one or more metrics which preceded a problem in another, related metric. Third, historical data spills can be used for analysis of the enterprise. For example, an analysis of the most frequent clients of a particular file server in the enterprise would utilize historical metric data. For these reasons, enterprise managers desire to keep track of as much historical metric data as possible. However, storage space and other resources are finite and not without cost. Therefore, the enterprise manager faces a trade-off between using costly storage resources on the one hand and throwing away meaningful metric data on the other hand. The object, then, is to reduce the amount of data stored while throwing out as little meaningful data as possible.
The prior art has produced a variety of compression techniques for reducing file size. Some compression methods are “lossless”: they compress data by looking for patterns and redundancies, losing no information in the process. File-level and disk-level compression techniques for computer systems are lossless methods. Unfortunately, lossless methods typically achieve low compression rates, and so their usefulness is limited, especially for large, relatively patternless spills of metric data. Other compression methods are “lossy”: they typically achieve higher compression rates than lossless methods, but they lose information in the process. For example, techniques for compressing video and image data commonly eliminate pixel-to-pixel variances in color that are barely noticeable to the human eye. In other words, those methods determine the least necessary data by comparing pixels to one another, and then the methods discard that data. However, techniques for compressing metric data cannot so rely on the deficiencies of human perception. Often, compression techniques of the prior art compress metric data by decimating it: in other words, by simply throwing away every Nth element of a data spill, or by keeping only every Nth element of a data spill. Decimation methods thus use a “brute force” approach with the result that the meaningful and the meaningless alike are discarded. The methods of the prior art employ a “one size fits all” methodology: they treat all bits and bytes the same, no matter what meaning those bits and bytes may hold. The methods do not look beyond the mere logical ones and zeroes to appreciate the significance of the data. Therefore, both the lossless and the lossy compression methods of the prior art are inadequate to solve the enterprise manager's dilemma.
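By way of illustration only, the decimation approach described above may be sketched as follows. The function name `decimate` and the sample CPU readings are hypothetical and are not drawn from any prior art system; the sketch merely shows that keeping every Nth element can discard a meaningful reading.

```python
from typing import List


def decimate(points: List[float], n: int) -> List[float]:
    """Keep every n-th element of a data spill, starting with the first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return points[::n]


# Hypothetical CPU usage samples (percent); index 3 holds a brief spike.
cpu_usage = [5.0, 6.0, 5.0, 98.0, 6.0, 5.0, 7.0, 6.0]

# Keeping every 2nd element retains indices 0, 2, 4, 6 and drops the spike.
kept = decimate(cpu_usage, 2)
```

In this example the spike at index 3 does not survive, because the selection depends only on position and not on the meaning of the data.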
For the foregoing reasons, there is a need for a system and method for reducing file sizes in an intelligent way.
The present invention is directed to a system and method that solve the need for intelligent summarization of data. Preferably, the present invention provides improved management of collected metric data through summarization of data according to the semantics or meaning of the underlying data types, and also through summarization of data at a plurality of levels of varying granularity. In a preferred embodiment, the system and method are used in a distributed computing environment, i.e., an enterprise. The enterprise comprises a plurality of computer systems, or nodes, which are interconnected through a network. At least one of the computer systems is a monitor computer system from which a user may monitor the nodes of the enterprise. At least one of the computer systems is an agent computer system. An agent computer system includes agent software that permits the collection of data relating to one or more metrics, i.e., measurements of system resources on the agent computer system.
In a preferred embodiment, a Universal Data Repository (UDR) receives a set of data points from one or more agent computer systems. The set of data points is a series of metrics, i.e., measurements of one or more system resources, which have been gathered by data collectors on the agent computer systems over a period of time. The UDR preferably summarizes the set of data points into a more compact yet meaningful form. In summarization according to one embodiment, the UDR determines a data type of the set of data points, applies a summarization rule according to the data type, and then creates a summarized data structure which corresponds to the set of data points. The UDR may summarize multiple sets of data points in succession.
In one embodiment, the summarization rule varies according to the semantics, i.e., the meaning, of the data type. For example, if the data type of the collected metric data is a counter, i.e., a measurement that can only go up, then the summarized data structure will comprise the starting value, ending value, and total number of data points. On the other hand, if the data type of the collected metric data is a gauge, i.e., a measurement that can go up or down, then the summarized data structure will comprise the average of all the data points and the total number of data points. If the data type of the collected metric data is a clock, i.e., a measurement of elapsed time, then the summarized data structure will comprise the starting value, the ending value, and the frequency of the clock. If the data type of the metric data is a string, i.e., a series of characters which can be manipulated as a group, then the summarized data structure will comprise the first string. By applying different summarization rules keyed to different data types, the system and method preserve costly storage resources by taking the most meaningful information and putting it into smaller packages.
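The type-dependent summarization rules described above may be sketched, by way of example only, as follows. The class names, the `summarize` function, and the string labels for the data types are hypothetical and chosen for illustration; the sketch assumes that the clock frequency is supplied by the caller, since the description does not state how it is obtained.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class CounterSummary:
    start: float
    end: float
    count: int


@dataclass
class GaugeSummary:
    average: float
    count: int


@dataclass
class ClockSummary:
    start: float
    end: float
    frequency: float


@dataclass
class StringSummary:
    first: str


def summarize(data_type: str, points: Sequence, frequency: Optional[float] = None):
    """Apply the summarization rule that matches the data type."""
    if not points:
        raise ValueError("cannot summarize an empty set of data points")
    if data_type == "counter":  # can only go up: keep start, end, and count
        return CounterSummary(points[0], points[-1], len(points))
    if data_type == "gauge":  # can go up or down: keep average and count
        return GaugeSummary(sum(points) / len(points), len(points))
    if data_type == "clock":  # elapsed time: keep start, end, and frequency
        if frequency is None:
            raise ValueError("a clock summary needs the clock frequency")
        return ClockSummary(points[0], points[-1], frequency)
    if data_type == "string":  # keep only the first string
        return StringSummary(points[0])
    raise ValueError(f"unknown data type: {data_type}")
```

For instance, under these assumptions `summarize("counter", [10, 15, 22])` yields a structure holding the starting value 10, the ending value 22, and a count of 3.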
To decrease file size even further, in one embodiment the system and method also provide for multiple levels of summarization: as new metric data is received, previously received data is summarized into coarser data structures, wherein the degree of coarseness corresponds to the age of the data. After the metric data has been collected by an agent, the UDR summarizes raw data points into summarized data structures. Each summarized data structure corresponds to two or more of the raw data points. At later times, as new raw data is collected, the UDR summarizes the previously summarized data structures into still coarser summarized data structures. Each coarser summarized data structure preferably corresponds to two or more of the previously summarized data structures. The summarization of previously summarized data structures into coarser summarized data structures can be performed for any number of levels, as configured by the user. At each successive level of summarization, metric data becomes coarser in granularity: that is, the metric data representing a given period of time becomes more summarized and takes up less space.
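The successive levels of summarization may be illustrated, for a gauge metric only, by the following minimal sketch. The names `summarize_raw` and `merge` are hypothetical. The count-weighted average used to combine previously summarized structures is an assumption made for this illustration; the description states only that a gauge summary comprises the average and the total number of data points.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GaugeSummary:
    average: float
    count: int


def summarize_raw(points: List[float]) -> GaugeSummary:
    """First level: summarize two or more raw data points."""
    return GaugeSummary(sum(points) / len(points), len(points))


def merge(summaries: List[GaugeSummary]) -> GaugeSummary:
    """Coarser level: combine two or more previously summarized structures.

    Assumption: each average is weighted by its count, so the result equals
    the average of all underlying raw data points.
    """
    total = sum(s.count for s in summaries)
    weighted_sum = sum(s.average * s.count for s in summaries)
    return GaugeSummary(weighted_sum / total, total)


# Two first-level summaries covering six raw data points in all.
level_one = [summarize_raw([1.0, 3.0]), summarize_raw([5.0, 5.0, 5.0, 5.0])]
# One coarser summary covering the same period of time.
level_two = merge(level_one)
```

Retaining the number of data points in each structure is what allows a coarser average to be computed under this assumption without access to the raw data.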
In one embodiment, throughout the levels of summarization, the UDR preserves process state changes so that the record of a particular process is never totally lost. A process state change is the birth or death of a process at some point during the monitored time interval. In the preferred embodiment, furthermore, the UDR stores each level of summarization in a different file. In each file, the data points or summarized data structures are stored sequentially in order of collection. When one file fills up, i.e., reaches its maximum file size as configured by the user, the UDR summarizes the oldest data points or data structures in that file. The UDR then deletes the appropriate metric data from that file and pushes the newly summarized structure into the next coarsest file. When the coarsest file fills up, the oldest metric data structures from the coarsest file are deleted. The user may configure the number of levels of summarization and thus the number of files in the enterprise management system and method.
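The cascade of files described above may be sketched as follows, with several simplifying assumptions. Each file is modeled as an in-memory queue, the maximum file size is modeled as a maximum number of entries, and a fixed number of the oldest entries (`batch`) is summarized at a time. The class `LevelStore` and its parameters are hypothetical. Only gauge data is handled, and the preservation of process state changes is not modeled.

```python
from collections import deque
from typing import Deque, List, Tuple

Summary = Tuple[float, int]  # (average, count) for a gauge metric


def merge(items: List[Summary]) -> Summary:
    """Combine entries into one coarser entry using a count-weighted average."""
    total = sum(count for _, count in items)
    return (sum(avg * count for avg, count in items) / total, total)


class LevelStore:
    """One queue per level of summarization, finest level first."""

    def __init__(self, levels: int, capacity: int, batch: int) -> None:
        if levels < 1 or batch < 2 or batch > capacity:
            raise ValueError("need levels >= 1 and 2 <= batch <= capacity")
        self.levels: List[Deque[Summary]] = [deque() for _ in range(levels)]
        self.capacity = capacity  # stands in for the maximum file size
        self.batch = batch        # oldest entries summarized at a time

    def append(self, value: float) -> None:
        """Store a newly collected raw data point at the finest level."""
        self._push(0, (value, 1))

    def _push(self, level: int, item: Summary) -> None:
        store = self.levels[level]
        store.append(item)  # entries stay in order of collection
        if len(store) <= self.capacity:
            return
        if level == len(self.levels) - 1:
            store.popleft()  # coarsest level: the oldest entry is deleted
            return
        # Remove the oldest entries and push their summary one level coarser.
        oldest = [store.popleft() for _ in range(self.batch)]
        self._push(level + 1, merge(oldest))


store = LevelStore(levels=2, capacity=4, batch=2)
for value in [1.0, 2.0, 3.0, 4.0, 5.0]:
    store.append(value)
```

In this example the fifth data point overfills the finest level, so the two oldest data points are removed from it and replaced by a single summarized entry at the next coarser level.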