1. Technical Field
This invention relates to monitoring the operational performance of hardware devices in a computer system. More specifically, the invention relates to adaptively monitoring and modifying the frequency at which data is gathered from hardware devices in order to accumulate the most useful information for determining hardware performance.
2. Description of the Related Art
Systems management is the general area of information technology that concerns configuring and managing computer resources, including network resources. This includes gathering requirements, purchasing equipment and software, distributing them to where they are to be used, configuring them, maintaining them with enhancement and service updates, setting up problem-handling processes, and determining whether objectives are being met. Network management and database management may be viewed either as part of systems management or as co-equal parts of a total information system.
Monitoring the operation and performance of devices in the computer system is an important aspect of systems management. In general, the goal of monitoring performance is to ensure that the devices are performing properly. Modern wide area network computer systems consist of a large number of interconnected devices, including host computer systems, network switches, storage devices, etc. Administering such computer systems is complex and generally requires managing each of the hardware devices in the network. Typical metrics that are monitored for the hardware devices include health status, device performance, device configuration, capacity data, etc.
FIG. 1 is a block diagram (100) of a computer system that employs a data collector to gather device performance data. More specifically, multiple data providers (102), (104), (106), and (108) are shown in communication with a data collector (120). The data providers (102)-(108) communicate performance data to the data collector (120), and the data collector (120) stores that data in data storage (130). Communication can be either via a push model, in which the data providers (102)-(108) send data to the data collector (120) at a particular frequency, or via a pull model, in which the data collector (120) polls the data providers (102)-(108) for data at a particular frequency. Regardless of the communication mechanism, the data collector (120) obtains data from the data providers (102)-(108) at a preset static frequency and stores the data persistently in the data storage (130).
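The arrangement of FIG. 1 can be illustrated with a minimal sketch, offered only as an aid to understanding and not as the implementation of any particular data collector. All names here (DataCollector, interval_s, pull, receive) are hypothetical, time is simulated rather than measured, and each data provider is assumed to be a simple function that returns one metric value.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical types: the source describes this architecture only in prose.
Sample = Tuple[float, str, float]  # (timestamp in seconds, provider id, metric value)


@dataclass
class DataCollector:
    """Stands in for data collector (120) with a preset static interval."""
    interval_s: float
    # Stands in for data storage (130); a real system would persist this.
    storage: List[Sample] = field(default_factory=list)

    def pull(self, providers: Dict[str, Callable[[float], float]],
             duration_s: float) -> None:
        """Pull model: poll every provider once per fixed interval."""
        ticks = int(duration_s // self.interval_s)
        for i in range(ticks):
            now = i * self.interval_s  # simulated clock, no real sleeping
            for provider_id, read_metric in providers.items():
                self.storage.append((now, provider_id, read_metric(now)))

    def receive(self, now: float, provider_id: str, value: float) -> None:
        """Push model: a provider sends a sample on its own fixed schedule."""
        self.storage.append((now, provider_id, value))


if __name__ == "__main__":
    collector = DataCollector(interval_s=10.0)
    providers = {"102": lambda t: 0.5, "104": lambda t: 0.7}
    collector.pull(providers, duration_s=60.0)
    print(len(collector.storage))  # 2 providers polled at 6 ticks each
```

In either model the interval is fixed in advance, so the volume of stored samples depends only on the number of providers and the elapsed time, not on how the devices are behaving.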
Data collectors, such as the one shown in FIG. 1, do not gather data efficiently from the devices in the system. For example, the typical architecture for monitoring hardware devices in the system gathers data at predefined static frequencies. Static thresholds cannot detect all abnormalities in the system. If the frequency is set high to ensure that data is gathered at a sufficient granularity, the data providers and the data storage are placed under strain. In addition, a coordinator employed to evaluate the gathered data is also strained, as there will be an abundant quantity of data to evaluate. Conversely, if the frequency is set low to avoid the performance strains of a high frequency, there is an increased chance that a problem may go undetected for a prolonged period. Additionally, the coordinator may not be able to properly diagnose a problem in the system, as there may not be a sufficient amount of gathered device data for such an evaluation. Accordingly, efficiently managing disparate hardware devices in the system is challenging.
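The tradeoff described above can be made concrete with a short calculation. This is a simplified sketch under stated assumptions: the function name static_interval_cost is hypothetical, the provider count and intervals are illustrative values not taken from any measured system, and the worst-case detection delay is approximated as one full collection interval.

```python
def static_interval_cost(num_providers: int, interval_s: int,
                         window_s: int = 86_400) -> tuple:
    """Return (samples stored in the window, worst-case detection delay in s).

    Assumes every provider is sampled once per interval, and that a fault
    beginning just after a sample is not observed until the next sample.
    """
    samples = num_providers * (window_s // interval_s)
    worst_case_delay_s = interval_s
    return samples, worst_case_delay_s


if __name__ == "__main__":
    # Four providers, as in FIG. 1; the intervals are illustrative only.
    for interval in (5, 3600):
        samples, delay = static_interval_cost(4, interval)
        print(f"interval={interval}s samples/day={samples} worst delay={delay}s")
```

Under these assumptions, a 5 second interval stores 69,120 samples per day for four providers, while a one hour interval stores only 96 samples but may leave a fault unobserved for up to an hour. No single static value serves both concerns.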
Accordingly, there is a need for a systems management application that supports dynamically changing the frequency at which data is communicated from the hardware devices to the data collector(s). Each hardware device should be monitored and evaluated based upon its performance and capabilities. Such a dynamic application supports efficient evaluation of heterogeneous hardware devices, thereby improving overall systems management.