The invention relates in general to distributed computer systems and more specifically to an apparatus, system, and method for managing a distributed computer system.
Distributed computer systems may include any number of networked devices such as computers, servers, and memory storage devices that are interconnected through a network. The network typically includes a combination of interconnected network devices such as hubs, switches, and routers. Many of the network devices and networked devices operate in accordance with a configuration that can be set and modified. The configuration is typically managed with the use of configuration objects that represent logical or virtual arrangements and relationships and define any number of structures, allocations, operational rules, priorities, preferences, or functions related to memory, data storage, bandwidth, communication paths, and communication protocols. A configuration object includes procedures and data that define the configuration of at least a portion of the system, where the data includes configuration parameters that represent settings or other stored values pertaining to individual devices. Configuration parameters, therefore, may include settings, addresses, names, identifiers, pathnames, operational minimums and maximums, bandwidths, time limits, and other values. By setting and managing the configuration objects, management tasks can be performed. An example of a management task is establishing an end-to-end path between a host system and a networked storage device, which may include the configuration tasks of creating a storage volume, setting the access controls of the storage volume, setting the access controls of the network, and configuring the host adapters. The configuration parameters are often chosen or otherwise established during an initial configuration of the system and are periodically adjusted for various reasons by modifying and managing the configuration objects.
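By way of illustration only, the relationship between configuration objects, configuration parameters, and a management task may be sketched as follows. The sketch is a minimal model under stated assumptions and does not represent any particular implementation: the class names `ConfigurationObject` and `ManagementTask`, the function `build_end_to_end_path_task`, and all parameter names and values (for example `size_gb`, `allowed_hosts`, `zone`) are hypothetical and do not appear in the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ConfigurationObject:
    """Hypothetical configuration object: data (parameters) plus a procedure."""
    name: str
    parameters: Dict[str, object] = field(default_factory=dict)

    def apply(self, device_state: Dict[str, object]) -> None:
        # The "procedure": write each configuration parameter into the
        # stored settings of the target device or resource.
        device_state.update(self.parameters)


@dataclass
class ManagementTask:
    """A management task modeled as an ordered list of configuration tasks."""
    description: str
    steps: List[ConfigurationObject] = field(default_factory=list)

    def run(self, system_state: Dict[str, Dict[str, object]]) -> List[str]:
        completed = []
        for step in self.steps:
            # Each step configures one portion of the system, keyed by name.
            device_state = system_state.setdefault(step.name, {})
            step.apply(device_state)
            completed.append(step.name)
        return completed


def build_end_to_end_path_task() -> ManagementTask:
    # The four configuration tasks named in the text; all values are made up.
    return ManagementTask(
        description="Establish an end-to-end path from a host to storage",
        steps=[
            ConfigurationObject("storage_volume", {"size_gb": 100}),
            ConfigurationObject("volume_access_control", {"allowed_hosts": ["host-a"]}),
            ConfigurationObject("network_access_control", {"zone": "zone-1"}),
            ConfigurationObject("host_adapter", {"target": "storage_volume"}),
        ],
    )


if __name__ == "__main__":
    state: Dict[str, Dict[str, object]] = {}
    task = build_end_to_end_path_task()
    print(task.run(state))
    print(state)
```

In this sketch each configuration task is applied in order, so a later step such as configuring the host adapter can refer to a resource created by an earlier step. A real system would replace the dictionary update with device-specific procedures.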
Typically, the configuration is established and adjusted in response to changes in the needs that the system must fulfill, for such purposes as allocating and providing access to resources such as storage or network capacity, protecting such resources, and maximizing the performance and efficiency of the individual devices and the system as a whole. Often, the distributed computer system is managed by an administrator who directly interfaces with some or all of the system devices to set or change the configuration. As the size and complexity of distributed computer systems increase, the management of the system also increases in complexity, making the responsibility of maintaining and managing the system extremely burdensome.
An approach for dealing with the growing management problem involves automating management procedures using system management software. This approach, however, is limited in that administrators are reluctant to relinquish control to automated management procedures for several reasons. Often, administrators perceive a risk that the automated procedure may cause undesirable results that cannot be rectified. Although some of the perceived risks may be less reasonable than others, many of the concerns are warranted. Since software is not completely reliable, actual damage to the system may occur. Data can be lost or performance degraded, and applications running concurrently with the system management software may be adversely affected. The applications may be critical to the proper operation of the system, and any failure may result in substantial financial loss. Further, even if the administrator trusts the system management software to properly execute the individual automated management tasks of the automated procedure, the administrator may prefer to control the timing of the execution in order to apply additional preferences based on an overall understanding of system operation and configuration objectives. For example, certain management tasks may be better suited for execution during times when the system resources are less taxed by other applications running on the system. The administrator may be sensitive to system operations, configurations, and configuration changes that are not considered by the automated procedure. With conventional systems, the administrator must either relinquish all control to the automated configuration procedure or manually perform the configuration.
Accordingly, there is a need for an apparatus, system, and method for autonomy-based management of a distributed computer system.