A server cluster is a group of at least two independent servers connected by a network and managed as a single system. The clustering of servers provides a number of benefits over independent servers. One important benefit is that cluster software, which is run on each of the servers in a cluster, automatically detects application failures or the failure of another server in the cluster. Upon detection of such failures, failed applications and the like can be quickly restarted on a surviving server, with no substantial reduction in service. Indeed, clients of a Windows NT cluster believe they are connecting with a single physical system, but are actually connecting to a service which may be provided by one of several systems. To this end, clients create a TCP/IP session with a service in the cluster using a known IP address. This address appears to the cluster software as a resource in the same group (i.e., a collection of resources managed as a single unit) as the application providing the service. In the event of a failure, the cluster service "moves" the entire group to another system.
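The group "move" described above can be illustrated with a minimal sketch. The names Group, Node and fail_over, the example address, and the least-loaded placement rule are all hypothetical and chosen only for illustration; the text does not specify how a surviving system is selected.

```python
from dataclasses import dataclass, field


@dataclass
class Group:
    """A collection of resources managed as a single unit (hypothetical model)."""
    name: str
    resources: list  # e.g. a known IP address plus the application behind it


@dataclass
class Node:
    """One server in the cluster (hypothetical model)."""
    name: str
    alive: bool = True
    groups: list = field(default_factory=list)


def fail_over(nodes):
    """Move every group hosted on a failed node to a surviving node."""
    survivors = [n for n in nodes if n.alive]
    if not survivors:
        raise RuntimeError("no surviving node to host the groups")
    for node in nodes:
        if node.alive:
            continue
        while node.groups:
            group = node.groups.pop()
            # Assumption: pick the survivor hosting the fewest groups.
            target = min(survivors, key=lambda n: len(n.groups))
            # The whole group moves together, so the IP address and the
            # application it fronts stay on the same system.
            target.groups.append(group)


web = Group("web", ["192.0.2.10", "web-server"])
node_a = Node("A", groups=[web])
node_b = Node("B")

node_a.alive = False        # simulate the failure of server A
fail_over([node_a, node_b])
```

Because the address travels with the application in the same group, a client that reconnects to the known IP address reaches the service on the surviving system without needing to know which server now hosts it.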
Other benefits include the ability for administrators to inspect the status of cluster resources, and accordingly balance workloads among different servers in the cluster to improve performance. Dynamic load balancing is also available. Such manageability also provides administrators with the ability to update one server in a cluster without taking important data and applications offline. As can be appreciated, server clusters are used in critical database management, file and intranet data sharing, messaging, general business applications and the like.
A cluster works with a large number of basic system components, known as resource objects, which provide some service to clients in a client/server environment or to other components within the system. Resource objects range from physical devices, such as disks, to purely software constructs, such as processes, databases, and IP addresses.
As can be appreciated, these resource objects are rather disparate in nature. Notwithstanding, the cluster software on each system needs to control and monitor the operation of the resource objects on its system, regardless of their type. For example, the Windows NT Cluster design provides failure detectors and recovery mechanisms for working with a system's resources. However, because of the widely disparate types of resource objects, the software for resource monitoring heretofore needed to be highly complex so that the cluster was able to deal with each type of resource object it was controlling.
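The source of this complexity can be seen in a small sketch of a monitor that must embed knowledge of every resource type it controls. The function check_resource, the dictionary fields, and the individual health checks are hypothetical stand-ins and do not represent the actual Windows NT Cluster implementation.

```python
def check_resource(resource):
    """Return True if the resource looks healthy (hypothetical checks only)."""
    kind = resource["type"]
    # Each resource type needs its own branch, so the monitor grows with
    # every new kind of resource the cluster is asked to control.
    if kind == "disk":
        return resource["mounted"]
    if kind == "process":
        return resource["pid"] is not None
    if kind == "ip_address":
        return resource["bound"]
    raise ValueError(f"monitor has no logic for resource type {kind!r}")


resources = [
    {"type": "disk", "mounted": True},
    {"type": "process", "pid": None},
    {"type": "ip_address", "bound": True},
]
status = [check_resource(r) for r in resources]
```

In this sketch, a resource type the monitor was not written for, such as a database, cannot be checked at all until the monitor itself is modified.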