“Clustering” generally refers to a computer system organization where multiple computers, or nodes, are networked together to cooperatively perform computer tasks. An important aspect of a computer cluster is that all of the nodes in the cluster present a single system image—that is, from the perspective of a user, the nodes in a cluster appear collectively as a single computer, or entity.
Clustering is often used in relatively large multi-user computer systems where high performance and reliability are of concern. For example, clustering may be used to provide redundancy, or fault tolerance, so that, should any node in a cluster fail, the operations previously performed by that node will be handled by other nodes in the cluster. Clustering is also used to increase overall performance, since multiple nodes can often handle a larger number of tasks in parallel than a single computer otherwise could. Often, load balancing can also be used to ensure that tasks are distributed fairly among nodes, preventing individual nodes from becoming overloaded and thereby maximizing overall system performance. One specific application of clustering, for example, is in providing multi-user access to a shared resource such as a database or a storage device, since multiple nodes can handle a comparatively large number of user access requests, and since the shared resource is typically still available to users even upon the failure of any given node in the cluster.
Clusters typically handle computer tasks through the performance of “jobs” or “processes” within individual nodes. In some instances, jobs being performed by different nodes cooperate with one another to handle a computer task. Such cooperative jobs are typically capable of communicating with one another, and are typically managed in a cluster using a logical entity known as a “group.” A group is typically assigned some form of identifier, and each job in the group is tagged with that identifier to indicate its membership in the group. Typically, these jobs, which are often referred to as “members,” are resident on different nodes in a cluster.
Member jobs in a group typically communicate with one another using an ordered message-based scheme, where the specific ordering of messages sent between group members is maintained so that every member sees messages sent by other members in the same order as every other member, thus ensuring synchronization between nodes. Requests for operations to be performed by the members of a group are often referred to as “protocols,” and it is typically through the use of one or more protocols that tasks are cooperatively performed by the members of a group.
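The ordered delivery described above can be illustrated with a minimal sketch. It assumes one common way of obtaining a single message order, namely a central sequencer that stamps each message with a global sequence number; the text does not say how any particular cluster achieves its ordering, and the names `Sequencer`, `Member`, `stamp` and `receive` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """A group member that delivers messages strictly in sequence order."""
    name: str
    next_seq: int = 0
    pending: dict = field(default_factory=dict)
    delivered: list = field(default_factory=list)

    def receive(self, seq, payload):
        # Buffer the message, then deliver every consecutive message
        # that is now available, starting from the next expected number.
        self.pending[seq] = payload
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


class Sequencer:
    """Hypothetical ordering point that stamps messages with a global number."""

    def __init__(self):
        self.counter = 0

    def stamp(self, payload):
        seq = self.counter
        self.counter += 1
        return seq, payload


sequencer = Sequencer()
a, b = Member("A"), Member("B")
stamped = [sequencer.stamp(p) for p in ("join", "update", "commit")]

# Member A receives the messages in order; member B receives them reversed,
# as might happen if the network reorders them in transit.
for seq, payload in stamped:
    a.receive(seq, payload)
for seq, payload in reversed(stamped):
    b.receive(seq, payload)
```

Although the two members receive the messages in different arrival orders, both deliver them in the same sequence, which is the property the ordered messaging scheme relies on.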
Clusters often support changes in group membership through the use of group organizational operations such as membership change protocols, e.g., if a member job needs to be added to or removed from a group. One such change protocol is a join protocol, which is used to add a new member to a group. Among other operations, a join protocol ensures that group state or configuration information is sent to the joining member so that all members of the group have a consistent view of the state.
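As a rough illustration of the join protocol just described, the following sketch adds a member to a group and then distributes the same state snapshot to every member, including the joiner. It is a simplification under stated assumptions: the classes `Group` and `Member`, the `view` field, and the shape of the state dictionary are all hypothetical, and a real join protocol would run over the group's ordered messaging rather than through direct assignment.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Member:
    """A member job holding its own local copy of the group state."""
    name: str
    view: dict = field(default_factory=dict)


class Group:
    """Hypothetical group that keeps member views consistent on each join."""

    def __init__(self, group_id, config):
        self.group_id = group_id
        self.config = config
        self.members = []

    def join(self, member):
        # Add the joiner, then send one identical snapshot of the group
        # state to every member so that all views agree.
        self.members.append(member)
        state = {
            "group_id": self.group_id,
            "members": [m.name for m in self.members],
            "config": self.config,
        }
        for m in self.members:
            m.view = copy.deepcopy(state)


group = Group("G1", {"resource": "disk0"})
first, second = Member("node1"), Member("node2")
group.join(first)
group.join(second)
```

After the second join, the existing member and the joiner hold equal but independent copies of the state, so a later local change on one node cannot silently alter another node's view.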
One type of group is a primary-backup group, in which one group member is designated as the primary, and the other members are backups. Primary-backup groups are often used in a clustered computer system to manage a type of resource, such as a disk, tape or other storage unit, a printer or other imaging device, or another type of switchable hardware component or system.
One particular application of a primary-backup group is for managing switched disks. In such a group, the disk is accessible from either the primary or backup members, but only the primary member actually hosts the disk. Members join the group to provide additional backup members for the switched disk being managed by the group, with the typical join protocol transmitting configuration information for the disk from the primary member to the joining member, as the protocol assumes the joiner is able to access the disk.
In the event of a failure in a primary member in a clustered computer system, management of the resource is automatically switched over to a backup member, typically according to a predetermined backup order. Access to the resource is therefore maintained despite the failure of the primary member.
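The switchover behavior can be sketched as follows, assuming the predetermined backup order is simply a list whose first live entry acts as primary. The class `PrimaryBackupGroup` and its methods are hypothetical names for illustration, and failure detection itself is outside the scope of the sketch.

```python
class PrimaryBackupGroup:
    """Hypothetical primary-backup group with a predetermined backup order."""

    def __init__(self, ordered_members):
        # The first entry is the initial primary; the remaining entries
        # are backups, listed in the order in which they take over.
        self.order = list(ordered_members)
        self.failed = set()

    @property
    def primary(self):
        # The current primary is the first member in the order that
        # has not been marked as failed.
        for name in self.order:
            if name not in self.failed:
                return name
        return None  # no member is left to manage the resource

    def fail(self, name):
        self.failed.add(name)


group = PrimaryBackupGroup(["node1", "node2", "node3"])
initial = group.primary
group.fail("node1")
after_first_failure = group.primary
group.fail("node2")
after_second_failure = group.primary
```

Management of the resource moves down the list as members fail, so access is preserved as long as at least one member in the order remains available.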
While the use of the aforementioned primary-backup groups increases the fault tolerance of a clustered computer system due to the ability to automatically switch management responsibility to different members of the group, such groups are not capable of directly addressing failures in the managed resources themselves. For example, failure of a disk can render the disk (and more importantly, the data on that disk) unavailable to the clustered computer system.
Resources such as disks and other storage systems often rely on other techniques for providing fault tolerance, such as mirroring, where data stored on one disk (typically referred to as a primary or production disk) is mirrored or copied to another disk (typically referred to as a backup or copy disk). With mirroring, therefore, a failure in the primary disk typically does not cause a loss of stored data, as the backup disk typically may be accessed instead to supply any requested data.
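Mirroring as described above can be sketched with two in-memory stand-ins for disks. This is a minimal illustration that assumes synchronous mirroring, where each write goes to both copies before returning; the classes `Disk` and `MirroredDisk` are hypothetical, and real mirroring schemes may instead copy data asynchronously.

```python
class Disk:
    """In-memory stand-in for a disk that can be marked as failed."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block, data):
        if self.failed:
            raise OSError(f"{self.name} has failed")
        self.blocks[block] = data

    def read(self, block):
        if self.failed:
            raise OSError(f"{self.name} has failed")
        return self.blocks[block]


class MirroredDisk:
    """Hypothetical mirror that copies every write to a backup disk."""

    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup

    def write(self, block, data):
        # Write to the primary (production) disk, then to the backup copy.
        self.primary.write(block, data)
        self.backup.write(block, data)

    def read(self, block):
        # Prefer the primary; fall back to the backup if the primary fails.
        try:
            return self.primary.read(block)
        except OSError:
            return self.backup.read(block)


primary = Disk("primary")
backup = Disk("backup")
mirror = MirroredDisk(primary, backup)
mirror.write(0, b"payroll")
primary.failed = True
recovered = mirror.read(0)
```

Once the primary disk is marked as failed, the read is served from the backup copy, so the stored data remains available.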
Should a mirrored resource such as a mirrored disk be utilized in a clustered computer system, it would be desirable to utilize a group structure similar to a conventional primary-backup group to manage the operation of such a resource, preferably in a manner that ensures fault tolerance both from the perspective of the group managing the resource and the underlying resource itself. In a conventional primary-backup group, where a single primary member hosts the primary resource, the use of a mirrored resource would require that the primary, as well as all backup members capable of assuming management duties, have access to, and be capable of hosting, both the primary resource and any backup resources.
However, clustered computer systems are increasingly being implemented using more flexible and dispersed environments. For example, some clustered computer systems permit geographically distant computers to participate in the same cluster.
Indeed, from the perspective of fault tolerance of a resource, it is theoretically more reliable for primary and backup resources to reside in different cluster nodes, so that any failure in a particular cluster node affects only the subset of resources resident in that node. Given, however, the possibility that a cluster may be dispersed among many different locations, a requirement that each member in a primary-backup group be capable of accessing and/or hosting both primary and backup resources would be overly constrictive, as oftentimes the management operations that may be performed by a member of a primary-backup group require, at the least, proximity between the member and the resource being managed, if not direct connectivity therebetween.
As such, a need exists for a primary-backup group architecture that supports the hosting of primary and backup resources irrespective of the connectivity and dispersion of the group members in a clustered computer system. More specifically, a need exists for a join protocol that supports the creation of such a primary-backup group architecture.