Computer clusters are an increasingly popular alternative to more traditional computer architectures. A computer cluster is a collection of individual computers (known as nodes) that are interconnected to provide a single computing system. The use of a collection of nodes has a number of advantages over more traditional computer architectures. One easily appreciated advantage is the fact that nodes within a computer cluster may fail individually. As a result, in the event of a node failure, the majority of nodes within a computer cluster may survive in an operational state. This has made the use of computer clusters especially popular in environments where continuous availability is required.
Single system image (SSI) clusters are a special type of computer cluster. SSI clusters are configured to provide programs (and programmers) with a unified environment in which the individual nodes cooperate to present a single computer system. Resources, such as filesystems, are made transparently available to all of the nodes included in an SSI cluster. As a result, programs in SSI clusters are provided with the same execution environment regardless of their physical location within the computer cluster. SSI clusters increase the effectiveness of computer clusters by allowing programs (and programmers) to ignore many of the details of cluster operation. Compared to other types of computer clusters, SSI clusters offer superior scalability (the ability to incrementally increase the power of the computing system) and manageability (the ability to easily configure and control the computing system). At the same time, SSI clusters retain the high availability of more traditional computer cluster types.
As the size of a computer cluster increases, so does the chance of failure among the cluster's nodes. Failure of a node has several undesirable effects. One easily appreciated effect is the performance degradation that results when the work previously performed by a failed node is redistributed to surviving nodes. Another undesirable effect is the potential loss of a resource, such as a filesystem, that is associated with a failed node.
Node loss can be especially serious in SSI clusters because resources are transparently shared within them. Sharing of resources means that a single resource may be used by a large number of processes spread throughout an SSI cluster. If a node failure causes the resource to become unavailable, each of these processes may be negatively impacted. Thus, a single node failure may impact many processes. Resource sharing also increases the likelihood that a process will access resources located on a number of different nodes. In so doing, the process becomes vulnerable to the failure of any of these nodes.
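As a rough illustration of this exposure, assume (hypothetically, and only for the sake of the arithmetic) that each node fails independently with the same probability p over some interval, and that a process uses resources on k distinct nodes, none of which has a backup. The symbols p and k are introduced here for illustration and do not appear elsewhere in this description. Under those assumptions, the probability that the process is affected by at least one node failure is:

```latex
P_{\text{affected}} = 1 - (1 - p)^{k}
```

For example, with p = 0.01 and k = 10, the result is approximately 0.096, nearly ten times the exposure of a process that depends on a single node.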
To ensure reliability, SSI clusters employ a number of different techniques. Failover is one of these techniques. To provide failover for a resource, the resource is associated with at least two nodes. The first of these nodes provides access to the resource during normal operation of the SSI cluster. The second node functions as a backup and provides access to the resource in the event that the first node fails. Failover, when properly implemented, greatly reduces the vulnerability of an SSI cluster to node failure.
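The two-node arrangement described above can be sketched as follows. This is a minimal illustration only, not the implementation of any particular SSI cluster: the names Node, FailoverResource, serving_node, and access are hypothetical, and node failure is modeled as a simple boolean flag rather than being detected over a network.

```python
class Node:
    """A cluster node; 'alive' stands in for real failure detection."""

    def __init__(self, name):
        self.name = name
        self.alive = True


class FailoverResource:
    """A resource associated with a primary node and a backup node."""

    def __init__(self, name, primary, backup):
        self.name = name
        self.primary = primary
        self.backup = backup

    def serving_node(self):
        # Normal operation: the first (primary) node provides access.
        if self.primary.alive:
            return self.primary
        # Failover: the second (backup) node takes over.
        if self.backup.alive:
            return self.backup
        # Both nodes have failed, so the resource is unavailable.
        raise RuntimeError(f"resource {self.name!r} is unavailable")

    def access(self, request):
        node = self.serving_node()
        return f"{node.name} served {request!r} on {self.name}"


if __name__ == "__main__":
    node_a = Node("node-a")
    node_b = Node("node-b")
    shared_fs = FailoverResource("shared-fs", primary=node_a, backup=node_b)

    print(shared_fs.access("read"))   # served by the primary
    node_a.alive = False              # simulate failure of the primary
    print(shared_fs.access("read"))   # served by the backup
```

The sketch deliberately omits everything that makes failover difficult in practice, such as detecting the failure, transferring state to the backup, and keeping callers unaware of the switch.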
In SSI clusters, filesystems are one of the most commonly shared resources. Thus, filesystem failover is especially important to the reliable operation of SSI clusters. Unfortunately, proper implementation of filesystem failover is a difficult task. This is particularly true in cases where filesystem performance is also a key consideration. For example, to increase performance of a shared filesystem, it is often necessary to aggressively cache the filesystem at each node where the filesystem is used. In cases where the filesystem fails over, it is imperative to maintain the consistency of the filesystem. Maintaining consistency during failover becomes increasingly problematic as caching becomes more aggressive. Thus, there is a need for techniques that balance high-performance filesystem operation against failover protection.
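The tension between caching and consistency can be illustrated with a small sketch. It assumes a simple model that is not taken from this description: a client node caches blocks of a filesystem held by a server node, and either writes each block through to the server immediately or holds it back until a later flush. The names Server, CachingClient, write, flush, and unflushed_blocks are hypothetical.

```python
class Server:
    """Holds the authoritative copy of the filesystem blocks."""

    def __init__(self):
        self.blocks = {}


class CachingClient:
    """A client node that caches blocks locally."""

    def __init__(self, server, write_back):
        self.server = server
        self.write_back = write_back  # True means the more aggressive policy
        self.cache = {}
        self.dirty = set()            # blocks the server has not yet seen

    def write(self, block, data):
        self.cache[block] = data
        if self.write_back:
            # Fast path: defer the update to the server.
            self.dirty.add(block)
        else:
            # Slow path: the server is updated on every write.
            self.server.blocks[block] = data

    def flush(self):
        # Push all deferred updates to the server.
        for block in sorted(self.dirty):
            self.server.blocks[block] = self.cache[block]
        self.dirty.clear()

    def unflushed_blocks(self):
        # Blocks that exist only in this client's cache.
        return sorted(self.dirty)


if __name__ == "__main__":
    server = Server()
    client = CachingClient(server, write_back=True)
    client.write(1, "new data")
    # The server has no record of block 1 until the client flushes.
    print(client.unflushed_blocks(), server.blocks)
    client.flush()
    print(client.unflushed_blocks(), server.blocks)
```

In this model, any block reported by unflushed_blocks exists only in the client's cache. If the server node failed at that moment, a backup that took over from the server's stored state would not reflect those writes unless the client supplied them again, and the more writes a client defers, the more state must be reconciled.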