A cluster includes multiple nodes (i.e., computer systems) that interact with each other to provide users with various applications and system resources as a single entity. Clustering of computer systems is becoming an increasingly popular way for enterprises and large businesses to ensure greater availability to multiple users. Additionally, the cluster is capable of providing scalability. Thus, more services can be provided by adding more nodes to the cluster.
A cluster system provides improved availability because it can survive single node failures. In addition, depending upon the configuration, the cluster may be able to survive multiple node failures without the loss of services to users. In other words, the cluster may continue to provide services despite the failure of one or more nodes within the cluster. Typically, upon failure of one or more nodes, the services provided by the failed nodes are dispersed among the remaining running nodes to allow continued service to users.
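The dispersal of services from failed nodes to the remaining running nodes can be illustrated with a minimal sketch. The function name `redistribute`, the node and service names, and the least-loaded placement rule are all assumptions made for illustration; the text above does not specify how a cluster chooses the node that takes over a service.

```python
from collections import defaultdict


def redistribute(nodes, assignments, failed):
    """Return a new service -> node mapping after the nodes in `failed` stop.

    nodes:       iterable of all node names in the cluster (hypothetical model)
    assignments: dict mapping service name -> node currently providing it
    failed:      set of node names that have failed
    """
    survivors = sorted(set(nodes) - set(failed))
    if not survivors:
        raise RuntimeError("no running nodes remain; services are lost")

    # Count the services each surviving node already provides.
    load = defaultdict(int)
    for node in assignments.values():
        if node in survivors:
            load[node] += 1

    result = {}
    for service, node in sorted(assignments.items()):
        if node in failed:
            # Assumed policy: move the service to the least-loaded survivor.
            target = min(survivors, key=lambda n: (load[n], n))
            load[target] += 1
            result[service] = target
        else:
            result[service] = node
    return result
```

Under this assumed policy, every service keeps running as long as at least one node survives, which corresponds to the single and multiple node failure cases described above.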
The cluster typically includes a file system to manage data storage within the cluster. The cluster file system contains one entity representing the overall state of the cluster file system and an entity for each of the active files that exist within the cluster file system. The cluster file system allows concurrent use of files by multiple nodes. Additionally, the cluster file system uses additional components to support each active file within the cluster file system. One such component is a file proxy component (or file agent) that is located on nodes in the cluster where one or more files are being used. Conventionally, there is one file proxy for each active file associated with a node in the cluster file system. The file proxy accepts file operation requests and returns file operation results. Another component of the cluster is a file server primary. Similar to the file proxy, there is one file server primary for each active file in the cluster file system, and the file server primary communicates with the file agent for the same active file. Each file server primary provides an interface allowing access to data storage (or may directly use storage) and is capable of satisfying file operation requests.
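The relationship between file proxies and a file server primary can be sketched as follows. The class names mirror the terms used above, but the method names (`request`, `handle`), the in-memory `bytearray` standing in for data storage, and the direct method call standing in for inter-node messaging are assumptions made for illustration only.

```python
class FileServerPrimary:
    """One per active file; satisfies file operation requests."""

    def __init__(self, filename):
        self.filename = filename
        self.data = bytearray()  # stands in for the underlying data storage

    def handle(self, op, *args):
        if op == "write":
            offset, payload = args
            end = offset + len(payload)
            if end > len(self.data):
                # Grow the file with zero bytes up to the end of the write.
                self.data.extend(b"\0" * (end - len(self.data)))
            self.data[offset:end] = payload
            return len(payload)
        if op == "read":
            offset, length = args
            return bytes(self.data[offset:offset + length])
        raise ValueError(f"unsupported file operation: {op}")


class FileProxy:
    """One per active file on each node where the file is being used."""

    def __init__(self, node, primary):
        self.node = node
        self.primary = primary

    def request(self, op, *args):
        # Accept a file operation request and return its result. In a real
        # cluster this call would be a message to the primary's node.
        return self.primary.handle(op, *args)
```

In this sketch, two proxies on different nodes that share one `FileServerPrimary` see each other's writes, which models the concurrent use of a file by multiple nodes.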
Conventionally, in order to recover from the failure of a node (i.e., the node that includes the file server primaries, hereinafter referred to as the “primary node”) within the cluster system, each file server primary sends information using checkpoints to at least one file server secondary, which is always located on a different node than the primary node. The checkpoints effectively replicate information on multiple nodes, thereby allowing a file server secondary to be promoted to file server primary in the event of the failure of the primary node. At a minimum, a checkpoint must give each file server secondary enough information to allow the file server secondary (as the newly promoted file server primary) to complete file operations without errors being visible to the users of the file system.
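Checkpointing and promotion can be sketched under some simplifying assumptions: storage is a shared dictionary that survives the failure of the primary node, writes are idempotent so a promoted secondary can safely replay every checkpoint it holds, and the `crash_before_apply` flag is a hypothetical hook used only to simulate a failure. None of these details come from the text above.

```python
class FileServerSecondary:
    def __init__(self, node):
        self.node = node
        self.checkpoints = []  # (offset, payload) pairs in arrival order

    def receive_checkpoint(self, offset, payload):
        self.checkpoints.append((offset, payload))
        return True  # acknowledgement returned to the primary

    def promote(self, storage):
        # Complete every checkpointed operation. Replaying operations that
        # the old primary already applied is harmless because writes in
        # this sketch are idempotent.
        for offset, payload in self.checkpoints:
            storage[offset] = payload
        return FileServerPrimary(self.node, storage, secondaries=[])


class FileServerPrimary:
    def __init__(self, node, storage, secondaries):
        for secondary in secondaries:
            if secondary.node == node:
                raise ValueError("a secondary must be on a different node")
        self.node = node
        self.storage = storage
        self.secondaries = secondaries

    def write(self, offset, payload, crash_before_apply=False):
        # Replicate the operation to every secondary first.
        for secondary in self.secondaries:
            secondary.receive_checkpoint(offset, payload)
        if crash_before_apply:
            raise RuntimeError("primary node failed")  # simulated failure
        self.storage[offset] = payload
```

If the primary fails after checkpointing a write but before applying it, calling `promote` on the secondary applies the write, so the operation completes without an error that is visible to the user.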
Checkpoints are a synchronous form of communication. For example, after the checkpoint information associated with a file operation is sent, the file server primary that sent the information waits for an acknowledgement from each file server secondary indicating that the checkpoint information for that file operation was received. Additionally, the checkpoint information is recorded in the file server secondary before the file is modified with the change. This inter-machine communication during checkpointing (where a checkpoint is associated with each file operation) may end up being more costly than the file operation being checkpointed.
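The cost of synchronous checkpointing can be made concrete with a rough latency model. The function names, the parameters, and the figures used in the example are illustrative assumptions, not measurements; the model only captures that the primary must wait for an acknowledgement from every secondary before performing the local file operation.

```python
def messages_per_operation(secondaries):
    """One checkpoint message plus one acknowledgement per secondary."""
    if secondaries < 0:
        raise ValueError("secondaries must be non-negative")
    return 2 * secondaries


def checkpointed_latency_us(local_op_us, round_trip_us, secondaries,
                            parallel=True):
    """Estimated latency of one file operation with a synchronous checkpoint.

    local_op_us:   time to perform the file operation itself (assumed)
    round_trip_us: time to send a checkpoint and receive its
                   acknowledgement from one secondary (assumed)
    parallel:      True if checkpoints to all secondaries are sent at once
    """
    if secondaries < 0:
        raise ValueError("secondaries must be non-negative")
    if secondaries == 0:
        wait = 0
    elif parallel:
        wait = round_trip_us                # wait for the slowest reply
    else:
        wait = round_trip_us * secondaries  # one secondary at a time
    return wait + local_op_us
```

For example, with an assumed 20 microsecond local operation and an assumed 200 microsecond round trip, `checkpointed_latency_us(20, 200, 2)` gives 220, so the checkpoint accounts for ten times the cost of the operation it protects.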