1. Field of the Invention
The present invention relates to providing fault-tolerance in computer systems. More specifically, the present invention relates to a method and an apparatus that provides multiple-version support for highly available objects.
2. Related Art
As computer networks are increasingly used to link computer systems together, distributed operating systems have been developed to control interactions between computer systems across a computer network. Some distributed operating systems allow client computer systems to access resources on server computer systems. For example, a client computer system may be able to access information contained in a database on a server computer system. When the server fails, it is desirable for the distributed operating system to automatically recover from this failure. Distributed computer systems with distributed operating systems possessing an ability to recover from such server failures are referred to as “highly available systems.” Objects stored on such highly available systems are referred to as “highly available objects.”
For a highly available system to function properly, the system must be able to detect a server failure and to reconfigure itself so that accesses to objects on the failed server are redirected to backup copies on other servers. This process of switching over to a backup copy on another server is referred to as a “failover.”
FIG. 1 illustrates a system that supports highly available objects in accordance with an embodiment of the present invention. This system includes computational nodes 102, 104, 106, and 108. During operation, client 110 on node 102 sends invocation 126 to node 104 to operate on primary object 112.
In response to this invocation, a number of checkpointing operations take place. In particular, primary object 112 sends checkpoint request 130 to checkpoint object 114. Checkpoint object 114 generates checkpoint 132, which feeds into checkpoint handler 116. Checkpoint handler 116 adds information, such as a serial number, to checkpoint 132 and then passes checkpoints 134 and 136 to nodes 106 and 108, respectively. After checkpoints 134 and 136 have been delivered to nodes 106 and 108, primary object 112 sends reply 128 to client 110.
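The ordering constraint described above, in which checkpoints are delivered to the secondary nodes before the reply is returned to the client, can be illustrated by the following minimal sketch. The class name PrimaryObject, the method invoke, the tuple layout of a checkpoint, and the delivery callbacks are all hypothetical and are not taken from the described system; the sketch also folds the roles of checkpoint object 114 and checkpoint handler 116 into a single method for brevity.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical checkpoint layout: (serial number, snapshot of object state).
Checkpoint = Tuple[int, Dict[str, int]]


class PrimaryObject:
    """Illustrative primary that checkpoints to secondaries before replying."""

    def __init__(self, deliver_to: List[Callable[[Checkpoint], None]]) -> None:
        self._state: Dict[str, int] = {}
        self._serial = 0                # assumed monotonically increasing
        self._deliver_to = deliver_to   # one callback per secondary node

    def invoke(self, key: str, value: int) -> str:
        # Apply the client's operation to the primary copy.
        self._state[key] = value

        # Build a checkpoint and stamp it with a serial number.
        checkpoint = (self._serial, dict(self._state))
        self._serial += 1

        # Deliver the checkpoint to every secondary first.
        for deliver in self._deliver_to:
            deliver(checkpoint)

        # Only after delivery is the reply returned to the client.
        return f"ok:{key}={value}"
```

In this sketch a secondary can be represented by any callable, for example the append method of a list, so that the checkpoints it received can be inspected after invoke returns.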
Upon receiving checkpoint 134, checkpoint object 118 within node 106 ensures correct ordering of checkpoints and then passes checkpoint 138 to secondary object 120. Similarly, upon receiving checkpoint 136, checkpoint object 122 within node 108 ensures correct ordering of checkpoints and then passes checkpoint 140 to secondary object 124.
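One way a checkpoint object on a secondary node might ensure correct ordering is to buffer checkpoints that arrive early and apply them strictly in serial-number order. The following sketch assumes that serial numbers start at zero and increase by one; the buffering strategy and the names Checkpoint and SecondaryCheckpointObject are illustrative assumptions, since the description above states only that ordering is ensured.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass
class Checkpoint:
    serial: int              # serial number added by the checkpoint handler
    state: Dict[str, Any]    # hypothetical payload: state to copy to the secondary


class SecondaryCheckpointObject:
    """Buffers out-of-order checkpoints and applies them in serial order."""

    def __init__(self) -> None:
        self._expected = 0                        # next serial number to apply
        self._pending: Dict[int, Checkpoint] = {}
        self.secondary_state: Dict[str, Any] = {}
        self.applied: List[int] = []              # serials applied, in order

    def receive(self, cp: Checkpoint) -> None:
        if cp.serial < self._expected:
            return  # already applied; ignore the duplicate
        self._pending[cp.serial] = cp
        # Apply every checkpoint that is now contiguous with what was applied.
        while self._expected in self._pending:
            ready = self._pending.pop(self._expected)
            self.secondary_state.update(ready.state)
            self.applied.append(ready.serial)
            self._expected += 1
```

Under these assumptions, a checkpoint that arrives ahead of its predecessor is held until the gap is filled, so the secondary object never observes state changes out of order.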
At some time in the future, if node 104 fails, the system selects either secondary object 120 or 124 to be promoted to a primary object. Client 110 then completes any outstanding operations using the newly promoted primary object.
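The promotion step can be sketched as follows. The description above does not state how the system chooses between the secondary objects, so the selection rule used here (prefer the secondary that has applied the highest checkpoint serial number, breaking ties by lowest node identifier) is purely an assumption for illustration, as are the names Replica, ReplicaSet, and fail_over.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    node_id: int
    last_serial: int = -1    # hypothetical: serial of the last applied checkpoint
    alive: bool = True
    is_primary: bool = False


class ReplicaSet:
    """Tracks one primary and its secondaries, and promotes on failure."""

    def __init__(self, primary: Replica, secondaries: List[Replica]) -> None:
        primary.is_primary = True
        self.primary = primary
        self.secondaries = list(secondaries)

    def fail_over(self) -> Replica:
        """Mark the primary as failed and promote one live secondary."""
        candidates = [r for r in self.secondaries if r.alive]
        if not candidates:
            raise RuntimeError("no live secondary available for promotion")

        # Assumed policy: most up-to-date secondary wins; lowest node id on ties.
        chosen = max(candidates, key=lambda r: (r.last_serial, -r.node_id))

        self.primary.is_primary = False
        self.primary.alive = False
        self.secondaries.remove(chosen)
        chosen.is_primary = True
        self.primary = chosen
        return chosen
```

After fail_over returns, a client in this sketch would redirect any outstanding operations to the returned replica, which corresponds to the newly promoted primary object.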
Software running on the various computers in a cluster is often updated to correct problems in the software and/or to add new features. However, it is not a simple matter to update software in a highly available clustered computing system without halting the entire system for a significant period of time. Individual nodes in a cluster can be temporarily halted to load updated software without bringing the entire system down. However, if some nodes are running the updated software and other nodes are not, incompatibilities can arise between the different versions of the software that facilitates highly available objects.
What is needed is a method and an apparatus that allow software to be updated within a cluster without halting the entire system and without incompatibility problems between different versions of the software.