1. Field of the Invention
The present invention relates generally to distributed object operating systems, and more particularly to a system and method that supports transparent failover from a primary server to a secondary server during accesses to a remote object.
2. Related Art
As computer networks are increasingly used to link computer systems together, distributed operating systems have been developed to control interactions between computer systems across a computer network. Some distributed operating systems allow client computer systems to access resources on server computer systems. For example, a client computer system may be able to access information contained in a database on a server computer system. When the server fails, it is desirable for the distributed operating system to automatically recover from this failure. Distributed computer systems with distributed operating systems possessing an ability to recover from such server failures are referred to as "highly available systems." Data objects stored on such highly available systems are referred to as "highly available data objects."
To function properly, a highly available system must be able to detect a server failure and reconfigure itself so that accesses to objects on the failed server are redirected to backup copies on other servers. This process of switching over to a backup copy on another server is referred to as a "failover."
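The detect-and-redirect behavior described above can be illustrated with a minimal sketch. It is not taken from the invention itself: the names `Replica` and `FailoverDirectory` are hypothetical, a raised `ConnectionError` stands in for whatever failure detection a real system uses, and the backup copies are assumed to already hold the same data as the primary.

```python
class Replica:
    """One copy of a highly available data object (hypothetical stand-in for a server)."""

    def __init__(self, name, data):
        self.name = name
        self.alive = True
        self.store = dict(data)

    def read(self, key):
        # A dead replica is modeled by raising ConnectionError on every access.
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return self.store[key]


class FailoverDirectory:
    """Tracks which replica currently serves accesses and switches on failure."""

    def __init__(self, primary, backups):
        self.active = primary
        self.backups = list(backups)

    def read(self, key):
        while True:
            try:
                return self.active.read(key)
            except ConnectionError:
                # Failure detected: redirect to the next backup copy, if any.
                if not self.backups:
                    raise
                self.active = self.backups.pop(0)


primary = Replica("primary", {"balance": 100})
secondary = Replica("secondary", {"balance": 100})
directory = FailoverDirectory(primary, [secondary])

primary.alive = False               # simulate a server failure
value = directory.read("balance")   # served by the secondary after failover
```

In this sketch the caller of `directory.read` never learns that the primary failed, which is the sense in which the redirection constitutes a failover.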
Existing client-server systems typically rely on the client application program to explicitly detect and recover from server failures. For example, a client application program typically includes code that explicitly specifies timeout and retry procedures. This additional code makes client application programming more complex and tedious. It also makes client application programs particularly hard to test and debug, because it is difficult to systematically reproduce the many possible asynchronous interactions between client and server computing systems. Furthermore, each client application program must provide such failover code for every access to a highly available object on a server.
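The burden described above can be made concrete with a short sketch of the kind of code a client might have to write by hand. The example is illustrative only: `ServerUnavailable`, `access_with_retry`, and the two server stubs are hypothetical names, and a raised exception stands in for a request that has timed out.

```python
import time


class ServerUnavailable(Exception):
    """Raised by a hypothetical server stub when a request times out."""


def access_with_retry(servers, request, retries_per_server=2, backoff_s=0.0):
    """Try each server in order, retrying on timeout before moving to the next."""
    last_error = None
    for server in servers:
        for _attempt in range(retries_per_server):
            try:
                return server(request)
            except ServerUnavailable as err:
                last_error = err
                time.sleep(backoff_s)  # wait before the next attempt
    raise ServerUnavailable(f"all servers failed for {request!r}") from last_error


def failed_primary(request):
    # Stand-in for a primary server that no longer responds.
    raise ServerUnavailable("primary timed out")


def healthy_secondary(request):
    # Stand-in for a secondary server holding a backup copy.
    return f"secondary handled {request}"


result = access_with_retry([failed_primary, healthy_secondary], "read x")
```

Every access to a highly available object would have to be wrapped in a call of this kind, and each combination of timeouts and partial failures becomes a separate case for the client programmer to test.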
Therefore, what is needed is a distributed-object operating system that recovers from server failures in a manner transparent to client application programs. Such a distributed system will allow client application programs to be written without the burden of providing and testing failure detection and retry code.