Explosive growth and widespread acceptance of computer networks have been a primary driver of productivity gains in the world economy. Many computer programs no longer operate on single, stand-alone computers. Instead, the software program itself is partitioned among one or more server machines and many clients.
Sophisticated distributed-computing models and software applications are becoming available. These applications use an object-oriented or component model, with larger applications being divided into small containers or "objects" of program code and data. The program objects are distributed to both the server and clients, with the details of network communication hidden from objects through the use of proxy and stub objects that transfer information over the network. Microsoft's Distributed Component Object Model (DCOM) and Object Management Group's Common Object Request Broker Architecture (CORBA) are two competing standards for distributed computing. A basic overview of distributed computing is given by Larry Seltzer in "Future Distributed Computing" and "PC Size, Mainframe Power," PC Magazine, Mar. 25, 1997, pages 198-204.
Unfortunately, computer hardware and software still fail, perhaps leaving parts of the distributed program waiting for responses from remote objects on crashed servers. FIG. 1 illustrates a problem when a server crashes, leaving distributed client objects hanging.
A distributed application includes client object 10 running on a client machine, and server object 14 running on a server machine. Proxy 12 is a middleware object that facilitates communication between client object 10 and server object 14, effectively hiding much of the complexity of the network protocols and overhead.
When the server machine or its network connection crashes, client object 10 is no longer able to communicate with server object 14. A well-written client object 10 notices that an abnormally long period of time has elapsed with no response, and begins a re-start sequence. Client object 10 is shut down, and a new instance, client object 10', is loaded and initialized. A new proxy 12' is also started. Another server is located, and a new instance, server object 14', is loaded on the new server and initialized.
Re-loading and initializing client object 10' and server object 14' has the undesirable side effect that the former state of client object 10 is lost. Client object 10 had already sequenced from initial state A through state B to state C when the server crashed. Server object 14 likewise had advanced from its initial state X to state Y. These states were lost when the objects were re-loaded. Client object 10' is initialized back to initial state A, and new server object 14' is initialized to its initial state X.
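The loss of state on re-start can be illustrated with a minimal sketch. The class names and the advance() method are hypothetical and chosen only for illustration; the state labels A, B, C and X, Y follow the description above.

```python
class SteppedObject:
    """Hypothetical object that advances through a fixed list of states."""

    STATES = []

    def __init__(self):
        # Every freshly loaded instance starts in its initial state.
        self.state = self.STATES[0]

    def advance(self):
        # Move to the next state, stopping at the last one.
        i = self.STATES.index(self.state)
        if i < len(self.STATES) - 1:
            self.state = self.STATES[i + 1]


class ClientObject(SteppedObject):
    STATES = ["A", "B", "C"]


class ServerObject(SteppedObject):
    STATES = ["X", "Y"]


# Before the crash: the client has reached state C, the server state Y.
client = ClientObject()
client.advance()
client.advance()
server = ServerObject()
server.advance()

# After the crash: both objects are re-loaded and initialized from scratch,
# so the progress recorded in `client` and `server` is not carried over.
client_restarted = ClientObject()
server_restarted = ServerObject()
```

In this sketch the re-started objects hold states A and X, while the work represented by states C and Y exists only in the discarded instances.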
Users then have to repeat whatever steps they had previously performed, essentially losing some or all of their work. Users could have navigated several levels of forms and entered information that was lost. Server crashes are truly one of the great aggravations of the information age.
FIG. 2 highlights proxies used to communicate between client and server objects. A client object 10 communicates with a local object known as a proxy. Proxies 12 make a connection over the network to the server machine and create a session with server object 14. Thus proxies 12 contain connection and session information.
While proxies are effective at hiding the details and complexities of network communication from program objects, they do not hide server crashes from the client objects. Proxies do not handle server crashes, so when the server crashes, the proxies become invalid and the failure is exposed to the client objects.
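A minimal sketch of this limitation follows. It is not DCOM or CORBA code; the Proxy and ServerObject classes, the session_id field, and the alive flag that simulates a crash are all assumptions made for illustration. The proxy holds the connection and session information and forwards calls, but it has no logic for a failed server, so the error reaches the client.

```python
class ServerObject:
    """Hypothetical server object; `alive` simulates the machine being up."""

    def __init__(self):
        self.alive = True

    def handle(self, request):
        if not self.alive:
            # Stands in for a network timeout or a broken connection.
            raise ConnectionError("server unreachable")
        return f"handled {request}"


class Proxy:
    """Hypothetical proxy that holds connection and session information."""

    def __init__(self, server, session_id):
        self.server = server          # stands in for the network connection
        self.session_id = session_id  # session with the server object

    def call(self, request):
        # The proxy only forwards the call; it does not handle failures.
        return self.server.handle(request)


server = ServerObject()
proxy = Proxy(server, session_id=1)
first_reply = proxy.call("open form")

server.alive = False  # simulate a server crash
try:
    proxy.call("submit form")
    crash_visible_to_client = False
except ConnectionError:
    # The crash is not hidden: the client must deal with it directly.
    crash_visible_to_client = True
```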
What is desired is a distributed-object application that hides server crashes from client objects. It is desired to have proxies detect server failures and establish a session with a different server. It is desired to initialize the replacement server object to the last state of the crashed server object so that the client object can continue operation. It is desired to avoid resetting and re-loading client objects when a server crashes.
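The desired behavior can be sketched as follows. This is only one possible shape of such a proxy under stated assumptions, not a mechanism described in this section: the FailoverProxy class, the list of spare server names, and the choice to have the proxy remember the last state returned by the server are all hypothetical.

```python
class ServerObject:
    """Hypothetical server object with a single state value."""

    def __init__(self, name, state="X"):
        self.name = name
        self.state = state
        self.alive = True  # set to False to simulate a crash

    def _check(self):
        if not self.alive:
            raise ConnectionError(f"{self.name} unreachable")

    def set_state(self, new_state):
        self._check()
        self.state = new_state
        return self.state

    def get_state(self):
        self._check()
        return self.state


class FailoverProxy:
    """Hypothetical proxy that hides a server crash from the client."""

    def __init__(self, server_names):
        self._spares = list(server_names)
        self.server = ServerObject(self._spares.pop(0))
        # Assumption: the proxy records the last state the server reported.
        self._last_state = self.server.state

    def _invoke(self, method, *args):
        try:
            result = getattr(self.server, method)(*args)
        except ConnectionError:
            if not self._spares:
                raise  # no replacement server is available
            # Establish a session with a different server, initialized to
            # the last known state, then retry the call.
            self.server = ServerObject(self._spares.pop(0),
                                       state=self._last_state)
            result = getattr(self.server, method)(*args)
        self._last_state = result  # both methods return the current state
        return result

    def set_state(self, new_state):
        return self._invoke("set_state", new_state)

    def get_state(self):
        return self._invoke("get_state")


proxy = FailoverProxy(["server-1", "server-2"])
proxy.set_state("Y")
proxy.server.alive = False           # simulate a crash of server-1
state_after_crash = proxy.get_state()  # no exception reaches the client
```

In this sketch the client keeps its own state and its reference to the proxy across the crash; only the server behind the proxy changes.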