Client-server applications, including hosted software applications provided as a service offering, need reasonably high availability in order to supply an acceptable service level and prevent customer dissatisfaction due to outages. While a number of techniques have been developed to provide high availability, their cost is usually prohibitive relative to what a customer is willing to pay for the service offering.
For example, BEA Systems, Inc. describes in a document titled “Using WebLogic Server Clusters,” a group of servers that work together to provide a powerful, reliable application platform. A clustered service is one that is available on multiple servers in the cluster. The cluster appears to a client as a single server, but is in fact a group of servers acting as one. If one server fails, another can take over. The ability to fail-over from a failed server to a functioning server increases the availability of the application to a client.
The clustered service is represented by a stub, i.e., a local procedure in a remote procedure call (RPC). The stub is aware of all instances of the service. The stub appears to the client as a normal remote method invocation (RMI) stub. On each call, the stub employs a load-balancing algorithm to choose which instance to call, providing load balancing across the cluster. If a failure occurs during the call, the stub intercepts execution and retries the call on another instance, providing fail-over to the client.
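The stub behavior described above can be illustrated with a minimal sketch. It is not taken from the BEA document: the class name FailoverStub, the exception InstanceUnavailable, and the use of simple round-robin selection are assumptions made purely for illustration, and each service instance is modeled as a plain callable.

```python
class InstanceUnavailable(Exception):
    """Raised by a (hypothetical) service instance that cannot serve a call."""


class FailoverStub:
    """Illustrative client-side stub that knows every instance of a service."""

    def __init__(self, instances):
        self._instances = list(instances)
        if not self._instances:
            raise ValueError("at least one service instance is required")
        self._next = 0  # index of the instance the next call starts with

    def call(self, *args):
        count = len(self._instances)
        start = self._next
        # Round-robin: each call starts at the instance after the previous start.
        self._next = (self._next + 1) % count
        last_error = None
        for offset in range(count):
            instance = self._instances[(start + offset) % count]
            try:
                return instance(*args)
            except InstanceUnavailable as exc:
                # Fail-over: retry the same call on the next instance.
                last_error = exc
        raise InstanceUnavailable("all instances failed") from last_error
```

In this sketch the client sees a single call() method, while selection and retry happen inside the stub. A production stub would also have to decide whether a failed call is safe to retry, which the sketch does not address.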
Lidcam Technology of Melbourne, Australia describes in a 2001 document titled “ServerIron Internet Web Switches,” detection and sub-second fail-over to the next server in a group that provides like service. Their ServerIron switch detects application error conditions such as the hypertext transfer protocol (HTTP) “404 Object not found” before the client sees the message and transparently redirects the request to another server without any manual intervention. To provide very high availability, the ServerIron switch includes redundancy capability that protects against session loss.
Goldszmidt et al. in U.S. Pat. No. 6,195,680 B1 describe a client-server system for fault-tolerant delivery of a data stream such as live audio or video clips. The client receives the real-time data stream from a primary server in a first set of servers. Upon detecting a failure in either the real-time data stream or the primary server, the client dynamically switches to receiving the real-time data stream from a secondary server in a set of servers disjoint from the first set of servers.
Guenthner et al. in U.S. Pat. No. 6,134,588 describe changing a web browser so that the browser will address its requests to any of the available servers in a plurality of servers, or to a server specified by policy. If a browser time-out failure occurs, indicating unavailability of a server, the browser selects another server. The browser remembers for a given time period which server IP addresses have failed so that those addresses are not repeatedly tried.
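The client-side selection with a time-limited memory of failed addresses can be sketched as follows. This is an illustrative assumption rather than the method of the cited patent: the class name ServerSelector, the parameter retry_after_seconds, the injected clock, and the send callable are all hypothetical, and a time-out is modeled with Python's built-in TimeoutError.

```python
import time


class ServerSelector:
    """Illustrative client that skips recently failed server addresses."""

    def __init__(self, addresses, retry_after_seconds=60.0, clock=time.monotonic):
        self._addresses = list(addresses)
        self._retry_after = retry_after_seconds  # how long a failure is remembered
        self._clock = clock                      # injectable so tests can control time
        self._failed_at = {}                     # address -> time of last failure

    def mark_failed(self, address):
        self._failed_at[address] = self._clock()

    def candidates(self):
        """Addresses that have not failed, or whose failure has expired."""
        now = self._clock()
        return [
            address
            for address in self._addresses
            if address not in self._failed_at
            or now - self._failed_at[address] >= self._retry_after
        ]

    def request(self, send):
        """Try each candidate in order; remember every address that times out."""
        for address in self.candidates():
            try:
                return send(address)
            except TimeoutError:
                self.mark_failed(address)
        raise TimeoutError("no server responded")
```

Once the remembered period elapses, a failed address becomes a candidate again, so a server that has recovered is eventually retried without any manual reset.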
U.S. Pat. Nos. 6,195,680 and 6,134,588, as well as the Lidcam “ServerIron” paper described above, are incorporated herein by reference.
Despite the aforementioned and other developments, no overall satisfactory solution has been found that provides quasi-high availability of client-server applications, including hosted services applications, at low cost.
In accordance with the teaching of the present invention there is provided such a solution. It is believed that this overall solution constitutes a significant advancement in the hosted services application offering art.