The present invention relates to a method and means of indicating and determining the availability of a multitude of application servers providing application services to a multitude of application clients.
Enterprises depend on the availability of the systems supporting their day-to-day operation. A system is called available if it is up and running and is producing correct results. In a narrow sense, the availability of a system is the fraction of time it is available. MTBF denotes the mean time before failure of such a system, i.e. the average time the system is available before a failure occurs (this is the reliability of the system). MTTR denotes its mean time to repair, i.e. the average time it takes to repair the system after a failure (this is the downtime of the system caused by the failure). Then AVAIL = MTBF/(MTTR + MTBF) is the availability of the system. Ideally, the availability of a system is 1. Today, a system can claim high availability if its availability is about 99.999% (it is called fault tolerant if its availability is about 99.99%). J. Gray and A. Reuter, “Transaction processing: Concepts and Techniques”, San Mateo, Calif.: Morgan Kaufmann 1993 give further details on these aspects. Availability of a certain system or application has at least two aspects: in a first, narrow sense it relates to the question of whether a certain system is active at all and providing its services; in a second, wider sense it relates to the question of whether this service is provided in a timely fashion with sufficient responsiveness.
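As an illustration of the formula AVAIL = MTBF/(MTTR + MTBF), the following minimal sketch computes availability from the two mean times and converts it into expected downtime per year. The function names, the use of hours as the unit, and the 365-day year are assumptions made for this example only; they do not appear in the cited literature.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Return AVAIL = MTBF / (MTTR + MTBF) as a fraction between 0 and 1."""
    if mtbf_hours < 0 or mttr_hours < 0:
        raise ValueError("MTBF and MTTR must be non-negative")
    total = mtbf_hours + mttr_hours
    if total == 0:
        raise ValueError("MTBF and MTTR must not both be zero")
    return mtbf_hours / total


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime in minutes, assuming a 365-day year (illustrative)."""
    return (1.0 - avail) * 365 * 24 * 60


if __name__ == "__main__":
    # Hypothetical system: fails on average every 9999 hours, 1 hour to repair.
    a = availability(9999.0, 1.0)
    print(f"availability = {a:.4%}")
    print(f"downtime     = {downtime_minutes_per_year(a):.1f} minutes/year")
```

Under these assumptions, an availability of about 99.99% corresponds to slightly under one hour of downtime per year, and about 99.999% to roughly five minutes per year.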
One fundamental mechanism to improve availability is based on “redundancy”:
The availability of hardware is improved by building clusters of machines and the availability of software is improved by running the same software in multiple address spaces.
With the advent of distributed systems, techniques have been invented which use two or more address spaces on different machines running the same software to improve availability (often called active replication). Further details on these aspects may be found in S. Mullender, “Distributed Systems”, ACM Press, 1993. By using two or more address spaces on the same machine running the same software, which get their requests from a shared input queue, the hot pool technique generalizes the technique of warm backups.
C. R. Gehr et al., “Dynamic Server Switching for Maximum Server Availability and Load Balancing”, U.S. Pat. No. 5,828,847, which is hereby incorporated herein by reference, teaches a dynamic server switching system related to the narrow sense of availability as defined above. The dynamic server switching system maintains a static and predefined list (a kind of profile) in each client which identifies the primary server for that client and the preferred communication method, as well as a hierarchy of successive secondary server and communication method pairs. In the event that the client's requests are not served by the designated primary server or the designated communication method, the system traverses the list to obtain the identity of the first available alternate server-communication method pair. This system enables a client to redirect requests from an unresponsive server to a predefined alternate server. In this manner, the system provides reactive server switching for service availability.
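The list traversal described above can be sketched roughly as follows. This is not code from the Gehr patent; the names `first_available` and `is_reachable`, the representation of the profile as an ordered list of (server, method) pairs, and the probe callback are all hypothetical and chosen only to illustrate the reactive, statically ordered fallback.

```python
from typing import Callable, Optional, Sequence, Tuple

# One profile entry: (server name, communication method). Hypothetical layout.
ServerMethod = Tuple[str, str]


def first_available(
    profile: Sequence[ServerMethod],
    is_reachable: Callable[[str, str], bool],
) -> Optional[ServerMethod]:
    """Walk the static profile in order and return the first pair that responds.

    The first entry plays the role of the primary server; the remaining
    entries are the predefined alternates. Returns None if no entry responds.
    """
    for server, method in profile:
        if is_reachable(server, method):
            return (server, method)
    return None


if __name__ == "__main__":
    # Hypothetical profile and probe: only "backup1" over "tcp" is reachable.
    profile = [("primary", "tcp"), ("primary", "udp"), ("backup1", "tcp")]
    reachable = {("backup1", "tcp")}
    print(first_available(profile, lambda s, m: (s, m) in reachable))
```

Note that every non-responsive entry ahead of the first responsive one is probed in turn, which is the cost that a static, predefined list incurs.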
In spite of improvements of availability in the narrow sense defined above, this teaching suffers from several shortcomings. Gehr's teaching provides a reactive response only in case a primary server could not be reached at all. There are no proactive elements which prevent a client from requesting service from a non-responsive server in the first place. As the list of primary and alternate servers is statically predefined, there may be situations in which no server can be found at all, or in which a server is found only after several non-responsive alternate servers have been tested. In a highly dynamic, worldwide operating network situation where clients and servers permanently enter or leave the network and where the access pattern to the servers may change from one moment to the next, Gehr's teaching is not adequate.
The European Patent application EP 99109926.8 titled “Improved Availability in Clustered Application Servers” by the same inventors as the current invention is also related to the availability problem and any U.S. Patent based on this EP Application is hereby incorporated herein by reference. But this teaching is solely focused on the side of the application client. To make sure that a certain application request is processed by an available application server, it is suggested to send this application request in a multi-casting step to a multitude of application servers in parallel, assuming that at least one available application server will receive the request. This teaching is completely silent on techniques of how to indicate availability of a certain application server.
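The client-side multi-casting idea can be illustrated with the following minimal sketch, which sends one request to several servers in parallel and accepts the first successful reply. It is not taken from the EP application: the function `multicast_request`, the `send` callback, the use of threads, and the policy of taking the first reply are assumptions made for illustration, and the sketch does not address how duplicate processing of the request would be handled.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Any, Callable, Optional, Sequence, Tuple


def multicast_request(
    request: Any,
    servers: Sequence[str],
    send: Callable[[str, Any], Any],
) -> Optional[Tuple[str, Any]]:
    """Send `request` to all servers in parallel; return (server, reply) for
    the first server that answers without raising, or None if all fail."""
    if not servers:
        return None
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = {pool.submit(send, server, request): server for server in servers}
        for future in as_completed(futures):
            try:
                return futures[future], future.result()
            except Exception:
                # This server was unavailable; wait for the next reply.
                continue
    return None


if __name__ == "__main__":
    def fake_send(server: str, request: str) -> str:
        # Hypothetical transport: only "server-b" is currently available.
        if server != "server-b":
            raise ConnectionError(f"{server} is not available")
        return f"{server} processed {request}"

    print(multicast_request("order-42", ["server-a", "server-b"], fake_send))
```

In this sketch the executor still waits for the remaining in-flight calls to finish when the `with` block exits, so the caller returns only after every server has either replied or failed.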
From the same inventors a further European Patent application EP 99122914.7 titled “Improving Availability and Scalability in Clustered Application Servers” is known and any U.S. Patent based on this EP Application is hereby incorporated herein by reference. In this application the existence of a technique to determine availability of an application server is already assumed as a starting point. This teaching then focuses on a technique by which an application client can perform workload balancing by selecting a certain application server to process an application request.
Despite the progress thus far, further improvements are urgently required to support enterprises in increasing the availability of their applications and to allow, for instance, for electronic business on a 7 (days) * 24 (hours) basis; due to the ubiquity of worldwide computer networks, at any point in time somebody might be interested in accessing a certain application server.
The invention is based on the objective to provide an improved method and means for indicating availability of application servers to accept application requests and to provide an improved method and means for determining, by an application client, the availability of an application server.
It is a further objective of the invention to increase the availability by providing a technology that is highly responsive to dynamic changes of the availability of individual application servers within the network.