Highly available systems are fault-tolerant systems with no single point of failure. Highly available services are typically provided by large and complex systems built from Commercial-Off-The-Shelf (COTS) components. Such systems are deployed on top of standardized middleware services that manage service availability by monitoring component health and by shifting workload from a faulty component to a healthy one.
The Service Availability Forum (SA Forum) is a consortium of industry-leading companies promoting a set of open specifications that enable the creation and deployment of highly available, mission-critical services. As a standardization body, the SA Forum has defined a set of open specifications for middleware services, including the Availability Management Framework (AMF) for supporting and managing service availability (see SA Forum, Application Interface Specification, Availability Management Framework, SAI-AIS-AMF-B.04.01). Specifically, the AMF specification describes a middleware service that is responsible for maintaining and managing the high availability of the services provided by applications. The AMF specification aims at reducing application development time and cost by shifting availability management from the applications to this middleware service. This middleware service (referred to hereinafter as the AMF) manages the redundancy of the components of an application and dynamically assigns the workload to each component.
Researchers have developed various techniques for analyzing the availability of a highly available system. However, existing techniques do not target the availability analysis of AMF configurations in a generic context.
For example, a runtime system can be modeled with Markov chains and its availability can be analyzed based on data collected at runtime (D. Wang, K. S. Trivedi, “Modeling User-Perceived Service Availability,” in Proc. of the Second International Service Availability Symposium (ISAS), LNCS Vol. 3694, pp. 107-122, Berlin, Germany, Apr. 25-26, 2005). That work does not present a generic method for the availability analysis of AMF configurations; instead, it defines a model for a particular runtime system. Because it does not define a generic approach, its analysis cannot be reused in a generic context for evaluating the availability of the services in any AMF configuration.
Other works target availability analysis in a more generic context and are specified in the Unified Modeling Language (UML) (see, e.g., A. Bondavalli, Majzik, Mura, “Automatic dependability analysis for supporting design decisions in UML,” 4th IEEE International Symposium on High-Assurance Systems Engineering, pp. 64-71, 1999). That work describes a stochastic model that can subsequently be solved to quantify the expected availability. However, it does not target AMF configurations, and the constructs of the model used to describe the system are not aligned with the constructs specified in AMF configurations.
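As a rough illustration of what solving a stochastic model for availability involves, the following minimal sketch computes the steady-state availability of the simplest possible case: a single component modeled as a two-state (up/down) continuous-time Markov chain. It is not taken from any of the cited works and is not an AMF-specific analysis. The function name and the failure and repair figures are hypothetical and chosen only for illustration.

```python
def steady_state_availability(failure_rate: float, repair_rate: float) -> float:
    """Steady-state availability of a two-state (up/down) Markov chain.

    failure_rate: rate of leaving the "up" state (1 / mean time to failure).
    repair_rate:  rate of leaving the "down" state (1 / mean time to repair).

    Balancing the probability flow between the two states,
    P(up) * failure_rate = P(down) * repair_rate, together with
    P(up) + P(down) = 1, gives P(up) = repair_rate / (failure_rate + repair_rate).
    """
    if failure_rate < 0 or repair_rate <= 0:
        raise ValueError("failure_rate must be >= 0 and repair_rate must be > 0")
    return repair_rate / (failure_rate + repair_rate)


# Hypothetical figures (assumptions, not from the source):
# mean time to failure of 1000 hours, mean time to repair of 2 hours.
mttf_hours = 1000.0
mttr_hours = 2.0
availability = steady_state_availability(1.0 / mttf_hours, 1.0 / mttr_hours)
print(f"Steady-state availability: {availability:.6f}")
```

A model of a redundant, multi-component system has many more states and transitions than this, which is why the way the model is constructed from the system description matters.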