A cluster is a set of nodes, each with a unique identifier within the cluster, connected by a communication network. The membership of the cluster changes as nodes join and leave. The Cluster Membership Service allows a service application process to retrieve information about the nodes and the membership. It also allows a service application process to register callback functions in order to receive notifications of membership changes as they occur. These notifications make it possible to provide failover service, and thus High Availability (HA) services, for networks such as computer systems or communication networks having interconnected communication nodes or servers. For example, the Service Availability™ Forum (SAF) specifications define high availability services and service continuity requirements for end-users. Achieving service continuity means maintaining customer data and session state without disruption across switchover or other fault-recovery scenarios. The reader interested in more information relating to the SAF middleware standard specification and HA applications is referred to SAF AIS B 03, which is available at www.saforum.org/specification.
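The callback-based notification described above can be illustrated with a minimal in-memory sketch. It is not the SAF Cluster Membership Service API: the class `ClusterMembership`, its methods, and the callback signature are hypothetical names chosen for illustration only.

```python
from typing import Callable, Dict, List

# Hypothetical callback signature: (event, node_id, current_member_ids)
MembershipCallback = Callable[[str, int, List[int]], None]


class ClusterMembership:
    """Toy in-memory stand-in for a cluster membership service."""

    def __init__(self) -> None:
        self._nodes: Dict[int, str] = {}  # node id -> node name
        self._callbacks: List[MembershipCallback] = []

    def track(self, callback: MembershipCallback) -> None:
        """Register a callback invoked on every membership change."""
        self._callbacks.append(callback)

    def members(self) -> List[int]:
        """Return the identifiers of the current member nodes, sorted."""
        return sorted(self._nodes)

    def join(self, node_id: int, name: str) -> None:
        # Node identifiers must be unique within the cluster.
        if node_id in self._nodes:
            raise ValueError(f"node id {node_id} is already a member")
        self._nodes[node_id] = name
        self._notify("joined", node_id)

    def leave(self, node_id: int) -> None:
        del self._nodes[node_id]  # raises KeyError for an unknown node
        self._notify("left", node_id)

    def _notify(self, event: str, node_id: int) -> None:
        snapshot = self.members()
        for callback in self._callbacks:
            callback(event, node_id, snapshot)
```

A service application process would call `track` once and then react to `"left"` events, for instance by starting a switchover for the services that were hosted on the departed node.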
In a SAF cluster, HA services are distributed across the entire cluster and are provided to HA applications in a transparent manner. Such HA clusters operate by having redundant computers or nodes that provide continuous service when system components fail. Because the service should not be interrupted, a security domain should guarantee security without interrupting service availability. A service domain is the set of service application processes that are grouped together to provide a service in the cluster, for example cluster membership, security service, messaging service, event service, or any other service.
There are active and standby service application processes. Often in HA systems, the process life cycle and availability are monitored in order to maintain high availability. This functionality is provided by an availability management service (for example, the AMF defined in [SAF-AMF]). In this model, each process registers with the Availability Management Framework (AMF) under a defined component name (this component name can be communicated to the process by the system management, e.g. through a UNIX-type environment variable). The component name represents the process functionality in the service domain and can be expressed in an LDAP name format.
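The registration step can be sketched as follows, under stated assumptions. The environment variable name `HA_COMPONENT_NAME`, the `AvailabilityManager` class, and the example name `cn=auth-service,ou=security` are all hypothetical; they are not taken from [SAF-AMF]. The parser only splits a simple LDAP-style name on commas and does not handle escaped characters.

```python
import os
from typing import Dict, List, Mapping, Optional, Tuple

ENV_VAR = "HA_COMPONENT_NAME"  # hypothetical variable set by system management


def parse_component_name(name: str) -> List[Tuple[str, str]]:
    """Split a simple LDAP-style name into (attribute, value) pairs."""
    pairs = []
    for rdn in name.split(","):
        attr, sep, value = rdn.strip().partition("=")
        if not sep or not attr or not value:
            raise ValueError(f"malformed component name: {name!r}")
        pairs.append((attr, value))
    return pairs


class AvailabilityManager:
    """Toy registry standing in for an availability management service."""

    def __init__(self) -> None:
        self._components: Dict[str, int] = {}  # component name -> process id

    def register(self, name: str, pid: int) -> None:
        parse_component_name(name)  # reject malformed names early
        if name in self._components:
            raise ValueError(f"component {name!r} is already registered")
        self._components[name] = pid

    def registered(self) -> Dict[str, int]:
        return dict(self._components)


def register_from_environment(
    manager: AvailabilityManager,
    environ: Mapping[str, str] = os.environ,
    pid: Optional[int] = None,
) -> str:
    """Read the component name from the environment and register under it."""
    name = environ[ENV_VAR]
    manager.register(name, os.getpid() if pid is None else pid)
    return name
```

Passing `environ` and `pid` explicitly keeps the sketch testable without touching the real process environment.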
In case of failure of active service application processes, the service needs to be switched over to the standby service application processes. Standby service application processes receive information from the active processes (for example, checkpoints) in order to be ready to provide functionality after a failure of the active processes. An active service application process needs to be authenticated before it can access and provide services in a service domain. A standby service application process that takes over for a failed active service application process therefore also needs to be authenticated in the cluster. However, in existing models the authentication of the standby service application process may take a long time, thus delaying the continuation of the service. Using a central authentication server for authenticating users also takes a long time in HA standards. Little has been done to avoid time delays in authenticating a standby service application process when it takes over for an active process in distributed systems.
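The takeover sequence in the existing models described above can be sketched as a toy example. All names (`ServiceProcess`, `authenticate`, `fail_over`) are hypothetical, and the credential comparison merely stands in for whatever authentication exchange a real cluster performs; the point is only that authentication sits on the critical path between restoring the checkpoint and resuming service.

```python
import hmac
from typing import Dict, List


class ServiceProcess:
    """Toy service application process with a role and checkpointable state."""

    def __init__(self, name: str, role: str) -> None:
        self.name = name
        self.role = role  # "active" or "standby"
        self.state: Dict[str, str] = {}
        self.authenticated = False

    def checkpoint(self) -> Dict[str, str]:
        """Return a copy of the state for transfer to the standby."""
        return dict(self.state)

    def apply_checkpoint(self, snapshot: Dict[str, str]) -> None:
        self.state = dict(snapshot)


def authenticate(process: ServiceProcess, presented: str, expected: str) -> bool:
    """Placeholder for a possibly slow authentication exchange."""
    process.authenticated = hmac.compare_digest(
        presented.encode(), expected.encode()
    )
    return process.authenticated


def fail_over(
    standby: ServiceProcess,
    last_checkpoint: Dict[str, str],
    presented: str,
    expected: str,
) -> List[str]:
    """Promote the standby; service resumes only after authentication."""
    steps = []
    standby.apply_checkpoint(last_checkpoint)
    steps.append("restore")
    if not authenticate(standby, presented, expected):
        raise PermissionError(f"{standby.name} failed authentication")
    steps.append("authenticate")
    standby.role = "active"
    steps.append("promote")
    return steps
```

In this sketch, any latency in `authenticate` directly extends the interval during which no process is serving, which is the delay the paragraph above identifies.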