A cluster is a collection of one or more complete systems, having associated processes, that work together to provide a single, unified computing capability. From the perspective of the end user, such as a business, the cluster operates as though it were a single system. Work can be distributed across multiple systems within the cluster. Any single outage in the cluster, whether planned or unplanned, will not disrupt the services provided to the end user. That is, end user services can be relocated from system to system within the cluster in a relatively transparent fashion.
Generally, before the availability benefits of clustering technology can be realized, a cluster requires configuration, a task typically undertaken by a system administrator. Configuring a cluster requires determining what the cluster member configurations are and on which nodes, i.e., application server middleware installations, these configurations reside. As a result, in many systems, cluster configuration is essentially static. Even presuming that the static configuration is error free, which may not be true, configuring still requires a system administrator to invest significant time and planning. This time and planning is costly, and becomes more costly still if errors in the cluster configuration require further time and planning to correct.
Clusters may also be used to address problems of data ownership and data consistency when failures occur in a cluster. A dynamic cluster involves changes in the membership of the cluster over time. Such changes may occur as a result of failures, and dynamic cluster membership management involves tracking the membership of the cluster as it changes. Failure events may include node failures in a network, unresponsive nodes or processes, process failures, events preventing a node from operating in a cluster, or other events that can lead to a non-functioning cluster. Changes in the cluster may also occur when members rejoin or new members join the cluster, affecting the relationship between cluster participants.
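As an illustration of what tracking membership over time can involve, the following is a minimal sketch that assumes a heartbeat-and-timeout scheme. The source does not specify any detection mechanism, so the class name `MembershipTracker`, its methods, and the timeout value are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field


@dataclass
class MembershipTracker:
    """Hypothetical tracker: a node is a member while its heartbeats are recent."""

    timeout: float  # assumed: seconds without a heartbeat before eviction
    last_seen: dict = field(default_factory=dict)

    def heartbeat(self, node_id: str, now: float) -> None:
        # A first heartbeat acts as a join; one arriving after eviction is a rejoin.
        self.last_seen[node_id] = now

    def evict_unresponsive(self, now: float) -> list:
        # Drop every node whose most recent heartbeat is older than the timeout.
        failed = sorted(
            node for node, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
        for node in failed:
            del self.last_seen[node]
        return failed

    def members(self) -> list:
        # Current membership view, sorted for stable comparison.
        return sorted(self.last_seen)


tracker = MembershipTracker(timeout=5.0)
tracker.heartbeat("a", now=0.0)
tracker.heartbeat("b", now=0.0)
tracker.heartbeat("a", now=4.0)          # "b" goes silent after time 0
evicted = tracker.evict_unresponsive(now=6.0)
tracker.heartbeat("b", now=7.0)          # "b" rejoins
```

Timestamps are passed in explicitly rather than read from a clock so that the behavior is easy to reason about; a real system would also have to distinguish a failed node from a slow or partitioned one, which this sketch does not attempt.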
One solution for dynamic cluster membership is a centralized master and slave topology, for example a star topology with the central node acting as the master. However, using a single centralized master and multiple slaves may create a bottleneck. Such a topology may negatively impact scalability, and frequent data updates between master and slaves may result in lower performance. The ability of slaves to get membership information about each other may be limited. The failure of the central node itself may spawn complex computing problems, particularly in the event of multiple node failures. Addressing node failure may include, for example, implementing leader elections by the remaining nodes.
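To make the leader-election idea concrete, here is a minimal sketch that assumes the simplest possible rule: the surviving node with the lowest identifier becomes the new master. The source does not name an election algorithm, so this rule and the function names `elect_leader` and `handle_failures` are assumptions made for illustration.

```python
def elect_leader(alive_nodes):
    """Assumed rule: the smallest node ID among survivors wins; None if no survivors."""
    return min(alive_nodes) if alive_nodes else None


def handle_failures(members, master, failed):
    """Remove failed nodes and elect a new master only if the old one is gone."""
    survivors = set(members) - set(failed)
    if master in survivors:
        # The central node is still alive, so the star topology is unchanged.
        return survivors, master
    return survivors, elect_leader(survivors)


# Master (node 1) and one slave (node 2) fail together.
survivors, new_master = handle_failures({1, 2, 3, 4}, master=1, failed={1, 2})
```

This sketch presumes that every surviving node already agrees on which nodes have failed. Reaching that agreement is the hard part in practice, and is one reason multiple simultaneous failures complicate recovery in a centralized topology.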