1. Field of the Invention
The present invention relates to distributed data systems, and more particularly to managing dynamic cluster membership in distributed data systems.
2. Description of Related Art
In distributed data systems, data may be stored in several locations. Such locations may include servers, computers, or other devices with storage devices or access to storage devices. Storage devices may include hard drives, memory, registers, and other media where data can be stored and retrieved. A distributed data system may span a large network or combination of networks, for example the Internet or a local intranet, or may simply involve a plurality of storage devices connected to a computing device. The data may be distributed in blocks of specific sizes, by file, or in any fashion consistent with the space constraints of the available storage devices.
Cooperating members of a distributed data system may form clusters to provide transparent data access and data locality for clients, hiding from the clients the possible complexity of the data distribution. FIG. 1 illustrates a distributed data system in which nodes 110 form a cluster 100, each node including storage space for distributed data 111. Other nodes may exist that are not part of the cluster. Data for any clients of the cluster nodes 110 may be distributed in the data stores 111 of the cluster nodes 110. Nodes may be servers, computers, or other computing devices. Nodes may also be computing processes, so that multiple nodes may exist on the same server, computer, or other computing device. Nodes forming a cluster may communicate over connections such as electrical couplings or wireless connections.
Clustering of nodes may enable load balancing, high availability, and scalability to support client requirements or improve performance. In the event of failure, for example, data backup at multiple locations in a cluster may provide high availability so that data is not lost. Different nodes may be able to provide data or take over tasks for each other. Maintaining high availability generally involves multiple nodes maintaining redundant data. Redundant data may be maintained by replicating data between nodes, for example between multiple processes on the same or different servers, by replicating the data on different servers, or generally by ensuring that duplicate data exists in different actual or virtual locations.
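The redundancy described above can be sketched as follows. This is a minimal illustration only, assuming in-memory data stores and a fixed number of copies per write; the names `Node`, `replicate_write`, and `read` are hypothetical and are not taken from any particular system.

```python
class Node:
    """Hypothetical cluster node holding an in-memory data store."""

    def __init__(self, name):
        self.name = name
        self.store = {}    # stands in for the node's storage space
        self.alive = True  # cleared to simulate a node failure


def replicate_write(nodes, key, value, copies=2):
    """Store value on the first `copies` live nodes (an assumed placement rule).

    Returns the names of the nodes that received a copy.
    """
    targets = [n for n in nodes if n.alive][:copies]
    for node in targets:
        node.store[key] = value
    return [node.name for node in targets]


def read(nodes, key):
    """Return the value from any live node that holds a copy."""
    for node in nodes:
        if node.alive and key in node.store:
            return node.store[key]
    raise KeyError(key)
```

For example, after `replicate_write(nodes, "k", "v")` on three nodes, marking the first node as failed still leaves a copy on the second node, so `read(nodes, "k")` continues to return `"v"`.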
Clusters may also be used to address problems of data ownership and data consistency when failures occur in a cluster. A dynamic cluster is one whose membership changes over time. Such changes may occur as a result of failures, and dynamic cluster membership management involves tracking the membership of the cluster as it changes. Failure events may include node failures in a network, unresponsive nodes or processes, process failures, events preventing a node from operating in a cluster, or other events that can lead to a non-functioning cluster. Changes in the cluster may also occur when members rejoin or new members join the cluster, affecting the relationship between cluster participants.
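One way to picture membership tracking is a heartbeat table with a timeout. The sketch below is an assumption made for illustration, not a mechanism described in this section: the class `MembershipTracker`, its `timeout` parameter, and the heartbeat rule are all hypothetical. Time is passed in explicitly so the behavior is deterministic.

```python
class MembershipTracker:
    """Hypothetical tracker: a node is a member while its heartbeat is recent."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated before a node is dropped
        self.last_seen = {}     # node id -> time of the most recent heartbeat

    def heartbeat(self, node_id, now):
        """Record a heartbeat; also covers new members and rejoining members."""
        self.last_seen[node_id] = now

    def members(self, now):
        """Return the sorted ids of nodes heard from within the timeout."""
        return sorted(
            node_id
            for node_id, seen in self.last_seen.items()
            if now - seen <= self.timeout
        )
```

With a timeout of 5 seconds, a node that last reported at time 0 is no longer listed at time 6, and it reappears in the membership as soon as it reports again, which corresponds to a member rejoining the cluster.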
One solution for dynamic cluster membership is a centralized master and slave topology, for example a star topology with the central node acting as the master. However, a single centralized master serving multiple slaves may create a bottleneck. Such a topology may negatively impact scalability, and frequent data updates between master and slaves may result in lower performance. The ability of slaves to get membership information about each other may be limited. The failure of the central node itself may spawn complex computing problems, particularly in the event of multiple node failures. Addressing node failure may include implementing leader elections by remaining nodes, for example.
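The master failure case can be illustrated with a small sketch. The election rule used here (the lowest surviving node identifier becomes master) is an assumption chosen for brevity, not a rule stated in this section, and the names `elect_master` and `StarCluster` are hypothetical.

```python
def elect_master(live_node_ids):
    """Pick a master among the live nodes (assumed rule: lowest id wins)."""
    if not live_node_ids:
        raise ValueError("no live nodes to elect from")
    return min(live_node_ids)


class StarCluster:
    """Hypothetical star topology: one master, all other live nodes are slaves."""

    def __init__(self, node_ids):
        self.live = set(node_ids)
        self.master = elect_master(self.live)

    def fail(self, node_id):
        """Remove a failed node; the remaining nodes re-elect if it was the master."""
        self.live.discard(node_id)
        if node_id == self.master:
            self.master = elect_master(self.live)

    def slaves(self):
        """Return the sorted ids of the live nodes other than the master."""
        return sorted(self.live - {self.master})
```

In a cluster built from nodes 3, 1, and 2, node 1 starts as master; after `fail(1)`, node 2 becomes master and node 3 is the only slave. The sketch also shows why the topology is fragile: if every node fails, no election is possible and `elect_master` raises an error.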
Topology management may be needed whatever the topology of the distributed system, for example to handle nodes entering or exiting the cluster. Changes in cluster membership or topology may affect access to the distributed data stored in the distributed system. Typically, cluster membership management is handled as an integral part of the distributed data management since membership changes may affect distributed data access and distributed data access may vary depending on topology.