In computer storage, logical volume management is a flexible method of allocating space on mass-storage devices. In particular, a volume manager can concatenate, stripe together or otherwise combine underlying physical partitions into larger, virtual ones. An administrator can then re-size or move logical volumes, potentially without interrupting system use.
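As a rough illustration of the difference between concatenating and striping, the following sketch maps a logical block number to a position on an underlying partition. The function names, the block-based addressing, and the fixed stripe width are assumptions made for this example only; they do not describe any particular volume manager.

```python
def concat_map(lba, extent_sizes):
    """Map a logical block to (extent_index, offset) in a concatenated volume.

    extent_sizes lists the size, in blocks, of each underlying partition in
    the order in which the partitions are concatenated (hypothetical layout).
    """
    if lba < 0:
        raise IndexError("logical block number must be non-negative")
    for index, size in enumerate(extent_sizes):
        if lba < size:
            return index, lba
        lba -= size  # skip past this partition and try the next one
    raise IndexError("logical block lies beyond the end of the volume")


def stripe_map(lba, num_extents, stripe_blocks):
    """Map a logical block to (extent_index, offset) in a striped volume.

    Consecutive runs of stripe_blocks blocks rotate across num_extents
    equally sized partitions (hypothetical layout).
    """
    if lba < 0:
        raise IndexError("logical block number must be non-negative")
    stripe_number, within = divmod(lba, stripe_blocks)
    row, extent_index = divmod(stripe_number, num_extents)
    return extent_index, row * stripe_blocks + within
```

In the concatenated case a partition is filled completely before the next one is used, whereas in the striped case successive runs of blocks alternate between partitions.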
A cluster is a group of computers (nodes) that uses groups of redundant computing resources in order to provide continued service when individual system components fail. More specifically, clusters eliminate single points of failure and provide parallel access to shared resources by having multiple servers, multiple network connections, redundant data storage, etc.
A cluster volume manager extends volume management across the multiple nodes of a cluster, such that each node recognizes the same logical volume layout and sees the same state of all volume resources. Under cluster volume management, a change made to the volume configuration from any node in the cluster is recognized by all nodes of the cluster.
Many cluster volume management protocols are master-slave in nature. For example, in an asymmetric storage configuration, commands to change the shared storage configuration are sent to the master node. The master node executes changes to the shared storage configuration, and propagates the changes to the slave nodes in the cluster.
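The master-slave flow described above can be sketched as follows. This is a minimal in-memory model, not an actual cluster volume management protocol: the `Node` and `Cluster` classes, the `submit` method, and the dictionary used to represent the shared storage configuration are all hypothetical names introduced for illustration.

```python
class Node:
    """A cluster node holding its local view of the shared configuration."""

    def __init__(self, node_id):
        self.node_id = node_id
        self.config = {}  # hypothetical representation of volume configuration

    def apply(self, change):
        self.config.update(change)


class Cluster:
    """Routes configuration commands to the master, which propagates them."""

    def __init__(self, node_ids, master_id):
        self.nodes = {node_id: Node(node_id) for node_id in node_ids}
        if master_id not in self.nodes:
            raise KeyError("master must be a member of the cluster")
        self.master_id = master_id

    def submit(self, from_node_id, change):
        """Send a configuration change, issued on any node, to the master."""
        if from_node_id not in self.nodes:
            raise KeyError("unknown node")
        # The master executes the change first ...
        self.nodes[self.master_id].apply(change)
        # ... and then propagates it to every slave node.
        for node_id, node in self.nodes.items():
            if node_id != self.master_id:
                node.apply(change)


cluster = Cluster(node_ids=[1, 2, 3], master_id=1)
cluster.submit(from_node_id=3, change={"vol01": {"size_gb": 100}})
```

After the call to `submit`, every node in this model holds the same configuration, regardless of which node issued the command.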
When the master node leaves the cluster, another node in the cluster is selected to become the new master. Conventionally, the logic used to select the new master node is simplistic (e.g., select the next node ID, select the node with the lowest ID, etc.) and sometimes static (e.g., an administrator manually defines which node is to become the next master). However, selecting an appropriate master is critical, and a poor selection may lead to a number of serious problems.
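The conventional selection policies mentioned above might look like the following sketch. The function names and the use of integer node IDs are assumptions for illustration; the point is only that each policy consults nothing but node IDs or a fixed administrator-supplied list.

```python
def select_lowest_id(surviving_ids):
    """Select the surviving node with the lowest ID."""
    if not surviving_ids:
        raise ValueError("no surviving nodes")
    return min(surviving_ids)


def select_next_id(surviving_ids, departed_master_id):
    """Select the next higher ID after the departed master, wrapping around."""
    if not surviving_ids:
        raise ValueError("no surviving nodes")
    ordered = sorted(surviving_ids)
    for node_id in ordered:
        if node_id > departed_master_id:
            return node_id
    return ordered[0]  # no higher ID remains, so wrap to the lowest


def select_static(surviving_ids, admin_preference):
    """Select the first administrator-listed node still in the cluster."""
    for node_id in admin_preference:
        if node_id in surviving_ids:
            return node_id
    raise LookupError("no administrator-listed node is available")
```

None of these functions examines the current condition of the candidate node, which is why the selection can go wrong.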
If full or partial connectivity to underlying storage is lost from the new master node, the new master would be unable to serve any master-slave protocol that depends on access to storage from the master node. Thus, all such protocols would be disabled cluster-wide. Additionally, if the computational resources (e.g., central processing unit (“CPU”) capacity, network bandwidth, storage bandwidth, etc.) on the new master node are not sufficient, performance problems may result that affect all dependent master-slave protocols. Furthermore, nodes with certain characteristics should not be selected as master nodes in the first place. Some examples include a node that is not centrally located geographically (e.g., a node in an off-shore data center as opposed to corporate headquarters), a node with constraints regarding which applications it can serve, and a node specifically designated by an administrator as not being suitable for use as a master (based on, e.g., experience or subjective knowledge). In a conventional selection methodology in which a new master node is selected based on, e.g., node ID, a node with any of these properties could be selected as the master.
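To make the problem concrete, the following sketch shows a lowest-ID selection choosing a node that has lost storage connectivity. The `NodeInfo` fields, the `unsuitability_reasons` helper, and the 20% CPU headroom threshold are hypothetical and serve only to illustrate properties that an ID-based selection never examines.

```python
from dataclasses import dataclass


@dataclass
class NodeInfo:
    """Hypothetical per-node facts that ID-based selection ignores."""
    node_id: int
    has_storage_access: bool = True
    spare_cpu_fraction: float = 1.0
    admin_excluded: bool = False


def select_lowest_id(nodes):
    """Conventional policy: choose the node with the lowest ID."""
    return min(nodes, key=lambda node: node.node_id)


def unsuitability_reasons(node, min_spare_cpu=0.2):
    """List the reasons, if any, why the node is a poor choice of master."""
    reasons = []
    if not node.has_storage_access:
        reasons.append("lost storage connectivity")
    if node.spare_cpu_fraction < min_spare_cpu:
        reasons.append("insufficient CPU headroom")
    if node.admin_excluded:
        reasons.append("excluded by administrator")
    return reasons


survivors = [
    NodeInfo(node_id=1, has_storage_access=False),
    NodeInfo(node_id=2),
    NodeInfo(node_id=3, admin_excluded=True),
]
chosen = select_lowest_id(survivors)
print(chosen.node_id, unsuitability_reasons(chosen))
```

Here node 1 is selected solely because it has the lowest ID, even though node 2 is the only survivor without a known problem.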
It would be desirable to address these issues.