In the early days of computing, computer systems were stand-alone devices accessed by computer users via input/output (“I/O”) peripheral components, including control-panel toggle switches, Hollerith-card readers, line printers, and eventually cathode-ray-tube (“CRT”) 24-line terminals and keyboards. When a user wished to carry out a computational task on more than one computer system, the user would manually transfer data between the computer systems via Hollerith cards, magnetic tape, and, later, removable magnetic-disk packs.
With the advent of multi-tasking operating systems, computer scientists discovered and addressed the need to synchronize access by multiple, concurrently executing tasks to individual resources, such as peripheral devices and memory, and developed tools for synchronizing and coordinating the concurrent computation of decomposable problems by independent, concurrently executing processes. With the advent of computer networking, formerly independent computer systems could be electronically interconnected to form distributed computer systems. Although initial distributed computer systems were relatively loosely coupled, far more complex, tightly coupled distributed computer systems, based on distributed operating systems and efficient distributed computation models, have since been developed.
There are many different models for, and types of, distributed computing. In some models, relatively independent, asynchronous, peer computational entities execute relatively autonomously on one or more distributed computer systems, with sufficient coordination to produce reliable, deterministic solutions to computational problems and deterministic behavior with respect to external inputs. In other distributed systems, tightly controlled computational entities execute according to pre-determined schedules on distributed computer systems, closely synchronized by various protocols and computational tools. In many fault-tolerant and highly available distributed computer systems, computational tasks are distributed among the individual nodes, or computers, of the distributed computer system in order to distribute the computational load fairly across the nodes. In the event of failure of one or more nodes, surviving nodes can assume, or be assigned, tasks originally distributed to the failed nodes, so that the overall distributed computer system is robust and resilient with respect to individual node failures. However, even in distributed systems of relatively independent peer nodes, it is frequently the case that, for certain tasks, a single node needs to be chosen to be responsible for the task, rather than simply allowing any of the peer nodes to contend for the task or for the subtasks that together compose the task. In other words, a single node is assigned to be, or elected to be, the leader with respect to one or more tasks for which responsibility needs to be invested in a single node. Tasks for which leaders need to be assigned generally include tasks that cannot be efficiently decomposed, iterative tasks with high initial-iteration computational overheads, and tasks that require assembling complex sets of privileges and controls over resources.
Examples of such tasks include: coordinator-type tasks, in which a single node needs to be responsible for distributed state changes related to distributed-system management; distributed-system-updating tasks, including installation of software or software updates on nodes within the distributed system; system-state-reporting tasks, in which a single node needs to be responsible for accessing and reporting the distributed state of a distributed computer system; and, in certain systems, scheduling, distribution, and control tasks for the distributed system.
A leadership-role allocation can be hard-wired, or statically assigned at distributed-system initialization, for all tasks needing a leader, for a subset of those tasks, or for individual tasks. However, relatively static leader assignment may lead to time-consuming and difficult leader-reassignment problems when a leader node fails or becomes incapable of carrying out the tasks required of the leader node. Alternatively, all nodes can contend, on an on-demand basis, for leader roles for tasks requiring a leader, but constant leader-role contention may be inefficient and may even lead to thrashing. Strong-leader self-election based on a distributed consensus service is a useful model for certain of these distributed computer systems and distributed computing tasks. The strong-leader-election method based on a distributed consensus service can be extended to provide strong-leader election for multiple roles within a distributed computer system. However, in more complex distributed computer systems, leadership may need to be allocated for multiple roles on a continuing basis, and leadership may need to be distributed among individual processes running on nodes within a distributed computer system. For these environments, researchers, developers, manufacturers, and users of distributed computer systems have recognized the need for a practical and efficient means for continuous, dynamic allocation of leadership among processes within nodes of a multi-node distributed computer system.
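As an illustration only, self-election against a consensus service for multiple roles can be sketched as follows. The sketch assumes a lease-based scheme, which is a common way to implement such election but is not stated in the text above. The names `ConsensusStore`, `Lease`, `try_acquire`, and `leader` are hypothetical, and a single in-process dictionary stands in for a replicated consensus service. Time is passed in explicitly as a logical clock so that the behavior is deterministic.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Lease:
    holder: str        # node (or process) that currently leads the role
    expires_at: float  # logical time at which the lease lapses


class ConsensusStore:
    """Hypothetical stand-in for a distributed consensus service.

    A real service would replicate this state across nodes and make each
    update atomic; here one in-process dictionary plays that part.
    """

    def __init__(self) -> None:
        self._leases: Dict[str, Lease] = {}

    def try_acquire(self, role: str, node: str, now: float, ttl: float) -> bool:
        """Claim or renew `role` for `node`; return True if `node` now leads."""
        current = self._leases.get(role)
        vacant = current is None or current.expires_at <= now
        if vacant or current.holder == node:
            self._leases[role] = Lease(holder=node, expires_at=now + ttl)
            return True
        return False  # another node holds an unexpired lease

    def leader(self, role: str, now: float) -> Optional[str]:
        """Return the current leader for `role`, or None if the lease lapsed."""
        current = self._leases.get(role)
        if current is not None and current.expires_at > now:
            return current.holder
        return None


if __name__ == "__main__":
    store = ConsensusStore()
    # node-a elects itself leader of the (hypothetical) "updater" role.
    print(store.try_acquire("updater", "node-a", now=0.0, ttl=5.0))
    # node-b contends while the lease is live and is turned away.
    print(store.try_acquire("updater", "node-b", now=1.0, ttl=5.0))
    # node-a stops renewing (for example, it fails); node-b takes over later.
    print(store.try_acquire("updater", "node-b", now=6.0, ttl=5.0))
    print(store.leader("updater", now=7.0))
```

In this sketch each role has its own lease, so different roles can be led by different nodes at the same time, and a failed leader is replaced once its lease lapses without any manual reassignment. A lease scheme of this kind depends on reasonably synchronized clocks or on a clock supplied by the consensus service itself, which the sketch does not model.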