In some computer systems, it is important to maximize the availability of critical services and applications. Generally, this is achieved by using a fault tolerant system or by using high availability (“HA”) software, which is implemented on a cluster of multiple nodes. Both types of systems are described briefly in “A High-Availability Cluster for Linux,” Phil Lewis (May 2, 2000).
A fault tolerant computer system includes duplicate hardware and software. For example, a fault tolerant server may have redundant power supplies, storage devices, fans, network interface cards, and so on. When one or more of these components fail, the fault is detected, and a redundant component takes over to correct the problem. In many cases, fault tolerant systems are able to provide failure recovery that is nearly seamless (i.e., imperceptible to system users). However, because these systems rely on duplicate hardware, they tend to be expensive. In addition, these systems typically are proprietary and are tightly coupled to the operating system, whatever that system may be.
HA software also provides fault detection and correction procedures. Generally, an HA software implementation is loosely coupled to the operating system, and therefore may be more portable to different types of systems and nodes than a fault tolerant system would be.
In contrast to fault tolerant systems, HA software is implemented on two or more nodes, which are arranged in a “cluster” and communicate over a link (e.g., a network). Typically, one node operates as the “master” for a particular application, where the master is responsible for executing the application. One or more other nodes within the cluster are “slaves” for that application, where each slave is available to take over the application from a failed master, if necessary.
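The master/slave arrangement described above can be illustrated with a minimal sketch. The `Application` class, its field names, and the rule of promoting the first listed slave are assumptions made purely for illustration; an actual HA implementation would typically choose the new master through an election among the slaves.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Application:
    """Hypothetical record of which node runs an application in a cluster."""
    name: str
    master: str                                       # node currently executing the application
    slaves: List[str] = field(default_factory=list)   # nodes able to take over

    def fail_master(self) -> str:
        """Simulate a master failure and promote a slave.

        Assumption: the first slave in the list is promoted. This stands in
        for whatever election procedure a real cluster would use.
        """
        if not self.slaves:
            raise RuntimeError(f"no slave available to take over {self.name}")
        failed = self.master
        self.master = self.slaves.pop(0)
        return failed


app = Application(name="routing", master="node1", slaves=["node2", "node3"])
failed_node = app.fail_master()
print(f"{failed_node} failed; {app.master} is now master of {app.name}")
```

Note that in this sketch the failed node is simply dropped from the application's node list; it does not rejoin as a slave or reclaim mastership when it recovers.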
The “Time Synchronization Protocol” (TSP) is an example of an HA protocol of this kind, and is used by the clock synchronization programs timed and TEMPO. TSP is described in detail in “The Berkeley UNIX Time Synchronization Protocol,” Gusella et al. (1986). TSP supports messages for the election that occurs among slaves when, for any reason, the master disappears, as is described in detail in “An Election Algorithm for a Distributed Clock Synchronization Program,” Gusella et al. (December 1985).
Often, applications are defined on certain nodes for a reason. For example, in a simple routing application, each node in a cluster may have preferred routes to various network segments, even though a node can get to any segment using less desirable, alternate routes. Each node is a master for a single routing application and is a slave for other nodes' routing applications. Accordingly, if a first node or its routing application is down (e.g., the node or the application failed), a second node can take over and become master of the first node's routing application. While the first node or its routing application is down, the data streams that would otherwise have passed through the node are sent using a different route, which may be less optimal. When the first node and/or its routing application comes back up, it is desirable to have the first node's routing application re-acquire its status as master as quickly as possible, in order to restore the optimal route.
Similarly, it may be desirable for a master to fail over to a specific slave, rather than allowing a slave arbitrarily to be promoted to master through the election process. For example, using the routing example given above, when a first node is being taken down, it may be advantageous to resign the node's routing application to a second node that can still provide efficient routing for network traffic.
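The difference between an arbitrary promotion and a directed resignation can be sketched as follows. The function `next_master` and its `preferred` parameter are hypothetical names introduced only for illustration, and the random choice merely stands in for an election whose outcome the resigning master does not control.

```python
import random
from typing import List, Optional


def next_master(slaves: List[str], preferred: Optional[str] = None) -> str:
    """Pick the node that becomes master after the current master steps down.

    Assumption: with no preferred slave, any slave may win, which is modeled
    here by a random choice. With a preferred slave, mastership is handed to
    that node directly.
    """
    if not slaves:
        raise ValueError("no slaves available")
    if preferred is not None:
        if preferred not in slaves:
            raise ValueError(f"{preferred} is not a slave for this application")
        return preferred
    return random.choice(slaves)


slaves = ["node2", "node3"]
# Arbitrary promotion: either slave may end up as master.
print("elected:", next_master(slaves))
# Directed resignation: the node with the better route is chosen explicitly.
print("resigned to:", next_master(slaves, preferred="node3"))
```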
One disadvantage of current HA systems is that, once a node has relinquished its status as master for an application instance (e.g., through a node failure or resignation), the original master cannot easily recover that status later. Another disadvantage is that current HA systems provide no way of failing over or resigning to a particular slave.
Accordingly, what are needed are a protocol, apparatus, and method that enable a node to efficiently take over as master. Further needed are a protocol, apparatus, and method that enable a master to fail over or resign to a particular slave, when desired.