Distributed computing systems, especially those having parallel architectures, employ a plurality of processing elements that operate logically concurrently to perform a single task. The processing elements might be individual processors linked together in a network or a plurality of software instances operating concurrently in a coordinated environment. In a network configuration, the processors communicate with each other through a network that supports a network protocol. This protocol might be implemented using a combination of hardware and software components. In a coordinated software environment, the software instances are logically connected through some communication medium such as an Ethernet network. The software instances are referred to individually as members and together as a group.
The processing elements typically communicate with each other by sending and receiving messages or packets through a common interface. A processing element typically makes a collective call at the common interface, and the call is coordinated among a subset of the processing elements. The elements involved in the call are referred to as participants. In many applications, frequently used collective calls are provided in a common communication library such as the industry standard Message Passing Interface. Further details on this standard are described in "MPI: A Message-Passing Interface Standard," published by the University of Tennessee, 1994 (hereinafter, MPI). The communication library not only eases parallel programming, debugging, and portability but also improves the efficiency of the communication processes of the application programs.
One of the problems associated with a distributed system is that it is not possible to guarantee that failure detection by the processing elements is consistent enough to give each correctly functioning member an accurate and consistent view of the current membership. Thus, failure or tardiness of some processing elements may not be consistently detected and reported by the other elements. Inconsistency in these reports makes recovery from system failures difficult. The problem of keeping the members' views of the membership accurate and consistent is called the membership problem. Members join a group because of events external to the cooperative task assigned to the group. Members leave a group either because of such external events or because of a failure of the member or of computing resources on which the member depends. These external and failure events are called membership events. Ideally, within a short time after any membership event, all remaining members of the group would have the same accurate view of the group. Thus, an ideal membership protocol would have these features: (1) it is triggered by some membership event, (2) it requires at most a fixed amount of time to complete, (3) it results in complete consistency of the views of the remaining members, and (4) it leaves each remaining member with a view consisting of exactly the set of remaining members. It is known that no such ideal membership protocol is possible in the presence of crash failures and lost messages.
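Features (3) and (4) can be stated as simple checks over the views held by the remaining members. The following is a minimal sketch for illustration only; the `Views` mapping, the function names, and the member names are hypothetical and do not come from any cited protocol.

```python
from typing import Dict, FrozenSet, Set

# Hypothetical representation: member name -> the set of members it believes
# are currently in the group.
Views = Dict[str, FrozenSet[str]]


def is_consistent(views: Views, remaining: Set[str]) -> bool:
    """Feature (3): every remaining member holds the same view."""
    return len({views[m] for m in remaining}) <= 1


def is_accurate(views: Views, remaining: Set[str]) -> bool:
    """Feature (4): each remaining member's view is exactly the remaining set."""
    target = frozenset(remaining)
    return all(views[m] == target for m in remaining)


# Member "c" has left the group; "a" and "b" remain.
remaining = {"a", "b"}
ideal = {"a": frozenset({"a", "b"}), "b": frozenset({"a", "b"})}
stale = {"a": frozenset({"a", "b", "c"}), "b": frozenset({"a", "b", "c"})}
```

In this example the `stale` views agree with each other, so they are consistent, but they still list the departed member and are therefore not accurate. The two features are independent of one another.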
One strategy for approximating an ideal membership protocol is to assume a high degree of synchrony in the computation and the transport layer of the processing elements. Protocols built on this assumption are referred to as synchronous agreement protocols; one example is described by Dwork et al. in U.S. Pat. No. 5,513,354. This patent discloses a method for managing tasks in a network of processors in which the processors exchange views as to which processors have failed and update their views based on the views received from the other processors. After a number of synchronous rounds of exchange, the operational processors reach an eventual agreement as to the status of the processors in the system. A failure in the assumed synchrony would lead to either inconsistency or the problem of "blocking".
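The general idea of exchanging failure views in synchronous rounds might be sketched as follows. This is a simplified illustration under stated assumptions, not the method of the cited patent: processors crash only before the first round, no messages are lost, and a processor that is silent during a round is taken to have failed. The function name `exchange_rounds` and its parameters are hypothetical.

```python
from typing import Dict, Iterable, Set


def exchange_rounds(processors: Iterable[str],
                    crashed: Set[str],
                    rounds: int) -> Dict[str, Set[str]]:
    """Return, for each operational processor, the set it believes has failed."""
    processors = list(processors)
    # Each operational processor starts out suspecting nobody.
    suspects: Dict[str, Set[str]] = {
        p: set() for p in processors if p not in crashed
    }
    for _ in range(rounds):
        # Every operational processor broadcasts a snapshot of its current view.
        sent = {p: frozenset(s) for p, s in suspects.items()}
        for p in suspects:
            for q in processors:
                if q == p:
                    continue
                if q in sent:
                    # Merge the view received from q into p's own view.
                    suspects[p] |= sent[q]
                else:
                    # Assumed synchrony: silence within a round means failure.
                    suspects[p].add(q)
    return suspects


views = exchange_rounds(["p1", "p2", "p3", "p4"], crashed={"p3"}, rounds=2)
```

Under these assumptions every operational processor ends with the same set of suspected processors. The sketch also shows why the approach depends on synchrony: if an operational processor's message merely arrives late, the `else` branch marks it as failed, and the views can diverge.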
Another strategy for approximating the ideal membership protocol is to weaken feature (2) above, the termination condition, to require that the protocol must eventually terminate instead of terminating in a fixed amount of time. Such a weaker membership protocol is referred to as an asynchronous agreement or consensus protocol, similar to the one described by T. Chandra et al. in "The Weakest Failure Detector for Solving Consensus," Proceedings of the 11th Annual ACM Symposium on Principles of Distributed Computing, 1992, pp. 147-158. A disadvantage of such a consensus protocol is that there is no guarantee on how long the protocol requires to terminate. So, from a practical point of view, there is no guarantee of termination at all. Moreover, in the presence of communication failures (lost messages) that prevent one subgroup of participants from communicating with another subgroup, it is not even possible to guarantee eventual agreement.
The earlier referenced patent application Ser. No. 08/522,651 describes another method that further weakens the membership conditions: it requires neither termination (feature 2) nor accuracy (feature 4) and, for safety (feature 3), requires only that, if the views of two members differ, then neither member is contained in the view of the other. Membership protocols satisfying these much weaker constraints are said to achieve interactive consistency and are referred to as interactive (or collective) consistency protocols. The advantage of interactive consistency protocols is that they usually terminate quickly and achieve both consistency and accuracy, in the sense that a member's view of the current membership usually consists of the set of members with which it could communicate. Their disadvantage is that there are no termination guarantees (i.e., they cannot use a time-out), so a protocol might block forever waiting for a message that will never be sent because the sending member has crashed.
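The weakened safety condition described above can be written as a predicate over pairs of views. The sketch below is illustrative only; the function name and the example views are hypothetical and are not taken from the referenced application.

```python
from itertools import combinations
from typing import Dict, FrozenSet


def satisfies_mutual_exclusion(views: Dict[str, FrozenSet[str]]) -> bool:
    """If two members' views differ, neither may appear in the other's view."""
    for a, b in combinations(views, 2):
        if views[a] != views[b] and (b in views[a] or a in views[b]):
            return False
    return True


# A clean partition: each side sees only itself.
partitioned = {
    "a": frozenset({"a", "b"}), "b": frozenset({"a", "b"}),
    "c": frozenset({"c", "d"}), "d": frozenset({"c", "d"}),
}

# "a" still lists "b", but "b" has moved on to a different view.
one_sided = {"a": frozenset({"a", "b"}), "b": frozenset({"b"})}
```

The `partitioned` views satisfy the condition because the two subgroups exclude each other completely. The `one_sided` views do not, because the views differ while one member still includes the other.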
Still another form of weakened membership protocols, referred to as dynamic uniformity, is described by D. Malki et al. in "Uniform Actions in Asynchronous Distributed Systems," Proceedings of the 13th Annual ACM Symposium on Principles of Distributed Computing, 1994, pp. 274-283. Generally, dynamic uniformity requires that each correctly functioning participant either reaches the same decision as reached by any other participant or is eventually viewed as disabled by others. The main disadvantages with dynamic uniformity protocols are their complexity and possible temporary inconsistency.
Therefore, there is still a need for a membership protocol in a distributed system that is simple and without the above disadvantages. The present invention describes such a protocol subject to an asymmetric safety condition: whenever two members have different views as a result of the proposed membership protocol, at least one regards the other as outside the current membership. In other words, at least one of the two members does not appear as a member identified by the other.
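The asymmetric safety condition can likewise be expressed as a check over pairs of views. This is a minimal sketch of the condition as stated in the preceding paragraph, not of the protocol itself; the function name and the example views are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet


def satisfies_asymmetric_safety(views: Dict[str, FrozenSet[str]]) -> bool:
    """If two members' views differ, at least one must exclude the other."""
    for a, b in combinations(views, 2):
        differ = views[a] != views[b]
        each_includes_other = b in views[a] and a in views[b]
        if differ and each_includes_other:
            return False
    return True


# "a" still lists "b", but "b" regards "a" as outside the membership.
one_sided = {"a": frozenset({"a", "b"}), "b": frozenset({"b"})}

# The views differ, yet each member still lists the other.
conflicting = {"a": frozenset({"a", "b", "c"}), "b": frozenset({"a", "b"})}
```

The `one_sided` views are acceptable under the asymmetric condition because one member excludes the other, although they would fail a requirement that both members exclude each other. The `conflicting` views are rejected because the two members disagree while each still counts the other as a member.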