When an application is distributed across a cluster of computer nodes, there must be some coordination between cluster membership changes and the inter-node messages issued by the distributed application. The main goal of such coordination is to prevent nodes that have recently left or joined the cluster membership from issuing messages to nodes in the cluster that have not yet had a chance to process the membership change. The coordination is needed to ensure a consistent and correct execution of the application in the computer cluster. Without this coordination, the distributed application may execute less efficiently or, in the worst case, incorrectly. In most existing cluster systems, the coordination is achieved by integrating a cluster membership manager with an inter-node messaging system. Such a system flushes all inter-node messages that are in transit between the cluster nodes when the cluster membership changes. It then processes the cluster membership change while halting all inter-node communication, and re-enables inter-node communication once all the nodes have processed and accepted the new cluster membership. This technique is known in the technical field as Virtual Synchrony.
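The flush, halt, and resume sequence described above can be illustrated with a minimal sketch. The class name ToyCluster and its methods are hypothetical and are not taken from any of the cited systems. The sketch assumes a single process that stands in for the whole cluster, and it assumes that "flushing" means discarding the in-flight messages; an actual system might instead deliver them before the membership change is installed.

```python
class ToyCluster:
    """Hypothetical single-process model of a membership manager that is
    integrated with the inter-node messaging system."""

    def __init__(self, names):
        view = frozenset(names)
        # The membership view that each node has accepted so far.
        self.node_views = {name: view for name in view}
        # Messages sent but not yet delivered: (sender, receiver, payload).
        self.in_flight = []
        self.halted = False

    def send(self, sender, receiver, payload):
        # No new inter-node communication while a change is in progress.
        if self.halted:
            raise RuntimeError("messaging halted during membership change")
        self.in_flight.append((sender, receiver, payload))

    def change_membership(self, new_names):
        new_view = frozenset(new_names)
        # Halt all inter-node communication, cluster wide.
        self.halted = True
        # Flush the in-flight messages (assumption: they are discarded).
        flushed = len(self.in_flight)
        self.in_flight.clear()
        # Every remaining node processes and accepts the new membership.
        self.node_views = {name: new_view for name in new_view}
        # Re-enable messaging only when all nodes agree on the new view.
        if all(view == new_view for view in self.node_views.values()):
            self.halted = False
        return flushed


cluster = ToyCluster(["a", "b", "c"])
cluster.send("a", "b", "hello")
cluster.send("c", "a", "status")
flushed = cluster.change_membership(["a", "b"])  # node "c" leaves
```

In this sketch the message from "a" to "b" is flushed even though neither node is involved in the change, which is the cluster-wide cost discussed below.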
A well-known virtual synchrony system is described by K. Birman et al. in the publication entitled “Reliable Communication in the Presence of Failures,” ACM Transactions on Computer Systems, Vol. 5, No. 1, Feb. 1987. A main disadvantage of this system is that it requires a costly cluster-wide synchronization of all inter-node communications with each membership change. The system also halts all inter-node communications until the new cluster membership is accepted by all of the cluster nodes. This requirement significantly affects the performance of any distributed application running in the cluster. In addition, the system is intended mainly for handling multicast communication across the cluster, rather than unicast messaging between cluster nodes. As in other virtual synchronous systems, membership changes must be synchronized across the whole cluster. In-flight messages are flushed from communication links and buffers whenever there is a change. Furthermore, new communications among the nodes are halted until the new membership is confirmed by every member node in the cluster. These requirements impose significant disruption and delay on the communication between nodes that are not involved in the membership change.
Virtual synchronous systems are particularly inefficient for distributed applications in which a large cluster is maintained in the membership but relatively independent distributed tasks run only in subsets of this membership, i.e., its sub-clusters. This distribution results in inter-node communication patterns that are mostly restricted to sub-clusters, yet a membership change anywhere in the cluster still flushes and halts the communication within every sub-cluster, including those that the change does not affect.
Another method for filtering stale messages associated with the nodes affected by a cluster membership change is described by Sreenvasan et al. in “Maintaining Membership in High Availability Systems,” U.S. patent application Ser. No. 20020049845A1. The Sreenvasan method uses receiver-based filtering to eliminate stale membership messages based on incarnation numbers embedded in the inter-node messages. It thus performs the message filtering at the cluster membership daemon level rather than at the application level.
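Receiver-based filtering on incarnation numbers can be sketched generically as follows. This is an illustration of the general idea only and is not the method of the cited application, whose details are not reproduced here. The names Message and Receiver, the rule that a lower incarnation number marks a message as stale, and the choice to drop messages from unknown senders are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    incarnation: int  # incarnation number embedded in the message
    payload: str


class Receiver:
    """Hypothetical receiver that drops messages carrying an incarnation
    number older than the latest one it knows for the sender."""

    def __init__(self):
        self.known = {}     # sender -> highest incarnation number known
        self.accepted = []  # messages that passed the filter

    def on_membership_update(self, sender, incarnation):
        # Record the newest incarnation number reported for this sender.
        self.known[sender] = max(incarnation, self.known.get(sender, 0))

    def receive(self, msg):
        current = self.known.get(msg.sender)
        # Assumption: unknown senders and older incarnations are stale.
        if current is None or msg.incarnation < current:
            return False
        self.accepted.append(msg)
        return True


receiver = Receiver()
receiver.on_membership_update("n1", 1)
receiver.receive(Message("n1", 1, "first"))   # accepted
receiver.on_membership_update("n1", 2)        # n1 has rejoined
receiver.receive(Message("n1", 1, "stale"))   # filtered out
```

The filter runs entirely at the receiving node, so no cluster-wide halt is needed to reject the stale message.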
Therefore, there remains a need for a system and method for filtering stale messages caused by membership changes in a distributed computing environment without the above drawbacks of prior art systems.