Field of the Invention
The present invention generally relates to distributed computing systems, and more specifically, to maintenance of a distributed system membership view.
Background Art
Distributed computing systems are complex aggregations of members or units that communicate with each other through an interconnect in order to achieve some common goal. A distributed system may comprise multiple individual processors linked in a network, or a plurality of software processes or threads operating concurrently in a coordinated environment. In a network configuration, the processors communicate with each other through a network that supports a network protocol. This protocol may be implemented using a combination of hardware and software components. In a coordinated software environment, the software processes are logically connected together through some communication medium such as an Ethernet network. Whether implemented in hardware, software, or a combination of both, the elements of the network are referred to individually as members, and together as a group.
A robust distributed system must take into account the fact that its constituent members may fail or become inaccessible at any time, while the system must continue working with the members that remain available. Typically, each process in a distributed system maintains information, which may be updated, regarding the configuration of the system as a whole. To this end, processes often maintain a “view,” which is a data structure representing the membership of the distributed system (i.e., the set of processes that constitute the system, each process in the view being a member).
A soft-state protocol for the membership of a distributed system is one in which the available members are not hard-coded and known in advance when the system is initialized. Rather, the members themselves make their presence and location known to the others by sending a message containing this information through the interconnect, so that each member discovers the available members at some point in time. Furthermore, each member resends this message once every time period T so that the others know that the originating member is still available. Each member is interested in knowing the other members' availability at a given point in time, so that they can work together to achieve whatever function the distributed system is aimed at. To that end, each member maintains a view of the current membership of the distributed system, formed from the locations and identities in the messages received from other members.
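By way of illustration only, the discovery step described above may be sketched as follows. The names `Announcement`, `View`, and `on_announcement`, as well as the string form of the location, are hypothetical and chosen for this sketch; they are not taken from any particular system.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Announcement:
    """Hypothetical message a member sends once every period T."""
    member_id: str   # identity of the originating member
    location: str    # where the member can be reached (assumed format)


@dataclass
class View:
    """Membership view built only from received announcements."""
    members: Dict[str, str] = field(default_factory=dict)

    def on_announcement(self, msg: Announcement) -> None:
        # A new member is discovered, or a known member's location is
        # refreshed; no member list is configured in advance.
        self.members[msg.member_id] = msg.location


view = View()
view.on_announcement(Announcement("a", "10.0.0.1:7000"))
view.on_announcement(Announcement("b", "10.0.0.2:7000"))
# A periodic resend from "a" leaves the view unchanged.
view.on_announcement(Announcement("a", "10.0.0.1:7000"))
```

Because the view is populated solely from received messages, nothing in this sketch ever removes a member; removal is a separate concern.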
Newly arriving members are added to this view when their messages are received. Members that are no longer available, however, should be removed from the view. Otherwise, the view would not be consistent with the members actually available and, in the long term, would grow without limit (as members may leave the system and rejoin it later with a different location or identity), uselessly consuming resources at each member. However, even if messages are sent periodically, a member cannot conclude that another is no longer part of the system just because no message has arrived for one time period T: the message may have been lost in the interconnect, or the member may be sending messages too slowly due to a high processing load at that time. An explicit message from a member announcing that it is about to leave the system would not solve the problem either, as one must consider the case where the member no longer works properly or cannot contact the others through the interconnect.
In a typical implementation, the view is maintained as follows. When a message from a member arrives, the identity of the sending member is stored in the view together with the time of reception of the message. Periodically, with a given period T, it is verified for every member that its last message was received no more than a given limit number (possibly fractional) of periods ago. If, for a given member, the limit is exceeded, the member is considered to be no longer part of the distributed system. This method requires checking every single member in the view at each period, which is inefficient and may be prohibitively costly for low-resource members (such as sensors) or in systems with a very large membership (such as peer-to-peer networks).
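The typical implementation just described may be sketched as follows, purely as an illustration under stated assumptions. The class name `ScanningView`, its method names, and the use of caller-supplied timestamps in seconds are hypothetical choices made for this sketch.

```python
from typing import Dict, List


class ScanningView:
    """View that is fully scanned once per period (illustrative sketch)."""

    def __init__(self, period: float, limit_periods: float) -> None:
        self.period = period                # the period T, in seconds
        self.limit_periods = limit_periods  # limit, possibly fractional
        self.last_seen: Dict[str, float] = {}

    def on_message(self, member_id: str, now: float) -> None:
        # Store the sender's identity with the time of reception.
        self.last_seen[member_id] = now

    def check(self, now: float) -> List[str]:
        # Examine every member in the view: cost grows linearly with
        # the size of the membership, and is paid at each period.
        max_age = self.limit_periods * self.period
        expired = [m for m, t in self.last_seen.items() if now - t > max_age]
        for m in expired:
            del self.last_seen[m]
        return expired


scan = ScanningView(period=1.0, limit_periods=2.5)
scan.on_message("a", now=0.0)
scan.on_message("b", now=2.0)
removed = scan.check(now=3.0)  # "a" was last heard 3.0 periods ago
```

The list comprehension in `check` visits every entry regardless of how recently each member was heard from, which is the per-period cost noted above.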