A cluster of computerized servers may be configured to be replicas of one another, by executing replicated instances of an application, such as a computer program, and providing the same input to the instances (i.e., input streams of all instances have the same messages and ordering thereof). A cluster of replicated servers is depicted in US Patent Publication 2008/0310444A1 entitled “Group Communication System Achieving Efficient Total Order and State Synchronization in a Multi-tier Environment”, which is hereby incorporated by reference.
Replicated servers may be aimed at providing high availability of a service provided by the application. All the servers run the same application, process the same input, maintain the same state, and produce the same output. If one of the servers fails, the application remains available as long as at least one server remains functional. In some cases, servers may join the cluster during its operation, and thus after failing, a server may reboot and join the cluster. Joining the cluster may require synchronizing the state of the application to that of the cluster.
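By way of a non-limiting illustration, the replication property described above may be sketched as follows. The sketch assumes a deterministic application; the names `CounterApp` and `run_replicas` are hypothetical and are introduced here for illustration only.

```python
class CounterApp:
    """Hypothetical deterministic application: its state is a running total."""

    def __init__(self):
        self.total = 0
        self.outputs = []

    def process(self, message: int) -> None:
        # The same input applied in the same order yields the same state and output.
        self.total += message
        self.outputs.append(self.total)


def run_replicas(messages, replica_count=3):
    """Feed an identical, identically ordered input stream to every replica."""
    replicas = [CounterApp() for _ in range(replica_count)]
    for message in messages:
        for replica in replicas:
            replica.process(message)
    return replicas


replicas = run_replicas([5, 7, -2])
# Collect the distinct (state, output) pairs observed across the replicas.
states = {(r.total, tuple(r.outputs)) for r in replicas}

# Simulate the failure of one server: the remaining replicas hold the same state.
survivors = replicas[1:]
```

In this sketch, any surviving replica can continue to provide the service because each one holds the same state as the failed replica.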
The cluster may have a server acting as a “leader”. The term “leader”, in the present specification, should not be construed literally. A leader server may be in charge of determining a message processing order for the cluster, and all servers of the cluster process the messages according to the determined order. A leader may also be in charge of managing the cluster, such as by providing the state of the application to new servers so as to enable them to join the cluster. The role of the leader may be divided among different servers, but for the sake of simplicity, and without loss of generality, the present disclosure relates to a single leader.
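As a minimal sketch of the ordering role of the leader, and not as a description of any particular protocol, the leader may be modeled as assigning a sequence number to each message, with every server processing messages by that number regardless of its own arrival order. The function names and message identifiers below are hypothetical.

```python
def leader_assign_order(arrival_at_leader):
    """Leader fixes the processing order: message id -> sequence number."""
    return {msg_id: seq for seq, msg_id in enumerate(arrival_at_leader)}


def process_in_leader_order(received, order):
    """A server sorts the messages it received by the leader-determined order."""
    return sorted(received, key=lambda msg_id: order[msg_id])


# Assumed example: the leader saw "b" first, while another server saw "a" first.
order = leader_assign_order(["b", "a", "c"])
server_view = process_in_leader_order(["a", "c", "b"], order)
```

Here `server_view` follows the order chosen by the leader rather than the local arrival order, so all servers process the messages identically.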
A message that is transmitted to the cluster is transmitted to each server of the cluster. However, for various reasons, such as congestion or source failure, some messages may be lost. Message loss presents a problem: in order to support transparent failover, the replicated instances of the application at all the servers should maintain the same state, and thus must process exactly the same set of messages in the same order. In case one server encounters message loss from one of the sources, it cannot continue delivering messages to the application, since this may lead to an inconsistent state. In some exemplary cases, the server may attempt to recover the message or suspend itself from the cluster (and possibly attempt to regain synchronization later and rejoin the cluster). Since message loss can happen independently at each server, all the servers in the cluster may be forced to suspend themselves due to message loss from which they cannot recover. This leads to a failure of the application, since the cluster is no longer able to run the application on any server.
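The failure mode described above may be illustrated by the following sketch. It assumes, for illustration only, that each source numbers its messages consecutively so that a server can detect a gap; the class `ReplicaServer` and its fields are hypothetical, and recovery is deliberately omitted.

```python
class ReplicaServer:
    """Hypothetical server that suspends itself on an unrecoverable gap."""

    def __init__(self):
        self.next_seq = {}      # source -> next expected sequence number
        self.delivered = []     # messages delivered to the application
        self.suspended = False

    def on_message(self, source, seq, payload) -> bool:
        if self.suspended:
            return False
        expected = self.next_seq.get(source, 0)
        if seq < expected:
            return False        # duplicate of an already delivered message
        if seq > expected:
            # A message from this source was lost; delivering further
            # messages could make the state inconsistent, so suspend.
            self.suspended = True
            return False
        self.next_seq[source] = expected + 1
        self.delivered.append((source, seq, payload))
        return True


server_a, server_b = ReplicaServer(), ReplicaServer()

# Server A loses message 1; server B independently loses message 2.
for seq in (0, 2):
    server_a.on_message("s1", seq, f"m{seq}")
for seq in (0, 1, 3):
    server_b.on_message("s1", seq, f"m{seq}")

cluster_available = any(not s.suspended for s in (server_a, server_b))
```

In this example both servers suspend themselves because of independent losses, so `cluster_available` is false even though every message was received by at least one server.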
Reliable messaging protocols, such as that described in T. Speakman et al., “PGM Reliable Transport Protocol Specification”, RFC 3208 (2001), which is hereby incorporated by reference, may be used to moderate the risk of message loss; however, message loss may still occur. The prior art discloses a variety of mechanisms to overcome message loss problems.
One solution is the use of a Guaranteed Messaging Service (GMS), such as described in Roger Barga, David Lomet, Gerhard Weikum, “Recovery Guarantees for General Multi-Tier Applications”, in Proceedings of the International Conference on Data Engineering (ICDE), 2002, which is hereby incorporated by reference. The GMS provides persistent message storage, whereby messages that were not received by a server can be retrieved from the storage. However, a GMS incurs a significant overhead that slows down the entire system. Hence, such a solution may not be useful in certain configurations, such as high-availability, high-throughput, low-latency systems.
An additional solution is the use of a Group Communication System (GCS). Some GCSs log messages, and if a server encounters message loss, the lost messages are retrieved from one of the other servers that logged them. Various GCSs are disclosed in Chockler et al., “Group Communication Specifications: A Comprehensive Study,” ACM Computing Surveys 33:4 (2001), which is hereby incorporated by reference. However, a GCS incurs overhead associated with logging messages and distributing the messages between the servers in the cluster. Hence, such a solution may not be useful in certain configurations, such as high-availability, high-throughput, low-latency systems.
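The log-based retrieval described above may be sketched, in simplified form, as follows. The sketch is an assumption made for illustration and does not reproduce any specific GCS; the class `LoggingServer` and its methods are hypothetical. It also makes the overhead visible: every server stores every message it receives.

```python
class LoggingServer:
    """Hypothetical server that logs every received message by sequence number."""

    def __init__(self):
        self.log = {}  # sequence number -> payload

    def receive(self, seq, payload):
        self.log[seq] = payload

    def missing(self, up_to):
        """Sequence numbers below `up_to` that this server never received."""
        return [seq for seq in range(up_to) if seq not in self.log]

    def recover_from(self, peer, up_to):
        """Copy lost messages from a peer's log; return what is still missing."""
        for seq in self.missing(up_to):
            if seq in peer.log:
                self.log[seq] = peer.log[seq]
        return self.missing(up_to)


server_a, server_b = LoggingServer(), LoggingServer()
for seq in (0, 1, 2):
    server_a.receive(seq, f"m{seq}")
for seq in (0, 2):              # server B loses message 1
    server_b.receive(seq, f"m{seq}")

lost_before = server_b.missing(3)
still_missing = server_b.recover_from(server_a, 3)
```

After the call to `recover_from`, server B holds the message it lost, at the cost of both servers maintaining a log and exchanging messages with one another.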