1. Field of the Invention
The present invention relates to data distribution systems and methods and, more particularly, to a reliable multicast message delivery system and method for achieving reliability while maintaining performance efficiencies in terms of low delivery latency, low network traffic, and low computational load.
2. Discussion of the Prior Art
In a multicast message delivery system, there typically exist one or more multicast groups, each multicast group having many senders and receivers exchanging information. Multicast is considered reliable if every message from a sender is delivered to all receivers in the group, even if some receivers are temporarily unavailable due to loss of connectivity or crashes.
Prior art multicast message delivery systems have either sacrificed performance to guarantee reliability or sacrificed reliability for better performance. A log-based receiver-reliable multicast (“LBRM”) system, such as that described in H. Holbrook, S. Singhal, and D. Cheriton, “Log-based Receiver-reliable Multicast for Distributed Interactive Simulation”, Proceedings of ACM SIGCOMM '95, 1995, provides reliable multicast by logging all messages on a dedicated logger or a hierarchy of loggers. A receiver, upon detecting a missing message, can contact a logger to recover the message. Unfortunately, the loggers in LBRM are bottlenecks in the system. At high message send rates, the loggers experience high latency, high network traffic, and high CPU load, and consequently slow down the entire system, limiting its performance.
An alternative prior art approach is to spread the logging capability across all members of a multicast group. A Bimodal multicast (“Pbcast”) system, such as that described in K. P. Birman, M. Hayden, O. Ozkasap, et al., “Bimodal Multicast”, ACM Transactions on Computer Systems, 17(2):41–88, May 1999, implements randomized gossip to approximate this idea. In this approach, each member of the multicast group buffers received messages for some time. Instead of always asking dedicated loggers for repairs, a Pbcast member selects random neighbors in the same multicast group and asks these neighbors to repair missing messages from their buffers. Given an infinite amount of buffering space at each member, this approach would give the best performance. However, finite space forces each member to garbage collect buffered messages, so there is no guarantee that a lost message can always be recovered by gossiping with neighbors.
Various optimizations have been proposed to lengthen the buffering time; however, these optimizations only increase the probability of delivering messages to all members rather than guaranteeing it.
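The limitation of buffer-based repair described above can be illustrated with a minimal sketch. It is not taken from the Pbcast reference; the names `Member`, `gossip_repair`, `capacity`, and `fanout` are hypothetical, and the garbage collection policy (drop the oldest message once the buffer is full) is assumed purely for illustration.

```python
import random
from collections import OrderedDict


class Member:
    """Hypothetical group member that buffers recently received messages."""

    def __init__(self, capacity):
        self.capacity = capacity          # assumed finite buffer size
        self.buffer = OrderedDict()       # sequence number -> payload, oldest first

    def receive(self, seq, payload):
        self.buffer[seq] = payload
        # Finite space: garbage collect the oldest buffered messages.
        while len(self.buffer) > self.capacity:
            self.buffer.popitem(last=False)

    def repair(self, seq):
        """Return the buffered payload, or None if it was garbage collected."""
        return self.buffer.get(seq)


def gossip_repair(seq, neighbors, fanout, rng):
    """Ask `fanout` randomly chosen neighbors to repair message `seq`."""
    for neighbor in rng.sample(neighbors, fanout):
        payload = neighbor.repair(seq)
        if payload is not None:
            return payload
    return None  # no contacted neighbor still holds the message


rng = random.Random(0)
neighbors = [Member(capacity=3) for _ in range(4)]
for seq in range(1, 6):                   # every neighbor receives messages 1..5
    for member in neighbors:
        member.receive(seq, f"msg-{seq}")

recent = gossip_repair(5, neighbors, fanout=2, rng=rng)  # still buffered
old = gossip_repair(1, neighbors, fanout=2, rng=rng)     # already collected
```

In this sketch a recently sent message can be repaired from any neighbor, whereas a message older than the buffer window cannot be repaired by any of them, which is the loss of guarantee noted above.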
It would be highly desirable to provide a hybrid multicast system that merges the ideas of logging messages and of using random neighbors to repair messages, and that further provides a strong reliable delivery guarantee as in LBRM while, in most cases, preserving much of the performance advantage of a Pbcast system.
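One way to picture such a hybrid is a repair path that first consults neighbor buffers and falls back to a logger only when the neighbors cannot help. The sketch below is an assumption made for illustration and is not asserted to be the mechanism of the present invention; `Logger`, `hybrid_repair`, and the dictionary-based neighbor buffers are hypothetical names and structures.

```python
class Logger:
    """Hypothetical dedicated logger that retains every message."""

    def __init__(self):
        self.log = {}

    def record(self, seq, payload):
        self.log[seq] = payload

    def repair(self, seq):
        return self.log.get(seq)


def hybrid_repair(seq, neighbor_buffers, logger):
    """Try neighbor buffers first; fall back to the logger if they miss.

    Returns a (payload, source) pair, where source names who repaired it.
    """
    for buffered in neighbor_buffers:
        if seq in buffered:
            return buffered[seq], "neighbor"
    payload = logger.repair(seq)
    if payload is not None:
        return payload, "logger"
    return None, "unrecovered"


logger = Logger()
for seq in range(1, 6):
    logger.record(seq, f"msg-{seq}")

# Assumed state: neighbors only retain the two most recent messages.
neighbor_buffers = [{4: "msg-4", 5: "msg-5"}, {4: "msg-4", 5: "msg-5"}]

fast_path = hybrid_repair(5, neighbor_buffers, logger)
slow_path = hybrid_repair(1, neighbor_buffers, logger)
```

Under these assumptions the common case is served by neighbors without touching the logger, and the logger is contacted only for messages that the neighbors have already garbage collected.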