An overlay network is a computer network that is built on top of another network. Nodes in the overlay can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network. An overlay network can implement different types of protocols at the logical level, including protocols materially different from those implemented at the physical level. The concept of overlay networks is often understood to include many different systems, such as P2P networks or dial-up modems over the telephone network. The use of overlay networks usually comes at a price, for example, added latency that is incurred due to the longer paths created by overlay routing and the need for every overlay node on the path to process the messages at the application level.
A particular class of overlay networks is the Message-Oriented Overlay Network (MOON). A MOON is a specific type of overlay network that maintains control and management over the overlay nodes based on communicated messages. A MOON provides network services that manipulate the messages which pass through the overlay network to improve the reliability, latency, jitter, routing, or other network properties, or to add new network capabilities and services. One example of a MOON is the Spines system (www.spines.org), which is available as open source and includes messaging services similar to those provided at the Internet level, such as reliable and unreliable unicast, but with lower latency. It also includes services not practically available at the Internet level, such as soft real-time unicast and semi-reliable multicast. The Spines system supports multiple flows over a single overlay network, each with its own set of sender and receiver nodes.
Resilient Overlay Network (RON) (available at http://nms.csail.mit.edu/ron/) is another example of a MOON, as disclosed in “Resilient Overlay Networks” by David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek and Robert Morris in Proceedings of the ACM SOSP, 2001. RON provides better connectivity (more resilient routing) by continuously monitoring each overlay site. If there is direct connectivity on the underlying network (the Internet in the case of RON), then the message is sent directly using a single overlay hop. Otherwise, RON uses two overlay hops to pass messages between overlay sites that are not connected directly by the Internet, thus providing better connectivity between sites than that achieved by the native Internet.
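The one-hop or two-hop choice described above can be illustrated with a minimal sketch. It is not RON's actual implementation: the function name, the `sites` list, and the `link_up` set of currently reachable site pairs are all hypothetical, and the sketch simply takes the first working intermediate site rather than ranking candidates.

```python
def choose_overlay_path(src, dst, sites, link_up):
    """Return [src, dst] if the direct path works, else a two-hop detour, else None.

    link_up is a hypothetical set of (a, b) pairs that monitoring currently
    reports as connected on the underlying network.
    """
    # Direct connectivity: a single overlay hop is enough.
    if (src, dst) in link_up:
        return [src, dst]
    # Otherwise look for one intermediate site reachable from src
    # that can in turn reach dst (two overlay hops).
    for mid in sites:
        if mid not in (src, dst) and (src, mid) in link_up and (mid, dst) in link_up:
            return [src, mid, dst]
    return None


sites = ["A", "B", "C"]
link_up = {("A", "B"), ("B", "C")}  # A cannot reach C directly
print(choose_overlay_path("A", "C", sites, link_up))
```

In this example the direct pair ("A", "C") is absent from `link_up`, so the message would be relayed through site B.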
In “Reliable Communication in Overlay Networks”, in the Proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN03), San Francisco, June 2003, which is hereby incorporated by reference in its entirety, Yair Amir and Claudiu Danilov (Yair Amir being a co-author of the paper and a co-inventor of the instant application) describe a MOON that uses hop-by-hop reliability to reduce overlay routing overhead and achieve better performance than standard end-to-end TCP connections deployed on the same overlay network. More specifically, in the disclosed MOON, intermediate overlay nodes handle reliability and congestion control only for the links to their immediate neighbors and do not keep any state for individual flows in the system. Packets are forwarded and acknowledged per link, regardless of their originator. This implementation of MOON recovers losses only on the overlay hop on which they occurred, localizing the congestion and enabling faster recovery. Since an overlay link has a lower delay than an end-to-end connection that traverses multiple hops, losses can be detected faster and the missed packet can be resent locally. Moreover, the congestion control on the overlay link can grow the congestion window back faster than an end-to-end connection can, as the link has a smaller round-trip time. Hop-by-hop reliability involves buffers and processing in the intermediate overlay nodes. The overlay nodes deploy a reliable protocol and keep track of packets, acknowledgments and congestion control, in addition to their regular routing functionality, thereby allowing congestion to be identified at the overlay network level.
In “An Overlay Architecture for High Quality VoIP Streams”, Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist, Andreas Terzis, in the IEEE Transactions on Multimedia, 8(6), pages 1250-1262, December 2006, (referred to as [ADGHT06]) which is hereby incorporated by reference in its entirety, algorithms and protocols are disclosed that implement localized packet loss recovery between a sender node and a receiver node and rapid rerouting in the event of network failures in order to improve performance in VoIP applications that use UDP to transfer data. The disclosed packet loss recovery detects out-of-order arrival of sequenced packets at the receiver node. Upon detection, the receiver node immediately transmits to the sender node a single request for retransmission of the packet(s) that, based on the out-of-order arrival, are suspected of being lost. Upon receiving the request for retransmission, the sender node immediately retransmits the requested packets. The algorithms are deployed on the routers of an application-level overlay network and have been shown to yield voice quality on par with the PSTN. Similar ideas were expressed in “1-800-OVERLAYS: Using Overlay Networks to improve VoIP quality” by the same authors in the Proceedings of the International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), pages 51-56, Skamania, Wash., 2005 (referred to as [ADGHT05]).
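The gap-triggered recovery described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the protocol of [ADGHT06]: the class names `LinkSender` and `GapDetectingReceiver`, the in-memory history, and the rule that each missing sequence number is requested only once are all hypothetical choices made for the example.

```python
class LinkSender:
    """Sender side of one overlay link; keeps recent packets for retransmission."""

    def __init__(self):
        self.history = {}  # seq -> payload (unbounded here; a real node would trim it)

    def send(self, seq, payload):
        self.history[seq] = payload
        return (seq, payload)

    def on_request(self, seqs):
        # Retransmit immediately whatever is still held in the history.
        return [(s, self.history[s]) for s in seqs if s in self.history]


class GapDetectingReceiver:
    """Receiver side of one overlay link."""

    def __init__(self):
        self.next_expected = 0  # one past the highest sequence number seen so far
        self.requested = set()  # numbers already requested (requested only once)
        self.seen = set()       # numbers already delivered, to drop duplicates
        self.delivered = []     # (seq, payload) in arrival order

    def on_packet(self, seq, payload):
        """Process one arrival; return the sequence numbers to request."""
        if seq in self.seen:
            return []
        to_request = []
        if seq >= self.next_expected:
            # Every number skipped over is suspected lost.
            for missing in range(self.next_expected, seq):
                if missing not in self.requested:
                    self.requested.add(missing)
                    to_request.append(missing)
            self.next_expected = seq + 1
        self.seen.add(seq)
        self.delivered.append((seq, payload))
        return to_request
```

For example, if packets 0, 1 and 3 arrive in that order, `on_packet(3, ...)` returns `[2]`, and passing that list to `LinkSender.on_request` yields the retransmitted copy of packet 2.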
Overlay networks are being used for reliable real-time or near real-time delivery of large amounts of data, such as Standard Definition (SD) and High Definition (HD) video data as well as live and interactive video and online gaming, among other applications. Various routing schemes for delivery of end-to-end information and data over an overlay network are known. They include broadcast, multicast, unicast and anycast. For example, broadcasting refers to transmitting an information packet to every node on the overlay network, and unicasting refers to transmitting packets to a single receiver node. Multicasting refers to the delivery of information to a group of destinations simultaneously over the overlay network.
Reliable point-to-point communication is one of the main uses of the Internet, where over the last few decades TCP has served as the dominant protocol. Developers often use TCP connections to realize reliable point-to-point communication in distributed systems. Over the Internet, reliable communication is performed end-to-end in order to address the severe scalability and interoperability requirements of a network in which potentially every computer on the planet could participate. Thus, all the work required in a reliable connection is distributed only to the two end nodes of that connection, while intermediate nodes route packets without keeping any information about the individual packets they transfer.
Many mechanisms for increasing the reliability of packet transmissions are known. They can generally be characterized as protocols that require either additional latency or additional bandwidth in exchange for reliability. Protocols that increase latency are most commonly retransmission protocols involving a sender node and a receiver node, where either the receiver node sends a retransmission request to the sender node upon determining that one or more packets have been lost, or the sender node fails to receive a positive acknowledgment from the receiver node and retransmits one or more transmitted packets. Retransmissions of transmitted packets also use additional bandwidth, but only when a packet loss is indicated. Conventional packet recovery protocols incur additional latency because the lost packets cannot be delivered until the packet losses are first detected, requests for retransmission of the lost packets are sent, and the retransmitted packets are received.
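The latency cost of retransmission can be made concrete by adding up the three steps named above. The function below is only a back-of-the-envelope sketch: its name and parameters are hypothetical, and it assumes a symmetric one-way delay, zero processing time, and that neither the request nor the retransmission is itself lost.

```python
def retransmission_arrival_ms(one_way_ms, detect_ms):
    """Time after the original send at which a retransmitted copy arrives.

    one_way_ms: assumed one-way delay between sender and receiver.
    detect_ms:  assumed delay between when the packet should have arrived
                and when the receiver decides it was lost.
    """
    loss_noticed = one_way_ms + detect_ms         # receiver notices the loss
    request_received = loss_noticed + one_way_ms  # request travels back to the sender
    return request_received + one_way_ms          # retransmitted packet travels forward


print(retransmission_arrival_ms(50, 20))  # end-to-end path with 50 ms one-way delay
print(retransmission_arrival_ms(10, 20))  # single overlay link with 10 ms one-way delay
```

Under these assumptions, a packet recovered over a 50 ms path arrives 170 ms after it was first sent, instead of 50 ms. The same arithmetic over a 10 ms overlay link gives 50 ms, which is why recovering a loss on the hop where it occurred is faster than recovering it end-to-end.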
Forward Error Correction (FEC) is another known method for packet recovery, in which the sender node adds redundant information, known as an error-correction code, to transmitted packets. This allows the receiver node to detect and correct errors (within some bound) without the need to ask the sender node for additional data. The advantages of forward error correction are that a feedback channel is not required and that retransmission of data can often be avoided, at the cost of higher average bandwidth requirements. For this reason, FEC is usually applied in situations where retransmissions are relatively costly or impossible. However, FEC requires additional bandwidth for sending the redundant information at the time of original transmission (or later) to enable the receiver node to reconstruct the transmitted packets even when portions of transmitted packets or entire transmitted packets are lost.
One type of FEC uses erasure codes to divide the transmitted packets into a number of blocks. If a specified fraction of these blocks arrives at the receiver node, then the received blocks can be decoded to reconstruct the originally transmitted packets. The bandwidth overhead of FEC depends on the specific codes used and on certain parameters that specify a level of redundant data and a threshold for decoding the original message. Another type of redundant data transmission is sending packets over multiple non-overlapping network paths. Then, if a packet is lost on one path, it may arrive on another path.
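As a minimal illustration of an erasure code, the sketch below uses a single XOR parity block, which is the simplest such code and not necessarily one used by any system cited here. The function names are hypothetical, all blocks are assumed to have equal length, and the decoder is told which block is missing (represented by `None`). With k data blocks the bandwidth overhead is 1/k, and any k of the k+1 transmitted blocks suffice to decode.

```python
from functools import reduce


def xor_bytes(a, b):
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(blocks):
    """Return the k data blocks followed by one XOR parity block."""
    parity = reduce(xor_bytes, blocks)
    return list(blocks) + [parity]


def decode(received, k):
    """Recover the k data blocks from k+1 entries, where a lost entry is None."""
    missing = [i for i, b in enumerate(received) if b is None]
    if len(missing) > 1:
        raise ValueError("single parity can repair at most one lost block")
    received = list(received)
    if missing:
        # The XOR of all k+1 blocks is zero, so the missing block
        # equals the XOR of the blocks that did arrive.
        present = [b for b in received if b is not None]
        received[missing[0]] = reduce(xor_bytes, present)
    return received[:k]


data = [b"abcd", b"efgh", b"ijkl"]
coded = encode(data)
coded[1] = None  # simulate losing the second block in transit
print(decode(coded, k=3) == data)
```

Stronger codes tolerate more than one loss per group at the price of more redundant blocks, which is the overhead and decoding-threshold trade-off described above.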
Many applications require packets to be received in a timely manner. These include, but are not limited to, voice over IP, streaming video, interactive streaming media, and networked games. Real-time delivery of messages requires meeting very strict deadlines. Real-time messages must be delivered to an end receiver node before the data needs to be displayed, played, or acted upon, which imposes an end-to-end deadline for message delivery. The end-to-end deadline, in turn, imposes a deadline for packet transmission across each hop (i.e., link) in a message's path. In a deadline-driven packet delivery model, a transmitted packet is equivalent to a lost packet if it arrives at its destination after the required deadline. Packets transmitted over network links may be lost because of signal corruption in transit, congestion along the link, or buffer overflows in queues along the path.
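The deadline-driven model can be illustrated with a short sketch that counts late packets as lost. The function name and the representation of arrivals (a one-way delay in milliseconds, or `None` for a packet that never arrived) are assumptions made for the example.

```python
def effective_loss_rate(delays_ms, deadline_ms):
    """Fraction of packets that are lost or arrive after the deadline.

    delays_ms: one entry per transmitted packet; the measured delay in
               milliseconds, or None if the packet never arrived.
    """
    if not delays_ms:
        raise ValueError("no packets to evaluate")
    # A packet that arrives after the deadline is as useless as one
    # that never arrives, so both are counted the same way.
    missed = sum(1 for d in delays_ms if d is None or d > deadline_ms)
    return missed / len(delays_ms)


# Five packets: one dropped in the network, one delivered too late.
print(effective_loss_rate([20, 35, None, 120, 60], deadline_ms=100))
```

In this example only one packet is dropped by the network, but two of the five miss a 100 ms deadline, so the application sees twice the loss rate that the network reports.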
Therefore, there exists a need for a system and a method that efficiently use resources to increase the probability of messages arriving at receiver nodes on time, while providing the flexibility of balancing reliability relative to bandwidth and CPU cost.