Various routing schemes for delivery of end-to-end information and data over networks are known. They include broadcast, multicast, unicast and anycast. Such schemes usually attempt to deliver data from one point or node to one or more other points or nodes over a network. For example, broadcasting refers to transmitting an information packet to every node on the network and unicasting refers to transmitting information packets to a single destination node.
Multicast is a protocol for the delivery of information to a group of destinations simultaneously over the network. Generally, multicast protocols attempt to use the most efficient process to deliver messages over each link of the network only once, creating copies only when the paths to the destinations split. One implementation of multicast is Internet Protocol (IP) multicast, where routers create distribution paths for datagrams sent to a multicast destination address, while typically not guaranteeing reliability or delivery latency.
There are also other implementations of the multicast distribution strategy. Another example is Ethernet's multicast frame addresses which allow a single Ethernet frame to be delivered to multiple NICs on the same network segment while only traversing the network once. This is done by setting the destination MAC address not to any specific NIC's address, but to a special set of multicast MAC addresses which the NIC cards that are interested in a particular multicast can select to receive. Ethernet switches may duplicate the multicast frames as needed to every port that has an active NIC behind it (i.e. treat it as broadcast), or they may be configured to duplicate the multicast frame to only certain ports so that only NICs that are interested in the multicast will receive it. In both cases a multicast service can be provided by an Ethernet network without any IP network also existing.
Native multicast service is a multicast service provided by a network to a multicast group. For example, IP multicast service is native to an IP network such as the Internet. IP Multicast can scale to a large receiver population for a small number of simultaneous wide-area groups. The limit to a small number of simultaneous wide-area groups is an architectural limitation of multicast at layer 3 because the state of each group must be continually monitored leading to unsustainable overhead. Multicast utilizes network infrastructure efficiently by requiring the source to send a packet only once, even if it needs to be delivered to a large number of receivers. The routers in the network take care of duplicating the packet to reach multiple receivers only where necessary. IP Multicast utilizes such concepts as IP Multicast group addresses, multicast distribution trees and receiver driven tree creation.
IP Multicast over the Internet, however, suffers from a number of drawbacks. It is susceptible to Internet anomalies and thus unreliable. Moreover, implementation of large-scale services over the Internet via IP Multicast is problematic because it is generally not supported by Internet Service Providers (ISPs) or is only supported within a particular ISP's network and not between that network and other networks on the Internet. Other disadvantages of IP Multicast are that the assignment of group identifiers is not coordinated and that the management overhead associated with supporting many wide-area groups is not scalable.
An overlay network is a computer network that is built on top of another network. Nodes in the overlay can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network. An overlay network can implement different types of protocols at the logical level, including protocols materially different from those implemented at the physical level. The concept of overlay networks is often viewed to include many different systems such as P2P, dial-up modems over the telephone network, or even some types of Content Delivery Networks (CDNs). Usually, the usage of overlay networks may come with a price, for example, in added latency that is incurred due to longer paths created by overlay routing, and by the need to process the messages in the application level by every overlay node on the path. A particular class of overlay networks are herein referred to as Message-Oriented Overlay Networks (MOON). MOON is a specific type of overlay network that maintains control and management over the overlay nodes based on communicated messages. One exemplary Message-Oriented Overlay Network is implemented as the Spines system (www.spines.org), which is available as open source, including messaging services similar to those provided at the Internet level such as reliable and unreliable unicast, but with lower latency. In “Resilient Overlay Networks”, David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek and Robert Morris in Proceedings of the ACM SOSP, 2001, describe another example of Message Oriented Overlay Network called the Resilient Overlay Network (RON) technology (also available at http://nms.csail.mit.edu/ron/).
Reliable point-to-point communication is one of the main utilizations of the Internet, where over the last few decades TCP has served as the dominant protocol. In “Reliable Communication in Overlay Networks”, Yair Amir and Claudiu Danilov, in the Proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN03), San Francisco, June 2003, which is hereby incorporated by reference in its entirety, (Yair Amir, a co-author of the paper and co-inventor of the instant application), describe a MOON that uses hop-by-hop reliability to reduce overlay routing overhead and achieves better performance than standard end-to-end TCP connections deployed on the same overlay network. In “An Overlay Architecture for High Quality VoIP Streams”, Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist, Andreas Terzis, in the IEEE Transactions on Multimedia, 8(6), pages 1250-1262, December 2006, (referred to as [ADGHT06]) which is hereby incorporated by reference in its entirety, algorithms and protocols are disclosed that implement localized packet loss recovery and rapid rerouting in the event of network failures in order to improve performance in VoIP applications that use UDP to transfer data.
Application-layer multicast (ALM), referred in this document also as overlay multicast, has been implemented in overlay networks to provide multicast at the application layer. The principle of ALM is to route and forward multicast data using software running in host nodes (in terms of the underlying network). The multicast data are tunneled through the underlying Internet using unicast transmission, and the participating host nodes replicate and forward these multicast data to other host nodes in the overlay network until the messages reach the destined receiver nodes.
A known ALM protocol is the NICE protocol proposed by Banerjee et al. in “Scalable application layer multicast,” in: Proceedings of ACM SIGCOMM, August 2002. NICE is a tree-based ALM protocol where peers are arranged hierarchically such that every peer receives data from its parent or siblings and forwards the data to its children and siblings. This protocol has been shown to work well in many applications and networks due to its proximity-aware feature and its capability to dynamically adapt the overlay network topology to the changing network conditions. In a publication titled “Parallel overlays for high data-rate multicast data transfer” which became publicly available on line on May 2006, and later published in Computer Networks: The International Journal of Computer and Telecommunications Networking, Vol 51, issue 1, pages 31-42, K. K. To and Jack Y. B. Lee of Department of Information Engineering, of the Chinese University of Hong Kong, disclosed extending the NICE protocol to use multiple parallel overlays in the same ALM session to spread the data traffic across more available network links in video content distribution applications.
Known systems extend the boundaries of IP multicast via overlay networks that connect IP multicast “islands.” One example of performing multicast communication in computer networks by using overlay routing is disclosed in U.S. Pat. No. 7,133,928 issued to McCanne. Two publications entitled “Universal IP Multicast Delivery” are published by Zhang et al. One publication is in Computer Networks, special issue of on Overlay Distribution Structures and Their Applications, April 2006 and the other is in Fourth International Workshop on Networked Group Communication (NGC), October 2002. In a Technical Report dated April 2000, Paul Francis discloses “Yoid: Extending the Internet Multicast Architecture.”
In such systems, remote users participate in the IP multicast through a unicast tunnel when an existing IP multicast network does not reach all of the locations who wanted to be part of the multicast, for example because of network hardware limitations, restricted cross-Autonomous System IP Multicast, or other reasons. In some cases, the overlay network using unicast connects multiple “islands” of IP multicast connectivity so that all of the users would connect through IP multicast and may not even be aware that they were actually connected by an overlay network. The architecture of bridging IP multicast islands through a unicast overlay seeks to extend the boundaries of IP multicast as it currently exists without any mapping of the overlay group identifier to the IP multicast group identifier, as the address in the overlay was the same as the IP multicast address even if it is tunneled over an IP unicast address.
There remains a significant need in the art to provide a managed but widely distributed network capable of transporting and delivering any group of high quality live flows such that each flow has potentially different source and different destination set, at a truly global scale, thus allowing content providers to maintain control over the distribution of their live content. Further, this content needs to be delivered with minimal latency, consistently high quality, with high reliability, and at an attractive cost. With the advances in power of processing units, there exists a commensurate need for a system, method or protocol for scaling reliable real-time or near real time delivery of large amounts of data, such as Standard Definition (SD) and High Definition (HD) video data, as well as interactivity, for example, in video or online gaming, applications. What is needed is a network that supports any-to-any high quality live flows at global scale delivered with high reliability at attractive economics.