It is known to transmit live flows of information or content, e.g., linear television programming and event videos, over satellite networks. Satellite networks are architected to provide one-to-many connectivity. A single television channel, for example, can be uplinked from a ground-based antenna to a satellite transponder and then broadcast to any number of ground-based receiver antennas in the footprint of that satellite's signal. This method has served the television industry for over three decades and has economic appeal to content providers who have a large group of receivers that all want the same content at the same time. However, several issues are not addressed well by this architecture. For example, content that is customized to different sub-groups of receivers requires dedicated transponder space for each unique flow, which in turn makes this architecture economically unattractive for those content providers. Also, satellites are not architected for two-way flows, which reduces their appeal to interactive content providers. Lastly, for a variety of reasons, live video content has grown exponentially over the last several years. These content providers need a system that has the flexibility to add or drop new content quickly, easily and inexpensively, and that can economically deliver this content only to those receivers that specifically request it. Satellite architecture does not provide the flexibility, timeliness or cost structure to support these content providers.
The Internet has seen an explosion in video traffic over the last several years. While most of the video traffic available over the Internet has been non-live short-form videos, such as those available on sites like YouTube, there has been a growth in live streaming over the Internet. A live source of video, like a standalone digital camera or a cell phone with a built-in camera, can upload a live video to a media server site. Users with compatible media players can request the video from the server by using the World Wide Web and typing in the appropriate URL from a browser or media player. The media server then streams the live video to the user. Professional content providers, however, have not adopted Internet streaming in a significant way, for several reasons. First, the architecture used by the Internet is unicast, which means each user gets a unique stream that traverses the network from the server to the user. For content with a large audience, this is prohibitively expensive in bandwidth. Second, the Internet has poor reliability and consistency in terms of delay, packet loss and jitter, all of which can substantially degrade the quality of the delivered video. Media players use large buffers to smooth out these artifacts, which (a) cannot handle long-lived issues, and (b) add significant latency to a live stream.
Dedicated point-to-point links, for example those over optical fiber, are used to carry professional live video content from a source location, like a studio or a stadium, to a broadcasting site. These networks have the advantage of consistent performance and low latency, thus enabling them to deliver high quality video with little or no degradation. However, they have the disadvantage of being point-to-point transport methods and therefore cannot economically deliver these live flows to multiple locations.
Various routing schemes for delivery of end-to-end information and data over a network are known. They include broadcast, multicast, unicast and anycast. Such schemes usually attempt to deliver data from one point or node to one or more other points or nodes over a network. For example, broadcasting refers to transmitting an information packet to every node on the network, and unicasting refers to transmitting information packets to a single destination node.
Multicast is a protocol for the delivery of information to a group of destinations simultaneously over the network. Generally, multicast protocols attempt to use the most efficient process to deliver messages over each link of the network only once, creating copies only when the paths to the destinations split. One implementation of multicast is at the Internet Protocol (IP) routing level, where routers create distribution paths for datagrams sent to a multicast destination address, while typically not guaranteeing reliability or delivery latency. But there are also other implementations of the multicast distribution strategy.
IP Multicast is a technique for one-to-many communication over an IP infrastructure. It can scale to a large receiver population for a small number of wide-area groups. The limit to a small number of wide-area groups is an architectural limitation of multicast at layer 3, because the state of each group must be continually monitored, leading to unsustainable overhead. The sender does not need to know the identity of the receivers. Multicast utilizes network infrastructure efficiently by requiring the source to send a packet only once, even if it needs to be delivered to a large number of receivers. The routers in the network take care of replicating the packet to reach multiple receivers only where necessary. IP Multicast utilizes such concepts as IP Multicast group addresses, multicast distribution trees and receiver-driven tree creation.
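The send-once, replicate-only-where-necessary property described above can be sketched with a toy distribution tree. The topology, node names and function below are illustrative only and are not part of any multicast standard:

```python
def deliver(tree, node, packet, link_sends):
    """Forward `packet` down the multicast distribution tree rooted at
    `node`, recording one transmission per link traversed."""
    for child in tree.get(node, []):
        link_sends.append((node, child))   # one copy per outgoing link
        deliver(tree, child, packet, link_sends)

# Source S feeds router R, which fans out to receivers A, B and C.
# S transmits the packet once; replication happens only at R, where
# the paths to the three receivers split.
tree = {"S": ["R"], "R": ["A", "B", "C"]}
sends = []
deliver(tree, "S", "frame-0", sends)
```

Counting the entries in `sends` shows that the source link S-R carries the packet exactly once, while router R produces the three receiver copies.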
IP Multicast over the Internet, however, suffers from a number of drawbacks. It is susceptible to Internet anomalies and thus unreliable. Moreover, implementation of large-scale services over the Internet via IP Multicast is problematic because it is generally not supported by Internet Service Providers (ISPs). Other disadvantages of IP Multicast are that the assignment of group identifiers is not coordinated and that the management overhead associated with supporting many wide-area groups is not scalable.
A peer-to-peer (P2P) network uses diverse connectivity between end-user participants in a network and their cumulative bandwidth, rather than conventional centralized resources where a relatively small number of servers provide the core bandwidth and computing for a service or application. P2P networks are typically used for connecting nodes via largely ad hoc connections. P2P networks have been used for sharing files containing audio, video, data or anything in digital format. P2P networks, however, do not provide manageability and control, since content passes through many third-party hosts. Additionally, they introduce higher and unpredictable delay for information dissemination, which inhibits effective interactivity. P2P networks also require special client software that may introduce security concerns.
Also known are content delivery or distribution networks, referred to as CDNs. A CDN is a system of computers or storage devices networked together across the Internet that cooperate transparently to deliver content to end users, most often for the purpose of improving performance, scalability, and cost efficiency. Storage-based CDNs, however, require large amounts of storage and a significant investment in infrastructure to support large scale and high bandwidth or high speed applications like HD video broadcast. Storage and additional required IO operations add delay both during the copy to edge nodes, as well as during playback from edge nodes to clients. As a result, such systems do not provide live, real-time distribution to all of their users. Additionally, such systems do not provide fine-grained synchronization between all of the viewers and do not support bi-directional interactive content exchange (such as a town-hall meeting with remote participants). Such systems have a high fixed cost to support large scale and high quality video broadcast. Implementation of a storage-based CDN requires the purchase of storage devices and servers based on potential use, not actual use.
Known online gaming technologies connect players together over a computer network, such as the Internet. Massively multiplayer online games (MMOGs) have been developed using client-server system architectures to create diverse game worlds and communities. The software that generates and persists the “world” runs continuously on a server, and players connect to it via client software. The client software may provide access to the entire playing world. Depending on the number of players and the system architecture, a MMOG might actually be run on multiple separate servers, each representing an independent instance of the world, where players from one server cannot interact with those from another. In many MMOGs, the number of players in one instance of the world is often limited to a few thousand. In this way, various servers provide various instances of the games that are shared by the players. However, players under MMOG architecture do not have a global all-inclusive view of the world, since different users are on different instances of the game. Additionally, the scalability of MMOGs is impacted by the locations the clients connect from and the real-time requirements of the actual games.
Overlay networks open new ways to improve Internet usability, mainly by adding new services (e.g., built-in security) that are not available or cannot be implemented in the current Internet, and also by providing improved services such as higher availability. An overlay network is a computer network that is built on top of another network. Nodes in the overlay can be thought of as being connected by virtual or logical links, each of which corresponds to a path, perhaps through many physical links, in the underlying network. An overlay network can implement different types of protocols at the logical level, including protocols materially different from those implemented at the physical level. The concept of overlay networks is often viewed to include many different systems, such as P2P, dial-up modems over the telephone network, or even some types of CDNs, which are discussed previously. The usage of overlay networks usually comes at a price, for example, in added latency that is incurred due to longer paths created by overlay routing, and by the need to process the messages at the application level by every overlay node on the path. An overlay network constructs a user-level graph on top of an existing networking infrastructure such as the Internet, using only a subset of the available network links and nodes. An overlay link is a virtual edge in this graph and may consist of many actual links in the underlying network. Overlay nodes act as routers, forwarding packets to the next overlay link toward the destination. At the physical level, packets traveling along a virtual edge between two overlay nodes follow the actual physical links that form that edge. Overlay networks have two main drawbacks. First, the overlay routers incur some overhead every time a message is processed, which requires delivering the message to the application level, processing it, and resending the message to the next overlay router.
Second, the placement of overlay routers in the topology of the physical network is often far from optimal, because the creator of the overlay network rarely has control over the physical network (usually the Internet) or even the knowledge about its actual topology. Therefore, overlay networks provide longer paths that have higher latency than point-to-point Internet connections. The easiest way to achieve reliability in overlay networks is to use a reliable protocol, usually TCP, between the end points of a connection. This mechanism has the benefit of simplicity in implementation and deployment, but pays a high price upon recovery from a loss. As overlay paths have higher delays, it takes a relatively long time to detect a loss, and data packets and acknowledgments are sent on multiple overlay hops in order to recover the missed packet.
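The longer-path effect can be illustrated with a small model of an overlay as a user-level graph, where each virtual edge maps to a path of physical links whose delays add up. All link names and delay values below are hypothetical:

```python
# Hypothetical per-physical-link delays, in milliseconds.
physical_latency_ms = {
    ("a", "b"): 10, ("b", "c"): 15, ("c", "d"): 5, ("d", "e"): 20,
}

# Each overlay (virtual) edge corresponds to a path of physical links
# in the underlying network.
overlay_edges = {
    ("A", "C"): [("a", "b"), ("b", "c")],
    ("C", "E"): [("c", "d"), ("d", "e")],
}

def overlay_path_latency(route):
    """Sum the physical-link delays underlying each virtual edge on an
    overlay route, ignoring per-node application-level processing."""
    return sum(physical_latency_ms[link]
               for edge in zip(route, route[1:])
               for link in overlay_edges[edge])
```

A two-hop overlay route such as `["A", "C", "E"]` thus accrues the latency of all four underlying physical links, before any application-level processing overhead at the intermediate overlay node is counted.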
A particular class of overlay networks is herein referred to as Message-Oriented Overlay Networks (MOONs). A MOON is a specific type of overlay network that maintains control and management over the overlay nodes based on communicated messages. MOONs provide network services that manipulate the messages which pass through the overlay network to improve the reliability, latency, jitter, routing, or other network properties, or to add new network capabilities and services. MOONs do not use persistent storage to store data messages during transit.
Reliable point-to-point communication is one of the main utilizations of the Internet, where over the last few decades TCP has served as the dominant protocol. Over the Internet, reliable communication is performed end-to-end in order to address the severe scalability and interoperability requirements of a network in which potentially every computer on the planet could participate. Thus, all the work required in a reliable connection is distributed only to the two end nodes of that connection, while intermediate nodes route packets without keeping any information about the individual packets they transfer.
In “Reliable Communication in Overlay Networks,” Yair Amir and Claudiu Danilov, in the Proceedings of the IEEE International Conference on Dependable Systems and Networks (DSN03), San Francisco, June 2003, which is hereby incorporated by reference in its entirety (Yair Amir, a co-author of the paper, is a co-inventor of the instant application), describe a MOON that uses hop-by-hop reliability to reduce overlay routing overhead and achieve better performance than standard end-to-end TCP connections deployed on the same overlay network. More specifically, in the disclosed MOON, intermediate overlay nodes handle reliability and congestion control only for the links to their immediate neighbors and do not keep any state for individual flows in the system. Packets are forwarded and acknowledged per link, regardless of their originator. This implementation of MOON recovers losses only on the overlay hop on which they occurred, localizing the congestion and enabling faster recovery. Since an overlay link has a lower delay compared to an end-to-end connection that traverses multiple hops, losses can be detected faster and the missed packet can be resent locally. Moreover, the congestion control on the overlay link can increase the congestion window back faster than an end-to-end connection, as it has a smaller round-trip time. Hop-by-hop reliability involves buffers and processing in the intermediate overlay nodes. The overlay nodes deploy a reliable protocol, and keep track of packets, acknowledgments and congestion control, in addition to their regular routing functionality, thereby allowing for identification of congestion at the overlay network level.
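The benefit of local recovery can be illustrated with a simple loss simulation. This is our own sketch, not the protocol from the paper; it counts how many link traversals are needed to move one packet across a four-hop overlay path under the two recovery strategies:

```python
import random

def hop_by_hop(loss_rates, rng):
    """Each overlay node retransmits on its own link until the next
    neighbor receives the packet; a loss costs one extra traversal of
    that single hop only."""
    traversals = 0
    for p in loss_rates:
        traversals += 1
        while rng.random() < p:    # loss on this hop: resend locally
            traversals += 1
    return traversals

def end_to_end(loss_rates, rng):
    """The sender restarts from the origin whenever any hop drops the
    packet, so a single loss wastes all traversals made on that attempt."""
    total = 0
    while True:
        lost = False
        for p in loss_rates:
            total += 1
            if rng.random() < p:
                lost = True
                break
        if not lost:
            return total

rng = random.Random(7)
loss = [0.3] * 4                   # 30% loss on each of four hops
trials = 2000
avg_hbh = sum(hop_by_hop(loss, rng) for _ in range(trials)) / trials
avg_e2e = sum(end_to_end(loss, rng) for _ in range(trials)) / trials
```

Analytically, hop-by-hop recovery needs on average 1/(1-p) traversals per hop (about 5.7 in total here), while end-to-end recovery pays for a prefix of the whole path on every loss, so its average is noticeably higher.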
In “An Overlay Architecture for High Quality VoIP Streams,” Yair Amir, Claudiu Danilov, Stuart Goose, David Hedqvist and Andreas Terzis, in the IEEE Transactions on Multimedia, 8(6), pages 1250-1262, December 2006 (referred to as [ADGHT06]), which is hereby incorporated by reference in its entirety, algorithms and protocols are disclosed that implement localized packet loss recovery and rapid rerouting in the event of network failures in order to improve performance in VoIP applications that use UDP to transfer data. The algorithms are deployed on the routers of an application-level overlay network and have been shown to yield voice quality on par with the PSTN. Similar ideas were expressed in “1-800-OVERLAYS: Using Overlay Networks to Improve VoIP Quality,” with the same authors, in the Proceedings of the International Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV), pages 51-56, Skamania, Wash., 2005 (referred to as [ADGHT05]).
One exemplary Message-Oriented Overlay Network is implemented as the Spines system (www.spines.org), which is available as open source, including messaging services similar to those provided at the Internet level, such as reliable and unreliable unicast, but with lower latency. It also includes services not practically available at the Internet level, such as soft real time unicast and semi-reliable multicast. The Spines system relates to the use of overlay networks to deliver multi-media traffic in real time. It supports multiple flows, each with its own set of senders and receivers, over a single overlay network. Spines does not support multiple overlay networks.
In “Resilient Overlay Networks,” David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek and Robert Morris, in Proceedings of the ACM SOSP, 2001, describe the Resilient Overlay Network (RON) technology. RON is another example of a Message-Oriented Overlay Network. It provides better connectivity (more resilient routing) by continuously monitoring the connectivity from each overlay site to every other overlay site. If there is direct connectivity on the underlying network (the Internet in the case of RON), then the message is sent directly using a single overlay hop. Otherwise, RON uses two overlay hops to pass messages between overlay sites that are not connected directly by the Internet, thus providing better connectivity between its sites than the connectivity achieved by the native Internet.
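RON's one-hop-or-two-hop routing decision can be sketched as follows (the function, site names and link representation are our own illustration, not RON's actual implementation):

```python
def ron_route(src, dst, links_up, sites):
    """Return an overlay path from src to dst: the direct Internet path
    if it is up, otherwise a two-hop detour through one intermediate
    overlay site whose links to both endpoints are working."""
    if (src, dst) in links_up:
        return [src, dst]
    for mid in sites:
        if mid not in (src, dst) and (src, mid) in links_up and (mid, dst) in links_up:
            return [src, mid, dst]
    return None  # no single-intermediate detour available

# Direct X->Z connectivity is down, but X->Y and Y->Z are up, so a
# message from X to Z is relayed through Y using two overlay hops.
links_up = {("X", "Y"), ("Y", "Z"), ("X", "W")}
```

Continuous monitoring, in this model, amounts to keeping the `links_up` set current so the routing decision always reflects the measured site-to-site connectivity.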
Existing message-oriented overlay networks, however, have some significant limitations. Existing MOONs are architecturally limited such that every overlay node will correspond to only one computer. This means that the capacity through that overlay node is limited to what that computer can do, and the availability of that node is limited to the reliability of that single computer. Moreover, the capacity problem is exacerbated because message processing at the overlay level is typically compute intensive. The current invention shows how to solve this limitation.
Broadband network services are being rapidly deployed around the world to serve the data transfer needs of residential, commercial, industrial, as well as government and military applications. With the availability of rich media content on the Internet, new channels are becoming available for mass media content distribution, with or without interactivity. Much traditional TV content and much new video content is now being regularly distributed and streamed over IP networks, with the same content distributed to and received by a large number of receiver nodes. However, at the present time, the Internet does not support distribution of live high definition content over a large number of channels directed to groups of content users. This is because the Internet does not practically support native multicast at the network level. Regional, semi-private broadband networks operated by a single service provider have implemented proprietary technology in network applications that can support network multicast such as IP multicast. However, such proprietary networks serve a limited number of users and cannot be applied to the delivery of content on a global basis without significant infrastructure investment.
Application-layer multicast (ALM), also referred to in this document as overlay multicast, has been implemented in overlay networks to provide multicast at the application layer. The principle of ALM is to route and forward multicast data using software running in host nodes (in terms of the underlying network). The multicast data are tunneled through the underlying Internet using unicast transmission, and the participating host nodes replicate and forward these multicast data to other host nodes in the overlay network until the messages reach the destined receiver nodes.
A known ALM protocol is the NICE protocol proposed by Banerjee et al. in “Scalable application layer multicast,” in: Proceedings of ACM SIGCOMM, August 2002. NICE is a tree-based ALM protocol where peers are arranged hierarchically such that every peer receives data from its parent or siblings and forwards the data to its children and siblings. This protocol has been shown to work well in many applications and networks due to its proximity-aware feature and its capability to dynamically adapt the overlay network topology to the changing network conditions.
In a publication titled “Parallel overlays for high data-rate multicast data transfer,” which became publicly available online in May 2006 and was later published in Computer Networks: The International Journal of Computer and Telecommunications Networking, Vol. 51, Issue 1, pages 31-42, K. K. To and Jack Y. B. Lee of the Department of Information Engineering of the Chinese University of Hong Kong disclosed extending the NICE protocol to use multiple parallel overlays in the same ALM session to spread the data traffic across more available network links in video content distribution applications. More specifically, a parallelized version of the NICE protocol, P-NICE, separates a single data stream into multiple sub-streams, and then sends each sub-stream over an independent multicast overlay network without any coordination between the parallel overlay networks. In each ALM session, k overlays are built independently using the NICE protocol. Each peer is then sub-divided into k virtual peers (VPs), with each virtual peer joining a different NICE overlay. To transmit data, the sending peer first packetizes data into packets of size Pk bytes and then distributes them to the virtual peers in a round-robin manner. The virtual peers in turn send them over the k NICE overlays independently. To receive data, the virtual peers of the receiving peer first receive the packets from the overlays, and then resequence them in the proper order before passing them to the application. Different overlays route data over disjointed links to utilize the available network capacity, and high-capacity links are utilized by routing multiple overlays through them.
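The round-robin splitting and resequencing described above can be sketched as follows (the helper names and byte-level framing are our own simplification of the P-NICE scheme, not code from the paper):

```python
def split_round_robin(data: bytes, k: int, pk: int):
    """Packetize `data` into pk-byte packets and deal packet i to
    virtual peer i mod k; returns one (index, payload) list per overlay."""
    packets = [data[i:i + pk] for i in range(0, len(data), pk)]
    overlays = [[] for _ in range(k)]
    for idx, payload in enumerate(packets):
        overlays[idx % k].append((idx, payload))
    return overlays

def resequence(overlays):
    """Merge the per-overlay packet lists back into stream order,
    regardless of how deliveries from the k overlays interleaved."""
    merged = sorted(pkt for overlay in overlays for pkt in overlay)
    return b"".join(payload for _, payload in merged)

# A 10-byte stream split into 2-byte packets over k=3 overlays: each
# sub-stream carries roughly 1/k of the packets.
subs = split_round_robin(b"ABCDEFGHIJ", k=3, pk=2)
```

Because each packet carries its sequence index, the receiver can restore the original order no matter how the k uncoordinated overlays interleave their deliveries.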
The main drawback of constructing k overlays for the same ALM session is the increased control overhead. In the original NICE protocol, peer overlay nodes of each parallel overlay continue to probe each other periodically to monitor any changes in the parallel overlay network conditions, in response to which the overlay topology is rearranged to improve performance. Each rearranged topology triggers additional topology rearrangements, causing a positive feedback loop that can destabilize the network. Prior art attempts try to decrease trigger sensitivity to provide more stability. K. K. To et al. suggest reducing control overheads by sharing measurement information across multiple overlays, without specifying how such measurement information is shared. K. K. To et al. also suggest splitting the measurement tasks between the multiple overlays to further reduce control overheads, and developing intelligent ways to control and optimize the number of deployed overlays and dynamically adapt that number in response to changing network conditions to improve throughput. The work described in the paper is limited to a single stream (flow) and does not describe supporting multiple flows with different sources and destination sets.
There remains a significant need in the art to provide a managed but widely distributed network capable of transporting and delivering any group of high quality live flows, such that each flow has a potentially different source and a different destination set, at a truly global scale, thus allowing content providers to maintain control over the distribution of their live content. Further, this content needs to be delivered with minimal latency, consistently high quality, high reliability, and at an attractive cost. With the advances in the power of processing units, there exists a commensurate need for a system, method or protocol for scaling reliable real-time or near real-time delivery of large amounts of data, such as Standard Definition (SD) and High Definition (HD) video data, as well as interactivity, for example, in video or online gaming applications. What is needed is a network that supports any-to-any high quality live flows at global scale, delivered with high reliability at attractive economics.