The present invention relates to efficient transmission of data in an internetwork, such as the global internetwork known as the “Internet”. More specifically, the present invention relates to moving live or stored “broadcast” data streams from content producers to large numbers of recipients of those data streams.
“Broadcasting” refers to the transmission of a data stream from a content producer to a large number of recipients. The data stream can be text, graphics, video, audio, or any other digital data stream. Data is often provided as a stream or as a file, with the distinction being that the end of the stream is open-ended, while the file has a defined end. For example, real-time stock quotes might be thought of as a stream of data, while a 30-minute audiovisual presentation might be thought of as a file of data. As referenced herein, a sharp distinction is not needed between what is a stream and what is a file, since the typical broadcast operation is very similar whether a stream is being transmitted or a file is being transmitted. Therefore, it should be understood that where a stream is described, a file could be substituted therefor unless otherwise indicated.
Broadcasting need not be done in real-time relative to the content creation. Real-time broadcasting refers to the transmission of the data as it is created in a digital form. For example, a football game might be recorded by a camera, digitized and broadcast to many individuals wanting to receive the transmission over the Internet. The football game might also be stored after digitization and broadcast at a later time. Furthermore, the football game could be both transmitted live and transmitted at a later time (“delayed broadcast”). Generally speaking, whether the broadcast is live or delayed, some of the components of a broadcast network might operate exactly the same, as is the case with current television broadcasting. For example, the antennas broadcasting the signal and the receivers receiving the signal operate identically to receive live broadcasts or delayed broadcasts.
One technical difference between live broadcasting and delayed broadcasting is that live broadcasting is likely to have a larger audience at the time of the broadcast, since there is only one time to tune into a live broadcast but many times might be available for a delayed broadcast. Some content is more likely to be desired by recipients as a live broadcast rather than a delayed broadcast. Examples include sporting events, time-sensitive business information such as stock quotes, analyst interviews and breaking news, and the like.
The line between live and delayed broadcasting is not a fixed line. One of the challenges of live broadcasting is to process the data stream in real-time to make it suitable for transmission (e.g., compression, formatting), whereas more time is available for those processing steps if the data stream is a delayed broadcast. While that challenge highlights a distinction between live and delayed broadcasting, if the delayed broadcasts are only available at set times, as is the case with television reruns, live and delayed broadcasting do not differ greatly. Because the line is not always clear, it should be understood that “broadcasting” refers to live and/or delayed broadcasting unless otherwise indicated.
Relative to the current demands of Internet users, current television broadcasting is simple. The content creators provide their content to the broadcasters, who send the data stream into a channel that is exclusively reserved for their content and that has the bandwidth to carry that content in the time allotted. The medium of transfer may be wired or wireless, and the receivers are all connected to the medium with a bandwidth sufficient to receive the entire data stream, with minimal processing, from a channel dedicated to that content. On the other hand, broadcasting content over the Internet (or any other internetwork or network being used) cannot be done easily, as the Internet or network is essentially a point-to-point transmission medium, with some provision for point-to-multipoint or multipoint-to-multipoint transmission.
For example, broadcast television deals with a breaking news event by gathering information, writing a script and putting a reporter on the air. The recipients of the breaking news (the television watchers) must wait for the broadcast television station to broadcast the information and they get only the data stream that that content provider chooses to present. When news breaks on the Internet, a large number of users will try to retrieve the news information (essentially as a large number of point-to-point transmissions of the same data stream), often swamping the servers and computing infrastructure of the content provider. This “flash effect” is not limited to breaking news, but is often encountered when live events occur, when new releases of popular software are issued or when a popular Web site is encountered. Herein, a “Web site” generally refers to a collection of pages presented as a unit, usually served from one or more coordinated servers having a particular network address and may also refer to the computers and infrastructure that serve the pages of the collection.
The problems of current broadcasting approaches are described below, but first some background of client-server architecture is in order. Many networking and other computing systems have the processing and functionality of the overall system separated into “clients” and “servers” where the clients are computers, programs or hardware that initiate requests and servers are computers, programs or hardware that respond to requests from clients. There are exceptions, where devices or programs generally thought of as servers will make requests of devices or programs generally thought of as clients, but for the most part, in the client-server model, the servers wait around for requests, service the requests and then wait around for further requests. Clients are usually considered more independent actors, in that they initiate requests. It should be understood, however, that some devices or hardware could be clients at some times or for some purposes and servers at other times or for other purposes.
In the context of an extremely basic broadcast infrastructure, a content server waits for a request from a content client and upon receipt of a request sends the requested content to the content client. This basic infrastructure is fine when one client makes a request of one server and the content fits in unused bandwidth of a channel connecting the client and server, but since most networks have more than one client or more than one server and share a limited bandwidth, the bandwidth needs to be intelligently allocated.
From an infrastructure perspective, the flash effect is not very bandwidth-efficient, as many, many identical copies of the data stream are transported over the network to the many recipients requesting the data stream. This effect might not be a problem if the data stream is a few bits of data, but data streams of full-motion video and CD-quality audio are becoming more and more common.
Several different approaches have been made in the past to provide for broadcasting over the Internet, but most have drawbacks that prevent their widespread adoption. Two key mechanisms for the Internet have been proposed and are in limited use to overcome the problems induced by the flash effect, namely, 1) caching and 2) server replication. Caching refers to a process of using a cache situated at strategic locations within the network infrastructure to intercept content requests from clients so that the content source does not need to provide every copy of the content. When a client requests content from a content server and the client receives the content from the content server, a cache in the network through which the content passes stores a copy of the content. When another client (or the same client) makes a request for that same content, the network infrastructure consults the cache to determine if a copy of the requested content exists in the cache. If the content exists in the cache, the request is intercepted before it gets to the content server and the cache instead services the request. Otherwise, if the content is not present in the cache, the request is relayed to the content server and the response relayed back to the client.
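The hit-or-relay behavior of a cache described above might be sketched as follows. This is a minimal illustration only, not an implementation taken from any particular system; the class name `InterceptingCache`, the `origin` function, and the request path are hypothetical names introduced for the example.

```python
from typing import Callable, Dict


class InterceptingCache:
    """Serves repeat requests locally; relays misses to the content server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch_from_origin = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.origin_requests = 0  # how many requests reached the content server

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Cache hit: the request is intercepted and never reaches the origin.
            return self._store[url]
        # Cache miss: relay to the content server and keep a copy of the response.
        self.origin_requests += 1
        content = self._fetch_from_origin(url)
        self._store[url] = content
        return content


def origin(url: str) -> bytes:
    # Hypothetical stand-in for the content server.
    return ("content of " + url).encode()


cache = InterceptingCache(origin)
for _ in range(1000):          # a thousand clients ask for the same item
    cache.get("/news/breaking")
print(cache.origin_requests)   # 1
```

In this sketch only the first of the thousand requests reaches the content server, which is the relief that caching is intended to provide during a flash effect.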
Caching is useful when there is a high probability that the requested content would happen to be present in the cache. Since the cache has a finite storage capacity allocated for storing cached content, the cache will eventually have to discard some of its stored content to make room for more recent, or more popular, content. Many strategies have been proposed and are in use for managing the local store of the cache, e.g., deciding when to discard an object from the cache, when to “refresh” content (get a fresh, possibly updated copy of content from the content server), and so forth.
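One commonly used discard strategy is least-recently-used (LRU) replacement, chosen here only as an example of the many strategies mentioned above. The sketch below assumes a cache bounded by item count rather than bytes; the class name `LRUStore` and the keys are hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class LRUStore:
    """Bounded store that discards the least recently used item when full."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # discard the least recently used


store = LRUStore(capacity=2)
store.put("a", b"1")
store.put("b", b"2")
store.get("a")          # "a" is now more recent than "b"
store.put("c", b"3")    # exceeds capacity, so "b" is discarded
print(store.get("b"))   # None
```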
Caching can be either transparent or nontransparent. With transparent caching, the client makes a request of the content server and the network infrastructure intercepts the request if the cache can serve the request. With nontransparent caching, the client makes the request of the cache (or more precisely, of a network node to which the cache is attached) and the cache serves the request, if it can, or forwards the request to the content server and then serves the client the content returned from the content server.
The server replication mechanism involves replicated servers each holding copies of the same content being served. Preferably, the replicated servers are deployed across a wide area of the network and client requests to a content server are redirected to one of these distributed replicated servers to balance load and save network bandwidth. For example, if the clients making requests are all connecting to a network at one network entry point and the content server is at the far end of the network, the replicated servers might be located near the client network entry point so that the content does not need to travel the length of the network. These replicated servers may have some or all of the content contained at the origin content server and many variations exist for arranging particular servers in a replicated server deployment, for distributing content to the replicated servers from the origin content server, and for determining how clients are redirected to the appropriate replicated server.
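As one illustration of how clients might be redirected, the sketch below picks the replicated server with the fewest hops from the client's network entry point and breaks ties by current load. The hop counts, server names, and the selection rule itself are assumptions made for the example; actual deployments use many different redirection schemes.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Replica:
    name: str
    hops_from: Dict[str, int]   # entry point -> hop count (hypothetical metric)
    active_clients: int = 0


def redirect(entry_point: str, replicas: List[Replica]) -> Replica:
    """Pick the replica with the fewest hops; break ties by lighter load."""
    chosen = min(replicas, key=lambda r: (r.hops_from[entry_point], r.active_clients))
    chosen.active_clients += 1
    return chosen


replicas = [
    Replica("origin", {"west": 12, "east": 2}),
    Replica("west-a", {"west": 1, "east": 11}),
    Replica("west-b", {"west": 1, "east": 11}),
]
# Four clients arrive at the "west" entry point, far from the origin server.
assignments = [redirect("west", replicas).name for _ in range(4)]
print(assignments)  # ['west-a', 'west-b', 'west-a', 'west-b']
```

None of the four requests crosses the length of the network to the origin, and the load is balanced between the two nearby replicated servers.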
A similar content distribution problem involves the delivery of live streaming media to many users across the Internet. With live streaming media, a server produces a live broadcast feed and clients connect to the server using streaming media transport protocols to receive the broadcast as it is produced. As more and more clients tune in to the broadcast, the server and network near the server become overwhelmed by the task of delivering a large number of packet streams to a large number of clients. This task is unnecessarily duplicative, because the server is sending out multiple streams of the same data (one stream per client).
The duplication exists because each connection from one client to the server is a “unicast” connection, i.e., a one-point-to-one-point connection. The basic connection between two points in a network such as the Internet is a unicast connection. Although unicast data may flow over many different paths (routes), it is identifiable as data from one source node at a source address to one destination node at a destination address. Because of this, each client needs its own connection to the server and the data stream is duplicated in the network by the number of clients requesting that data stream.
Network multicasting partially solves the problem of unnecessary duplication of data streams. Multicasting at the network layer can be done over the Internet using IP multicasting protocols that are defined in the Internet architecture. With multicasting, a content server transmits the data stream as a single stream of packets addressed to a “multicast group” instead of sending individual copies of the stream to individual unicast addresses. While a client normally receives only packets addressed to that client's unicast address, a client interested in the multicast stream can “tune in” to the broadcast by subscribing to the multicast group. Using IGMP (the Internet Group Management Protocol), the client subscribes to an “IP Multicast” group by signaling subscription information to the nearest router. The network efficiently delivers the broadcast to each receiver client by carrying only one copy of the data stream and fanning out additional copies only at fan-out points in the distribution path from the source (the content server) to the receivers. Thus, only one copy of each packet appears on any physical link.
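The difference in link usage between unicast and multicast delivery can be illustrated by counting the copies of a stream carried on each link. The sketch below assumes a small hypothetical distribution tree (one source, three routers, four clients); it models only the copy counts and does not implement any multicast routing protocol.

```python
from collections import Counter
from typing import Dict, List, Tuple

Link = Tuple[str, str]

# Hypothetical distribution tree: child -> parent (the source has no parent).
PARENT: Dict[str, str] = {
    "r1": "source",
    "r2": "r1", "r3": "r1",
    "c1": "r2", "c2": "r2", "c3": "r3", "c4": "r3",
}


def path_links(node: str) -> List[Link]:
    """Links on the path between the source and `node`."""
    links = []
    while node in PARENT:
        links.append((PARENT[node], node))
        node = PARENT[node]
    return links


def unicast_copies(clients: List[str]) -> Counter:
    # One independent stream per client, so shared links carry duplicates.
    return Counter(link for c in clients for link in path_links(c))


def multicast_copies(clients: List[str]) -> Counter:
    # One copy per link; duplication happens only at fan-out points.
    return Counter({link: 1 for c in clients for link in path_links(c)})


clients = ["c1", "c2", "c3", "c4"]
uni = unicast_copies(clients)
multi = multicast_copies(clients)
print(uni[("source", "r1")], multi[("source", "r1")])  # 4 1
print(sum(uni.values()), sum(multi.values()))          # 12 7
```

In this example the link nearest the source carries four copies under unicast but a single copy under multicast, and the gap widens as more clients subscribe.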
Unfortunately, a wide variety of deployment and scalability problems have confounded the acceptance and proliferation of IP Multicast in the global Internet. Many of these problems follow fundamentally from the fact that computing a multicast distribution tree requires that all routers in the network have a uniformly consistent view of what that tree looks like. To use IP multicasting effectively, each router must have the correct local view of a single, globally consistent multicast routing tree. If routers have disparate views of a given multicast tree in different parts of the network, then routing loops and black holes are inevitable. A number of other problems—e.g., multicast address allocation, multicast congestion control, reliable delivery for multicast, etc.—have also plagued the deployment and acceptance of IP Multicast. Despite substantial strides recently toward commercial deployment of IP Multicast, the resulting infrastructure is still relatively fragile and its reach is extremely limited.
Not only have there been substantial technical barriers to the deployment of a ubiquitous Internet multicast service, but there are business and economic barriers as well. Internet service providers have not had much success at offering wide-area multicast services because managing, monitoring, and provisioning for multicast traffic is quite difficult. Moreover, it is difficult to control who in a multicast session can generate traffic and to what parts of the network that traffic is allowed to reach. These problems become even worse when service providers attempt to peer with one another to offer a wider-reaching multicast service, as they have done with resounding success for traditional unicast service. Because of these barriers, a multicast service that reaches the better part of the Internet is unlikely to emerge, and very unlikely to emerge in the near future.
Others have proposed work-arounds to avoid the pitfalls of multicast, such as splitter networks. A splitter network is an application-level solution for transporting streaming-media broadcasts, where a set of servers is distributed at strategic locations across the Internet. For example, a data distributor might co-locate splitters at an ISP's premises or make an arrangement with the ISP for a large-scale deployment within the ISP's network. As one such provider, RealNetworks, of Seattle, Wash., provides for streaming media distribution. The distribution is at the application level in that a RealNetworks™ G2 server might send G2 data streams to G2 clients.
These distributed servers are configured with a “splitting” capability, which allows them to replicate a given stream to a number of downstream servers. With this capability, servers can be arranged into a tree-like hierarchy, where the root server sources a stream to a number of downstream servers, which in turn split the stream into a number of copies that are forwarded to yet another tier of downstream servers.
Unfortunately, a splitter network of servers is plagued with a number of problems. First, the tree of splitters is statically configured, which means that if a single splitter fails, the entire sub-tree below the point of failure loses service. Second, the splitter network must be oriented toward a single broadcast center, requiring separate splitter networks composed of distinct physical servers to be maintained for each broadcast network. Third, splitters are typically specific to one data stream format, making the splitter platform-dependent. For example, a splitter set up to carry RealNetworks™ data streams cannot distribute Microsoft™ Netshow™ data streams. Fourth, splitter networks are highly bandwidth-inefficient, since they neither track receiver interest nor prune traffic from sub-trees of the splitter network that have no downstream receivers. Finally, splitter networks provide weak policy controls: the aggregate bit rate consumed along a path between two splitter nodes cannot be controlled and allocated to different classes of flows in a stream-aware fashion.
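The first problem, loss of an entire sub-tree when one splitter fails, can be illustrated with a small model of a statically configured tree. The topology and server names below are hypothetical, and the sketch models only reachability, not the streaming itself.

```python
from typing import Dict, List, Set

# Hypothetical static splitter tree: server -> downstream servers.
TREE: Dict[str, List[str]] = {
    "root": ["s1", "s2"],
    "s1": ["s3", "s4"],
    "s2": ["s5"],
    "s3": [], "s4": [], "s5": [],
}


def served(tree: Dict[str, List[str]], root: str, failed: Set[str]) -> Set[str]:
    """Servers that still receive the stream when `failed` servers are down."""
    if root in failed:
        return set()
    reached = {root}
    for child in tree[root]:
        # With a static configuration there is no alternate parent to fall back on.
        reached |= served(tree, child, failed)
    return reached


all_up = served(TREE, "root", set())
after_failure = served(TREE, "root", {"s1"})
lost = all_up - after_failure
print(sorted(lost))  # ['s1', 's3', 's4']
```

Although only `s1` failed, the healthy servers `s3` and `s4` also lose the stream because the static tree gives them no other upstream source.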
Yet another approach to avoid the problems of multicast is to have the content broadcast to several locations around a network and have the client run a test to determine the least congested path to a server having the content of interest. The client then connects to the server showing the least congested path to the client. While this is good for file-centric applications, as opposed to stream-centric applications, this approach has drawbacks. For example, while the client might find a server with low congestion, little or nothing is done to ensure that the server closest to particular clients has the data that those clients are most often requesting. Another problem is that many applications are live broadcasts, so delivery of the data is time-sensitive: the data needs to be moved quickly to the edge servers serving the clients interested in the live broadcast, while limiting the network congestion caused by traffic that is not bound for users interested in the broadcasts.
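The client-side selection described above might be sketched as follows. The measured delays are invented stand-ins for whatever path test a client would actually run, and the function name `pick_server` and the server names are hypothetical.

```python
from typing import Callable, Dict, Iterable


def pick_server(servers: Iterable[str], probe: Callable[[str], float]) -> str:
    """Return the server whose probe reports the lowest delay (in seconds)."""
    return min(servers, key=probe)


# Hypothetical measurements standing in for a real path test.
measured: Dict[str, float] = {"edge-1": 0.180, "edge-2": 0.045, "edge-3": 0.090}
best = pick_server(measured, measured.__getitem__)
print(best)  # edge-2
```

The selection considers only path congestion as seen by the client. Nothing in it ensures that the chosen server already holds the data the client wants, which is the drawback noted above.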