Streaming media is multimedia that is continuously received by, and normally presented to, an end user (using a client) while it is being delivered by a streaming provider (using a server). Several protocols exist for streaming media, including the Real Time Streaming Protocol (RTSP), the Real-time Transport Protocol (RTP), and the Real-time Transport Control Protocol (RTCP), which are often used together. RTSP, developed by the Internet Engineering Task Force (IETF) and published in 1998 as Request for Comments (RFC) 2326, is a protocol for use in streaming media systems. It allows a client to remotely control a streaming media server by issuing VCR-like commands such as “play” and “pause”, and it allows time-based access to files on a server.
The sending of streaming data itself is not part of RTSP. Most RTSP servers use the standards-based RTP as the transport protocol for the actual audio/video data, with RTSP acting somewhat as a metadata channel. RTP defines a standardized packet format for delivering audio and video over the Internet. RTP was developed by the Audio-Video Transport Working Group of the IETF, first published in 1996 as RFC 1889, and superseded by RFC 3550 in 2003. RTSP is similar in syntax and operation to the Hypertext Transfer Protocol (HTTP), but RTSP adds new requests. While HTTP is stateless, RTSP is a stateful protocol: a session ID is used to keep track of sessions when needed. RTSP messages are typically sent from client to server, although some exceptions exist where the server sends messages to the client.
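As an illustration of the HTTP-like syntax and the session ID described above, the following is a minimal sketch, not a complete RTSP client. The helper names build_rtsp_request and parse_session_id, the URL rtsp://example.com/media.mp4, and the sample response are hypothetical and chosen only for this example; the message layout (request line, CSeq header, Session header) follows RFC 2326.

```python
def build_rtsp_request(method, url, cseq, session=None, extra_headers=None):
    """Format an RTSP/1.0 request: request line, headers, then a blank line."""
    lines = [f"{method} {url} RTSP/1.0", f"CSeq: {cseq}"]
    if session is not None:
        # The session ID is what makes the exchange stateful.
        lines.append(f"Session: {session}")
    for name, value in (extra_headers or {}).items():
        lines.append(f"{name}: {value}")
    return "\r\n".join(lines) + "\r\n\r\n"


def parse_session_id(response):
    """Return the session ID from a response's Session header, or None."""
    for line in response.split("\r\n")[1:]:  # skip the status line
        name, sep, value = line.partition(":")
        if sep and name.strip().lower() == "session":
            # Drop optional parameters such as ";timeout=60".
            return value.split(";")[0].strip()
    return None


# Hypothetical server reply to a SETUP request.
setup_response = (
    "RTSP/1.0 200 OK\r\n"
    "CSeq: 2\r\n"
    "Session: 12345678;timeout=60\r\n"
    "\r\n"
)
session_id = parse_session_id(setup_response)

# The client echoes the session ID on later requests such as PLAY.
play_request = build_rtsp_request(
    "PLAY", "rtsp://example.com/media.mp4", cseq=3,
    session=session_id, extra_headers={"Range": "npt=0-"},
)
print(play_request)
```

The PLAY request carries only control information; under the arrangement described above, the audio/video data would travel separately over RTP.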
RTP is usually used in conjunction with RTCP. While RTP carries the media streams (e.g., audio and video) or out-of-band signaling (e.g., dual-tone multi-frequency (DTMF) tones), RTCP is used to monitor transmission statistics and quality of service (QoS) information. RTP allows only one type of message, one that carries data from the source to the destination. In many cases, there is a need for other messages in a session. These messages control the flow and quality of data and allow the recipient to send feedback to the source or sources. RTCP is a protocol designed for this purpose. RTCP has five types of messages: sender report, receiver report, source description message, bye message, and application-specific message. RTCP provides out-of-band control information for an RTP flow and partners with RTP in the delivery and packaging of multimedia data, but does not transport any media data itself. RTCP is used to periodically transmit control packets to participants in a streaming multimedia session. One function of RTCP is to provide feedback on the quality of service being provided by RTP. RTCP gathers statistics on a media connection, such as bytes sent, packets sent, lost packets, jitter, feedback, and round-trip delay. An application may use this information to improve the quality of service, perhaps by limiting flow or by using a different codec or bit rate.
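To make the statistics mentioned above more concrete, the following sketch shows how a receiver might derive two of them, jitter and packet loss, before reporting them in a receiver report. The jitter update follows the running estimate given in RFC 3550 (each new sample moves the estimate by one sixteenth of the difference); the function names, the sample packets, and the packet counts are illustrative assumptions, and a real implementation would also handle sequence-number wraparound and clock conversion.

```python
def update_jitter(jitter, prev_transit, arrival, rtp_timestamp):
    """Update a running interarrival jitter estimate for one received packet.

    `arrival` and `rtp_timestamp` must be in the same (RTP clock) units.
    Returns the new jitter estimate and this packet's transit time.
    """
    transit = arrival - rtp_timestamp
    if prev_transit is None:
        return jitter, transit  # first packet: nothing to compare against
    d = abs(transit - prev_transit)
    jitter += (d - jitter) / 16.0
    return jitter, transit


def fraction_lost(expected, received):
    """Fraction of expected packets that did not arrive (0.0 if none lost)."""
    lost = expected - received
    if expected <= 0 or lost <= 0:
        return 0.0
    return lost / expected


# Hypothetical (arrival time, RTP timestamp) pairs in RTP clock units.
packets = [(0, 0), (170, 160), (320, 320), (490, 480)]

jitter, prev_transit = 0.0, None
for arrival, timestamp in packets:
    jitter, prev_transit = update_jitter(jitter, prev_transit, arrival, timestamp)

print(f"jitter estimate: {jitter}")
print(f"fraction lost: {fraction_lost(expected=100, received=95)}")
```

An application could compare such values against thresholds of its own choosing to decide, for example, when to switch to a lower bit rate.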
One of the techniques for achieving high scalability is using cache proxies that are distributed near the network endpoints. A network of such cache proxies is known as a Content Delivery Network (CDN) or Edge Cache Network (ECN). A CDN is a network of tiered cache nodes that can be used to distribute the content delivery. A CDN is most commonly used to reduce network bandwidth consumption, reduce the load on the origin server(s), and improve the response time of content delivery. A CDN tries to accomplish these objectives by serving the content from the cache node that is closest to the client that requested the content. Each caching layer serves as a “filter” by caching and serving the requested content without having to go to the origin server (such as a web server) for every request. The Internet has built up a large infrastructure of caching proxies (and network routers with caching capabilities) that are effective at caching data for HTTP. Caching proxies can provide cached data to clients with less delay and with fewer resources than re-requesting the content from the original source. For example, a user in New York may download a content item served from a host in Japan, and receive the content item through a caching proxy in California. If a user in New Jersey requests the same file, the caching proxy in California may be able to provide the content item without again requesting the data from the host in Japan. This reduces the network traffic over possibly strained routes, and allows the user in New Jersey to receive the content item with less latency.
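The “filter” behavior of tiered cache nodes can be sketched as follows. This is a simplified in-memory model under stated assumptions, not a description of any particular CDN: the class names Origin and CacheNode, the path /video.mp4, and the two-tier layout are hypothetical, and real caches also handle expiry, eviction, and validation.

```python
class Origin:
    """Stands in for the origin server; counts the requests that reach it."""

    def __init__(self, content):
        self.content = content
        self.requests = 0

    def get(self, key):
        self.requests += 1
        return self.content[key]


class CacheNode:
    """A cache tier that serves from its own store or asks its parent."""

    def __init__(self, parent):
        self.parent = parent
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.store:
            self.hits += 1
            return self.store[key]
        self.misses += 1
        value = self.parent.get(key)  # forward the miss one tier up
        self.store[key] = value       # keep a copy for later requests
        return value


origin = Origin({"/video.mp4": b"media bytes"})
parent_cache = CacheNode(origin)
edge_cache = CacheNode(parent_cache)

# Three clients request the same item through the same edge node.
for _ in range(3):
    edge_cache.get("/video.mp4")

print(origin.requests, edge_cache.hits, edge_cache.misses)
```

In this model only the first request travels to the origin; the later requests are absorbed at the edge, which corresponds to the New York and New Jersey example above.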
While this solution works for on-demand content, the same solution does not work for live content because live content is not available in cache proxies (by virtue of being live content). So for streaming a live broadcast, such as watching a live NFL game in real time, CDNs/ECNs cannot leverage their HTTP caching proxies, because a cached response is not available at the time users request the data. Instead, CDNs/ECNs deploy and manage proprietary media streaming servers, which significantly increases the cost of the solution to content providers. Attempting to deliver live broadcasts with existing caching proxy solutions would have the following effects. When the request arrives at the edge cache server, it results in a cache miss. The edge server forwards the request to the parent cache server, and the request again results in a cache miss. The result is that the request reaches the origin server to get the content.
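The difference between the on-demand and live cases can be illustrated with a small counting model. It is a sketch under a simplifying assumption: live content is modeled as a response that no cache tier is able to store, so every request misses at every tier. The function name origin_load and the request counts are hypothetical.

```python
def origin_load(num_requests, tiers, cacheable):
    """Count the requests that reach the origin for one repeatedly requested item.

    Every request passes through a chain of `tiers` cache nodes. If the
    response is cacheable, each tier keeps a copy after the first miss;
    otherwise every request misses at every tier.
    """
    cached = [False] * tiers
    origin_requests = 0
    for _ in range(num_requests):
        for level in range(tiers):
            if cached[level]:
                break  # served from this tier; the origin is not contacted
        else:
            # Missed at the edge, then at each parent tier: go to the origin.
            origin_requests += 1
            if cacheable:
                cached = [True] * tiers
    return origin_requests


on_demand = origin_load(num_requests=10_000, tiers=2, cacheable=True)
live = origin_load(num_requests=10_000, tiers=2, cacheable=False)
print(on_demand, live)
```

Under this assumption the origin load for live content grows in direct proportion to the number of viewers, while for on-demand content it stays constant after the first request.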
In a live broadcast scenario, if 10 million users are requesting to view, for example, the opening ceremony of the Olympics, then there are 10 million cache misses, and all 10 million requests will be forwarded to the origin server, which, without the capacity to handle the 10 million requests, will crash or at least perform very poorly. Networks of proprietary streaming servers managed by CDNs/ECNs are limited in scale relative to networks of caching proxies, which often results in a lack of capacity for events with large viewership, such as the 2009 U.S. Presidential Inauguration. Thus, the lack of caching limits the number of contemporaneous viewers and requests that the servers can handle, and thereby limits the attendance of a live event. The world is increasingly using the Internet to consume live information, as shown by the record number of users who watched the opening of the 2008 Olympics via the Internet. The limitations of current technology are slowing adoption of the Internet as a medium for consuming this type of media content.