Historically, video data is transmitted in the radio frequency (RF) spectrum to a television or set-top box (STB). For example, a Cable Head-End might use quadrature amplitude modulation (QAM) to transmit digital video to its subscribers by modulating the video onto an RF signal that is up-converted and transmitted in the analog RF spectrum. The modulated video is formatted using MPEG-2 Transport Streams (MPEG-2 TS), and each home's television or STB tunes to a particular RF channel to view a program. On-demand content might also be modulated onto a QAM, and the STB tunes to a particular channel to view a program that was requested. In this type of network, the live broadcast content and the on-demand content are combined at the RF plant using an RF combiner. Different QAM devices are also generally used for video and for data over cable service interface specification (DOCSIS) data, and their outputs are similarly combined at the RF plant (i.e., DOCSIS uses QAM channels that are not used for video broadcast).
Newer internet protocol television (IPTV) deployments wrap the video in a transport layer (e.g., real-time transport protocol (RTP), transmission control protocol (TCP), user datagram protocol (UDP)) and then transmit the content using either multicast or unicast across an IP data network (e.g., DOCSIS, very high bitrate digital subscriber line (VDSL), gigabit passive optical network (GPON), Ethernet, etc.). The use of IP multicast is common for live broadcast channels viewed by many, since it is similar in concept to broadcast television. However, in this usage of multicast, only the IPTV subscribers that request to join a particular multicast session will actually receive the transmission. IP unicast, on the other hand, is used for on-demand content, with a unique IP address used per subscriber. Unicast also enables per-subscriber customization of live content, such as when performing targeted advertising.
FIG. 1 shows a video delivery network 100 with different video delivery systems 108a-z, each connecting to clients 112a-z through a distribution network 110. Clients 112a-z are able to access each video delivery system 108a-z through this network 100, as well as content available from the IP core network 106.
Content is ingested by each video delivery system 108a-z, typically from an IP network 106. For example, IP multicast may be used for broadcast television retrieved from a broadcast television encoder 102, while IP unicast is used for on-demand content retrieved from a content origin server 104. Content may be ingested and delivered using different protocols. For example, File Transfer Protocol (FTP), TCP and Hypertext Transfer Protocol (HTTP) may be used for content ingest, while RTP, UDP, TCP, and HTTP may be used for delivery. TCP and HTTP are beneficial for delivery across IP data networks, since they are reliable and support bursting of data (e.g., progressive download to a STB or personal computer (PC)).
Clients 112a-z (e.g., STB, PC) interact (e.g., using Real Time Streaming Protocol (RTSP), HTTP, etc.) with a video delivery system 108a-z by making content requests. A signaling session is set up between the client 112a-z and the video delivery system 108a-z, and requests for content are made by the client 112a-z. This includes, for example, picking titles and channel changing for live content, as well as pause, rewind, fast forward, play, etc., for stored content. Once a video delivery system 108a-z has established a signaling session with a client 112a-z, some video delivery systems 108a-z will provide a proxy function to maintain the session when the requested content is available elsewhere in the network (i.e., a local cache “miss”). In this case, the system 108a-z makes the content appear as though it had originated locally, even though coming from another device in the network. Implementing the proxy function may also require a gateway between different protocol sessions (e.g., fetching content using HTTP and delivering content using UDP). In some cases, a redirect is performed, creating a new session with another system 108a-z containing the desired content.
For stored content delivery, a video delivery system (e.g., 108z) may use a playlist to access the stored content. For example, this might be used for playing a sequence of chapters in a movie. A playlist is any ordered list of content or references to content that can be accessed and played over a given time period. A playlist may be a list of songs, videos, multimedia, or other content. Playlists have long been used for music, e.g., dating back to radio broadcasting and, more recently, for multimedia content that can be played on a PC. Typically, a playlist is constructed with multiple references to one or more pieces of content, and is played from the beginning until the end is reached. Playlists have many formats and are used in many different applications, including computerized media players, e.g., Windows Media Player from Microsoft Corporation of Redmond, Wash., or Adobe Flash Player from Adobe Corporation of San Jose, Calif., among others. Typically, a playlist is executed by retrieving the media data (from local storage, network storage, etc.) and running a program that actually plays the media. Often, the stored media is in a compressed format (e.g., MPEG-2, Advanced Video Coding (AVC), MPEG-1 Audio Layer 3 (MP3), VC-1, etc.) and the program that plays the content decodes the media before presenting it.
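As an illustration of the playlist concept described above, the following is a minimal sketch of an ordered list of content references played back to back. The names `PlaylistEntry`, `play_order`, and the `movie/ch*.ts` locators are hypothetical and are not drawn from any particular playlist format.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass(frozen=True)
class PlaylistEntry:
    """One reference to a piece of content (all fields are illustrative)."""
    content_ref: str   # hypothetical locator for the stored content
    duration_s: float  # how long this entry plays, in seconds


def play_order(entries: List[PlaylistEntry]) -> Iterator[Tuple[float, PlaylistEntry]]:
    """Yield (output start time, entry) pairs, each entry starting where the last ended."""
    start = 0.0
    for entry in entries:
        yield start, entry
        start += entry.duration_s


# Example: three chapters of a movie played in sequence.
chapters = [
    PlaylistEntry("movie/ch1.ts", 600.0),
    PlaylistEntry("movie/ch2.ts", 900.0),
    PlaylistEntry("movie/ch3.ts", 300.0),
]
schedule = list(play_order(chapters))
```

Because each entry begins exactly where the previous one ends, the sketch has no gaps and no overlap in the output timeline, which is the property a seamless playlist needs.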
At times, subscribers may want to switch between different types of content (e.g., from a live broadcast to on-demand programming). Content switching is generally performed between different systems, resulting in new sessions being created at each content switch. This generally includes switching between live and stored content, such as between broadcast television and timeshift TV, as well as to proxy content that is fetched from elsewhere in the network. A video delivery system (e.g., 108a) connected to a client (e.g., 112a) via a signaling session may be capable of supporting live television broadcast delivery as well as stored content delivery (e.g., Video On-Demand (VoD), Timeshift TV, Network Personal Video Recorder (nPVR)). More often, however, different video delivery systems 108a-z are used for different applications, such as one for live content delivery 108a (e.g., fast channel change) and another for stored content delivery 108z. Having different systems 108a-z for each type of content delivery requires creating a new session between the client 112a-z and each video delivery system 108a-z at each content switch. For example, a subscriber at client 112a may be watching live television and then select a time-shifted version of the same program. If the live content is provided by one system (e.g., 108a) and the stored timeshift TV content is provided by another system (e.g., 108z), then a new signaling session is needed between the client 112a and the timeshift system 108z. This process may take several seconds to complete, since one session is taken down and another is created, and it typically involves another application on another system that then directs where the content session requests should be made.
Content switching may also include fetching proxy content that is stored elsewhere in the network 100. In a system 108a-z that is capable of single session proxy, where the proxy content being delivered is consistent with any locally served content, access requests to the proxy content may be included in a playlist. The proxy content may be available in the same protocol as what is being delivered, or the content may be available in a different protocol. If the content is available in the same protocol, a Network Address Translation (NAT)-like function (UDP in and UDP out) may be performed. If the content is available in a different protocol, a more sophisticated gateway function may be needed to convert between protocols (e.g., HTTP to UDP). Typically, the video delivery system 108a-z will conduct this conversion by ingesting the proxy content, storing it locally, and then delivering it to a client 112a-z as stored content. Depending on the caching effectiveness of the system 108a-z, this may require significant ingest and storage bandwidth.
Playlists may be used to control content switching between different content sources, even when the sources have different access latency. For example, a switch between a live broadcast television program, stored content, and proxy content stored elsewhere in the network may be done, although each switch may complete at a different time. This is due to each content type potentially having a different access delay. A seamless content switch requires each content type be available at a precise time so there are no gaps in the resulting output stream and no overlap of content.
Another example of content switching occurs in the context of advertisement (ad) splicing in live broadcast streams. Ad splicing is a common function supported by a video delivery system 108a-z. Spliced advertisements may be stored on the video delivery system 108a-z performing the splice or on an ad server (not shown). When a splice is to be performed, a splice point is often indicated in the live video stream. The system 108a-z performing the splice has a fixed window of time to fetch the ad and perform the splice (e.g., a four-second countdown that is signaled in the live broadcast feed). If the ad is available in time, then a switch is made between the live stream and the ad stream being read from storage. If the ad cannot be fetched quickly enough, then the splice is aborted and the live broadcast continues unchanged, since there is typically a default ad already in the spot. This is possible since the splicer generally has access to both input streams simultaneously, and can choose in real time which to deliver as its output.
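The splice decision described above can be sketched as follows. This is an illustrative model only: the names `Source` and `choose_source_at_splice` are hypothetical, and a fetch that never completes is assumed to be represented by `None`.

```python
from enum import Enum
from typing import Optional


class Source(Enum):
    LIVE = "live"  # keep delivering the live feed (default ad stays in the spot)
    AD = "ad"      # deliver the fetched advertisement instead


def choose_source_at_splice(window_s: float, ad_fetch_s: Optional[float]) -> Source:
    """Pick the output source once the splice point is reached.

    window_s   -- countdown signaled in the live feed, in seconds
    ad_fetch_s -- time the ad fetch took, or None if it never completed
    """
    if ad_fetch_s is not None and ad_fetch_s <= window_s:
        return Source.AD
    # Ad not ready in time: abort the splice and continue the live stream.
    return Source.LIVE


# Example: with a four-second window, a 1.5 s fetch is spliced, a 5 s fetch is not.
spliced = choose_source_at_splice(4.0, 1.5)
aborted = choose_source_at_splice(4.0, 5.0)
```

The fallback to the live stream is only possible here because the splicer is assumed to hold both inputs at once, as the paragraph above notes.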
An input queued system may have many queues with content available for writing to an output buffer. A priority or fairness algorithm can be used to determine which input queue to read from when space becomes available in an output buffer. Generally, content that is part of the same flow and read from different input queues requires some means to reorder the content before delivery, increasing complexity and the amount of bandwidth required at the output buffer. When input queues are used for content switching, controlling the order in which the input queues are accessed is essential to composing an output stream in the output buffer, and it further reduces the output buffer bandwidth and complexity. By ensuring that there are no gaps or overlap in input queue access, the output buffer input bandwidth may be matched to its output bandwidth, and the output buffer may be written in first-in first-out (“FIFO”) order.
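A minimal sketch of ordered input queue access is shown below, assuming the access order is already known as a list of (queue name, packet count) segments. The function name `compose_output` and the packet labels are hypothetical.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def compose_output(queues: Dict[str, Deque[str]],
                   segments: List[Tuple[str, int]]) -> List[str]:
    """Write packets to the output in FIFO order, one input queue at a time.

    Each segment names the queue to read and how many packets to take from it.
    Every packet is written exactly once and never reordered afterwards.
    """
    output: List[str] = []
    for queue_name, packet_count in segments:
        source = queues[queue_name]
        for _ in range(packet_count):
            output.append(source.popleft())  # raises IndexError if the queue runs dry
    return output


# Example: two live packets followed by two ad packets.
input_queues = {
    "live": deque(["L0", "L1", "L2"]),
    "ad": deque(["A0", "A1"]),
}
stream = compose_output(input_queues, [("live", 2), ("ad", 2)])
```

Since exactly one queue is read at any moment, the output is written once at the delivery rate and needs no reordering step, which is the bandwidth saving the paragraph describes.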
Controlling the precise time for input queue selection, when performing content switching, requires a mechanism to synchronize the input queue access with the actual content. For example, a seamless content switch may require two pieces of content be spliced back to back in an MPEG-2 TS compliant manner. This would be needed to avoid display artifacts that might occur from a partially constructed output stream. The input queue selection may be done based on time if the system is precise enough (e.g., switch at 12 o'clock), based on a content data length if this is known beforehand (e.g., switch after 1,000 bytes), or by examining the content stream if a command can be issued at the appropriate time (e.g., look for the next video out-point and then switch). These approaches each have issues, since it is not always known when a content switch must occur, for example, when performing an ad splice.
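The three selection criteria named above (time, data length, and stream inspection) can be sketched as a single predicate. The `SwitchRule` type and its field names are hypothetical, and detection of a video out-point is assumed to be done elsewhere and passed in as a flag.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchRule:
    """Illustrative switch condition; any unset criterion is ignored."""
    at_time_s: Optional[float] = None  # switch at or after this time
    after_bytes: Optional[int] = None  # switch once this many bytes are sent
    at_out_point: bool = False         # switch at the next video out-point


def should_switch(rule: SwitchRule, now_s: float, bytes_sent: int,
                  packet_is_out_point: bool) -> bool:
    """Return True when any configured criterion of the rule is met."""
    if rule.at_time_s is not None and now_s >= rule.at_time_s:
        return True
    if rule.after_bytes is not None and bytes_sent >= rule.after_bytes:
        return True
    return rule.at_out_point and packet_is_out_point


# Example: switch after 1,000 bytes have been delivered.
byte_rule = SwitchRule(after_bytes=1000)
before = should_switch(byte_rule, now_s=0.0, bytes_sent=999, packet_is_out_point=False)
after = should_switch(byte_rule, now_s=0.0, bytes_sent=1000, packet_is_out_point=False)
```

Each criterion presumes the switch point is known when the rule is created, which is exactly the limitation noted for cases such as ad splicing.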
Content switching in a system with variable latency may require a means to handle cases where the content cannot be accessed at the needed time. A system using a playlist and issuing multiple commands in the playlist in parallel has an expectation that the commands will be fulfilled in time. If a command in the playlist cannot be completed because the content is unavailable, then an error may be signaled and an interruption in content delivery may occur. For example, a playlist is constructed ahead of time in the splicing countdown window of a live stream so an advertisement can be pre-fetched and then inserted when the splicing point is reached. Since the playlist is generated ahead of knowing whether the content read will succeed or not, any content switch to content that is unavailable causes reading of the live stream to stop, and no content to be sent to a client. In the case of an advertisement being unavailable for delivery, this may result in one or more freeze frames, as well as display and audio artifacts that are undesirable, until the system recovers. This may take several seconds, especially if the playlist is aborted and a new playlist must be created for new content.
A scalable system must support many content streams in parallel. Each stream may have different attributes and different access patterns. Keeping each stream's content requests separate from those of other streams is essential to scaling and to uniform content delivery (i.e., no gaps, skipped play, etc.).