Video that is transmitted in the radio frequency (RF) spectrum (e.g., distributed from a cable head-end) may use Quadrature Amplitude Modulation (QAM). Terrestrial broadcasts may instead use 8-Vestigial Sideband Modulation (8VSB) (e.g., Advanced Television Systems Committee (ATSC) over-the-air broadcasts in the US) or Coded Orthogonal Frequency Division Multiplexing (COFDM) in Europe. Each of these schemes converts the digital video into a modulated RF signal that is up-converted and transmitted in the analog RF spectrum. For example, 256-QAM has a symbol rate of approximately 5.1 million symbols per second in a 6 MHz channel, with each symbol capable of representing 8 bits of information. This means a 256-QAM channel is able to transmit approximately 40 Mbps of digital data within the 6 MHz RF channel (note that Europe uses 8 MHz channels). After accounting for the overhead of error protection methods, such as Forward Error Correction (FEC), this translates into roughly ten (10) 3.75 Mbps digitally compressed video programs that fit within the 6 MHz channel bandwidth, instead of just a single analog program. The modulated digital video is formatted using MPEG-2 (for Moving Picture Experts Group) Transport Streams (MPEG-2 TS), and each home's television or set top box (STB) that is capable of receiving the transmission tunes to a particular RF channel to decode a program.
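The channel capacity arithmetic above can be sketched as follows. This is a minimal illustration using only the approximate figures quoted in the text (5.1 million symbols per second, 8 bits per symbol, 3.75 Mbps per program); the function names are hypothetical, and the overhead figure is simply what those numbers imply, not a value taken from any modulation standard.

```python
def qam_raw_bit_rate(symbol_rate_msym: float, constellation_size: int) -> float:
    """Raw bit rate in Mbps: symbol rate (Msym/s) times bits per symbol."""
    # For a power-of-two constellation, bits per symbol is log2(size).
    bits_per_symbol = constellation_size.bit_length() - 1
    return symbol_rate_msym * bits_per_symbol


def programs_per_channel(payload_mbps: float, program_mbps: float) -> int:
    """Whole number of fixed-rate programs that fit in the usable payload."""
    return int(payload_mbps // program_mbps)


raw_mbps = qam_raw_bit_rate(5.1, 256)      # 5.1 * 8 = about 40.8 Mbps
video_mbps = 10 * 3.75                     # ten programs = 37.5 Mbps
overhead_mbps = raw_mbps - video_mbps      # about 3.3 Mbps left for FEC and other overhead

print(f"raw={raw_mbps:.1f} Mbps, video={video_mbps:.1f} Mbps, "
      f"overhead={overhead_mbps:.1f} Mbps")
print(programs_per_channel(video_mbps, 3.75))  # 10
```

With these figures, roughly 8% of the raw channel rate is left over for error protection and other overhead.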
The RF spectrum limits the number of unique 6 MHz channels that are available for transmitting digital data and video. This in turn limits the number of unique video programs that can be broadcast (or transmitted) to homes at any one time, especially for homes on the same Hybrid Fiber Coaxial (HFC) network in a cable deployment, since they all share the same RF spectrum. The limitation also applies to homes sharing the same passive optical network (PON, such as a gigabit PON, or GPON) in a Telco deployment (typically a single wavelength is used for an RF overlay). Finding ways to reclaim the analog television RF spectrum is a high priority for cable providers. This means looking at Switched Digital Video (SDV) approaches to selectively provide more digital content, as well as an effort to move toward an Internet Protocol Television (IPTV) infrastructure where the video is transported using IP data networks (e.g., Data Over Cable Service Interface Specification (DOCSIS)). In a cable deployment, QAM can still be used for transmitting digital data (i.e., DOCSIS), while in a Telco deployment Very high bit-rate Digital Subscriber Line (VDSL) or PON (such as B/GPON) may be used. Each solution transports Ethernet frames and provides access to an IP network.
In IPTV networks, video data is typically wrapped in a transport-layer protocol (e.g., Real-time Transport Protocol (RTP), Transmission Control Protocol (TCP), User Datagram Protocol (UDP)) and then multicast or unicast across the network. An IP packet would generally contain up to seven (7) 188-byte MPEG-2 TS packets, or some number up to the Maximum Transmission Unit (MTU) of the network. IP multicast is common for distributing live content broadcast channels that are viewed or consumed in real-time by multiple subscribers, whereas IP unicast is used for per-subscriber delivery. Each broadcast channel has a unique IP multicast address that devices (e.g., an STB) “join” (e.g., using the Internet Group Management Protocol (IGMP)) in order to access a channel or program. Per-subscriber delivery uses a separate IP unicast address for each device. This allows viewing of content by a single device, such as for accessing on-demand content and for personalized broadcast video (e.g., with per-subscriber advertisement placements).
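The figure of seven TS packets per IP packet follows from the MTU. The sketch below assumes a 1500-byte Ethernet MTU, a 20-byte IPv4 header without options, an 8-byte UDP header, and a 12-byte fixed RTP header; these defaults are illustrative assumptions, and the function name is hypothetical.

```python
TS_PACKET_SIZE = 188  # bytes per MPEG-2 TS packet


def ts_packets_per_datagram(mtu: int = 1500,
                            ip_header: int = 20,
                            udp_header: int = 8,
                            rtp_header: int = 12) -> int:
    """Number of whole TS packets that fit in one unfragmented IP packet."""
    payload = mtu - ip_header - udp_header - rtp_header
    return payload // TS_PACKET_SIZE


print(ts_packets_per_datagram())               # 7 (RTP over UDP)
print(ts_packets_per_datagram(rtp_header=0))   # 7 (plain UDP)
print(7 * TS_PACKET_SIZE)                      # 1316 bytes of TS payload
```

An eighth packet would bring the payload to 1504 bytes, which already exceeds the assumed MTU before any headers are added.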
An MPEG-2 TS is typically used for transporting digital live content since an MPEG-2 TS includes timing information (e.g., a Program Clock Reference (PCR)) that creates a synchronous relationship between a real-time encoder and decoder (e.g., STB). When the content (e.g., a television program) is fed into an MPEG encoder, the process produces a Single Program Transport Stream (SPTS) containing audio and video data. The SPTS is composed of Packetized Elementary Streams (PES) containing separate audio and video Elementary Streams (ES). A video ES may be coded as MPEG-2 or H.264/AVC (for Advanced Video Coding), depending on the requirements of the service provider. Typically, one or more audio PES are included in an SPTS (e.g., multiple audio PES, each for a particular language) along with a video PES. Data may also be carried in the MPEG-2 TS, such as Program Specific Information (PSI), Electronic Program Guide (EPG) data, and data for advanced services.
Once the SPTS is created at the encoder, the SPTS may optionally be fed into a multiplexer, which accepts multiple SPTSs and creates a multiplex referred to as a Multi Program Transport Stream (MPTS). When an MPTS is received by a device, the device reads the PSI and the Packet Identifier (PID) in each 188-byte MPEG-2 TS packet to demultiplex the stream. The PSI associates each program's content with one or more PIDs, which the device uses to selectively extract from the MPTS the audio and video PES that the device displays (e.g., on a TV) or stores (e.g., on a DVR).
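PID-based demultiplexing can be sketched as follows. This minimal example reads only the 4-byte TS packet header (sync byte 0x47, then a 13-bit PID spanning the second and third bytes) and assumes the wanted PIDs are already known; parsing the PSI tables to discover them is omitted. The helper `make_packet` is hypothetical and exists only to build test input.

```python
TS_PACKET_SIZE = 188
SYNC_BYTE = 0x47


def packet_pid(packet: bytes) -> int:
    """Return the 13-bit PID from a single 188-byte TS packet."""
    if len(packet) != TS_PACKET_SIZE or packet[0] != SYNC_BYTE:
        raise ValueError("not a valid MPEG-2 TS packet")
    return ((packet[1] & 0x1F) << 8) | packet[2]


def demux(stream: bytes, wanted_pids):
    """Group the packets of a packet-aligned stream by PID, keeping only wanted PIDs."""
    selected = {pid: [] for pid in wanted_pids}
    for offset in range(0, len(stream) - TS_PACKET_SIZE + 1, TS_PACKET_SIZE):
        packet = stream[offset:offset + TS_PACKET_SIZE]
        pid = packet_pid(packet)
        if pid in selected:
            selected[pid].append(packet)
    return selected


def make_packet(pid: int) -> bytes:
    """Hypothetical helper: build a dummy TS packet with the given PID."""
    header = bytes([SYNC_BYTE, (pid >> 8) & 0x1F, pid & 0xFF, 0x10])
    return header + bytes(TS_PACKET_SIZE - len(header))


# A toy multiplex carrying two programs' video on PIDs 0x100 and 0x200.
mpts = make_packet(0x100) + make_packet(0x200) + make_packet(0x100)
video = demux(mpts, {0x100})
print(len(video[0x100]))  # 2
```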
The SPTS and MPTS may be transported as Constant Bit Rate (CBR) or as Variable Bit Rate (VBR) depending on the requirements of the distribution network (e.g., an Edge Quadrature Amplitude Modulation (EQAM) device in a Multiple System Operator (MSO) network may require CBR) and the device decoder. For CBR delivery, MPEG-2 TS “Null” packets may need to be added to the data stream in order to maintain a constant bit rate. An MPTS multiplexer may also need to reduce the bit rate of one or more SPTS streams (i.e., clamping or trans-rating) when the combined instantaneous rate of the streams exceeds the target transport capacity (e.g., QAM, GbE, etc.). This operation is performed at the Elementary Stream (ES) level, and involves modifying Discrete Cosine Transform (DCT) coefficients and variable length codes, removing coded Blocks, skipping Macroblocks, etc. Processing at the ES level is generally considered an expensive operation, to be performed only as necessary.
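Null-packet stuffing for CBR delivery can be sketched as follows. The example assumes the standard null packet PID of 0x1FFF and pads a batch of packets up to the number of 188-byte slots that a target rate provides in a fixed interval. Appending the null packets at the end of the interval is a simplification for illustration; a real multiplexer would interleave them to preserve timing. The function names are hypothetical.

```python
TS_PACKET_SIZE = 188
# Null packet: sync byte, PID 0x1FFF, payload only, filler bytes.
NULL_PACKET = bytes([0x47, 0x1F, 0xFF, 0x10]) + b"\xff" * (TS_PACKET_SIZE - 4)


def null_packets_needed(content_packets: int, target_bps: int,
                        interval_s: float) -> int:
    """Null packets required to fill the interval at the target bit rate."""
    slots = int(target_bps * interval_s) // (TS_PACKET_SIZE * 8)
    if content_packets > slots:
        # Stuffing cannot help here; the stream would have to be trans-rated.
        raise ValueError("content exceeds the target rate for this interval")
    return slots - content_packets


def pad_to_cbr(packets, target_bps: int, interval_s: float):
    """Return the packets followed by enough null packets to reach the CBR rate."""
    padding = null_packets_needed(len(packets), target_bps, interval_s)
    return list(packets) + [NULL_PACKET] * padding


# A 3.75 Mbps stream has 2493 packet slots per second; 2000 content packets
# therefore need 493 null packets.
print(null_packets_needed(2000, 3_750_000, 1.0))  # 493
```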
The coding standards for Digital Television involve inter-picture coding (ISO MPEG-2 and H.264/AVC, Microsoft VC-1, etc.) for higher coding efficiency, typically requiring one or more video pictures (interchangeably called frames) to decode many of the other pictures. The transmission order and the display order of an inter-picture coded Group of Pictures (GOP) are generally different, since B-pictures are bi-directionally predicted and may depend on a future picture (possibly in the next GOP). For example, the first two (2) B-pictures of a GOP may be backward predicted from the first I-picture (e.g., no forward prediction from a prior GOP). Such a GOP ends at the last P-picture, with no references made to that P-picture from a future GOP. The first I-picture needs to be sent and decoded first so that the I-picture can be used to decode the B-pictures that precede it in display order. Typically, I-pictures tend to be larger than P-pictures, and P-pictures tend to be larger than B-pictures. If all of the SPTS streams are aligned at delivery (e.g., all the I-pictures are sent at the same time), then the bandwidth that is allocated to the transmission medium needs to be high enough to support the peak rate of all the I-pictures.
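The difference between display order and transmission order can be sketched as follows. This simplified reordering assumes a closed GOP in which each run of B-pictures depends on the next I- or P-picture in display order, so that anchor picture is sent ahead of the B-pictures that reference it. The picture labels and the function name are illustrative only.

```python
def transmission_order(display_order):
    """Reorder pictures so each anchor (I or P) precedes the B-pictures
    that are displayed before it but predicted from it."""
    sent = []
    held_b = []
    for picture in display_order:
        if picture.startswith("B"):
            # Hold B-pictures until the anchor they depend on has been sent.
            held_b.append(picture)
        else:
            sent.append(picture)
            sent.extend(held_b)
            held_b = []
    # In a closed GOP that ends on a P-picture, nothing should remain held.
    sent.extend(held_b)
    return sent


gop = ["B0", "B1", "I2", "B3", "B4", "P5"]
print(transmission_order(gop))  # ['I2', 'B0', 'B1', 'P5', 'B3', 'B4']
```

The I-picture, which is usually the largest picture in the GOP, ends up first on the wire even though it is displayed third.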
In IPTV deployments, often only the selected video (e.g., video that a subscriber has selected to watch) is sent over an IP data network (e.g., DOCSIS, VDSL, GPON, Ethernet, etc.) to the device associated with the subscriber (e.g., the STB), making it possible to customize each viewing experience. This applies to both stored content (e.g., on-demand and timeshift TV) as well as live content broadcasts. In the case of live content broadcasts, multiple viewers may be watching the same program, but advertisements at commercial breaks may be customized based on, for example, geography and the learned preferences of each subscriber (e.g., per-subscriber advertising).
Live content broadcast video is synchronous by nature, since programs tend to start and stop at predetermined times (e.g., time according to the wall clock). For example, networks (ABC, CBS, NBC, etc.) satellite broadcast programs to local affiliates, which then rebroadcast the content. Channel changing is often clustered at the start and end of programming, since people tend to look for “what else is on,” or what content is on other channels. Advertisements are generally placed at predetermined times, even across multiple television channels (e.g., at the beginning and the end of programs and at fixed intervals within the programs). There is also the occasional unexpected content (e.g., breaking news) or unusual event that occurs (e.g., an event not typically shown again by a network because that event is “not appropriate”) that may also cause a large number of subscribers to simultaneously perform a Timeshift TV rewind. All these events, some separate and some combined, tend to create sharp spikes in subscriber activity at nearly the same time due to the synchronous nature of time driven broadcast.
IPTV networks supporting a unicast model of delivery observe the high correlation between program and time by receiving signaling events from each subscriber (e.g., from an STB using Real Time Streaming Protocol (RTSP)) at nearly the same time. These events may occur, for example, due to a channel change (e.g., channel changing at the top and bottom of the hour when channel surfing, looking for what else is on, primetime programming, etc.). The events may also occur during ad splicing (e.g., at the start and end of a program, at predetermined intervals within a program, etc.). The events may also occur during a focused rewind (e.g., something unusual or unexpected happens on popular content and everyone wants to see or hear the event, such as in Timeshift TV when everyone rewinds at the same time, etc.).
Stored content selection can be naturally distributed over a larger time frame (e.g., due to different STB request times, since each user may request the stored content for viewing at unrelated times based on personal preference), and content access is generally more tolerant of delivery delay. However, in a live content unicast delivery system, where each subscriber receives a separate video stream (e.g., to their STB), whether for the same or a different channel, the signaling requests from the subscribers (e.g., changing channels) tend to be clustered in a very small time window as a result of the natural alignment of program delivery. The video delivery system is burdened with receiving and processing all of the requests in a very short window of time. This sharp peak in the request rate will often subside once subscribers find the programming they desire, and the signaling requests will fall to a relatively low number. This makes the peak-to-average ratio of requests very high, and may result in a need for the video delivery system to be sized to handle the “peak” signaling events, or may increase the video delivery system's processing delay (e.g., a channel change may take a very long time during peak signaling events, but not during average signaling events).
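The peak-to-average ratio of the request rate can be sketched as follows. The per-second request counts below are invented purely to illustrate the shape of the problem (a quiet baseline with one burst at a program boundary); they are not measurements, and the function name is hypothetical.

```python
def peak_to_average(requests_per_second):
    """Ratio of the busiest second to the mean request rate."""
    if not requests_per_second:
        raise ValueError("no samples")
    average = sum(requests_per_second) / len(requests_per_second)
    if average == 0:
        raise ValueError("average request rate is zero")
    return max(requests_per_second) / average


# Hypothetical trace: nine quiet seconds, then a burst at the top of the hour.
trace = [10] * 9 + [910]
print(peak_to_average(trace))  # 9.1
```

A system sized for the average rate of 100 requests per second in this made-up trace would face roughly nine times that load during the burst, which is the sizing trade-off described above.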
Fast channel change performance can be especially important in a unicast video delivery system, since the request to change channels is sent back to the video delivery system rather than being handled at the STB (e.g., by tuning to a different QAM RF channel). The video delivery system's ability to respond quickly is essential to the utility of the solution. If the number of channel change requests becomes too high during the same period of time, the signaling control may not be able to respond quickly enough to all requests.
Per-subscriber ad splicing in live content broadcast streams is similarly exposed to the synchronous nature of television, since ad placements tend to be located at the start and end of programs and at fixed intervals therein. This is generally also true across different television channels, resulting in simultaneous access to stored advertisements that may or may not reside on the same video delivery system (e.g., each ad placement contends for access to the same storage device). If the number of simultaneous requests to storage exceeds the storage bandwidth capacity, the result may be a video underflow or a missed ad placement. Further, there is a limited time window for determining an ad placement (e.g., a four second countdown that is signaled in a live content broadcast feed), which compounds the control plane problem when all ad placements coincide in time.
Timeshift TV is an application where a live content program is broadcast in real-time and is also simultaneously stored. This allows a subscriber to “pause” live TV, rewind, and fast forward to “catch up” to live. When many subscribers are watching the same live content program and something happens that causes everyone to rewind, this creates a problem similar to ad splicing, in that there are many highly correlated requests to access the storage system (to see something in the past that is stored). This can overwhelm both the control plane, as in channel changing, and the storage system (random access to multiple places in a file).