Multimedia data and video programming are available in both analog and digital format over a variety of delivery services including cable, over-the-air broadcast, and the Internet. In a traditional analog distribution of a video signal, such as a television broadcast, a television set may receive and display the video signal nearly as soon as the tuner acquires the proper broadcast channel. In contrast, in a digital broadcast, a television or Set-Top-Box (STB) decoder may need to wait after a channel change request until a particular reference frame, packet, or header from the newly selected channel is received before displaying any video signal. For the purposes of this disclosure, a video signal is one that contains a representation of a video output including a visual image and may include sound, closed caption text, and/or other associated information. Reproducing or displaying the video output may include displaying the visual image and emitting any corresponding sound information. When a display device such as a TV is connected to a digital network, such as one conforming to the Internet Protocol (IP), the channel change time may be longer than for a TV that is connected to an analog network. In the case of a digital network, the channel change time can be up to several seconds. Such long delays can considerably lower the end user's quality of experience in comparison to their experience using an analog network.
FIG. 1 shows a traditional video delivery system 100 including a media source 102, a Set-Top-Box (STB) 104 that receives and decodes data from media source 102, and a display device 106, such as a television (TV), that receives the decoded data and displays the media content for a user. Media source 102 may include a network, broadcasting, unicasting, or multicasting source comprising a source of video programming. Broadcasting may refer to a transmission that may be received by any receiver on a network, while multicasting may refer to a transmission that may be sent to or received by only members of a multicast group. Video stream data 108 may flow in a first direction, considered a downstream direction, from source 102 to TV 106. Message data 110 may flow in a second direction, opposite to the first direction, to provide interaction with one or more source servers within source 102. A bidirectional connection 112 provides for the exchange of video stream data 108 and message data 110 between source 102 and STB 104. Similarly, another bidirectional connection 114 provides for the exchange of video stream data 108 and message data 110 between STB 104 and TV 106. For a digitally encoded video stream, a channel change delay may occur before STB 104 can decode the new video stream for display on TV 106, for various reasons including acquiring program information, acquiring a reference frame, acquiring encryption information from the network, and/or factors associated with the way the video stream may be encoded. A long channel change delay is one of the impediments in delivering acceptable video over IP networks, and laboratory tests have shown that delays within STB 104 are major contributors to the delay.
FIG. 2 shows a diagrammatic view of an exemplary channel change transition 200 for a traditional STB decoder. In reference to FIGS. 1 and 2, a traditional STB 104 is initially receiving a Cached Channel-A stream 202 and producing a current decoder output 204. Cached Channel-A stream 202 is a buffered video stream corresponding to a first channel and including a sequence of video data groups (206, 208) providing encoded data for display on TV 106 corresponding to a video program on Channel-A such as a movie, a commercial, or a video slide-show that may include moving pictures, static pictures, and/or sound. Similarly, Cached Channel-B stream 210 is a buffered video stream corresponding to a second channel and including a sequence of video data groups (212, 214) providing encoded data for display on TV 106 corresponding to a second video program on Channel-B such as a movie, a commercial, or a video slide-show that may include moving pictures, static pictures, and/or sound. Since STB 104 is currently decoding Channel-A stream 202, current decoder output 204 produces a video data output signal 216 corresponding to the data content of Channel-A stream 202. Typically, the groups of Channel-A stream 202 and Channel-B stream 210 may be asynchronous to each other.
In one type of system, media data in a channel group may include an independent reference data frame followed by a sequence of dependent data frames carried on a media data channel. For a typical STB to properly decode data in the group, the independent data frame must be decoded before the associated dependent data frames may be decoded. An independent data frame may be decoded and displayed on its own, whereas the subsequent dependent data frames must rely on a previously received independent data frame. If the independent data frame is positioned near the head of a group, then an STB must wait for the beginning of the next group received after the channel transition in order to decode data from the new channel. Since a traditional STB will typically include only one channel decoder to minimize cost, STB 104 may decode only one channel at a time.
Referring to FIGS. 1 and 2, if STB 104 is currently decoding Channel-A stream 202 and then receives a command at time T1 220 to make a channel change transition 222 to decoding Channel-B stream 210, decoder output 204 may transition to a blank output signal 224 or no-program output until an independent data frame is received from the newly selected channel. At a time T2 226, Channel-B data group 214 arrives from Channel-B stream 210 and is decoded. Once the first independent data frame from Channel-B group 214 is decoded, STB current decoder output 204 will change to a Channel-B output signal 228 that corresponds to the media data carried by the first independent data frame from Channel-B group 214. In this manner, the undesirable blank or no-program output signal from STB 104 may occur during a time delay 230 as STB 104 waits to receive and decode the first independent data frame from Channel-B group 214.
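The waiting period described above can be illustrated with a minimal Python sketch. It assumes that the arrival times of independent data frames on the newly selected channel are known and sorted, and it ignores decode time; the function name and the sample times are hypothetical and are not taken from the figures.

```python
from bisect import bisect_left


def channel_change_delay(request_time, independent_frame_times):
    """Return the wait, in seconds, from a channel change request until the
    next independent data frame arrives on the newly selected channel.

    independent_frame_times must be sorted in ascending order.
    Returns None if no independent frame arrives at or after the request.
    """
    # Index of the first independent frame arriving at or after the request.
    idx = bisect_left(independent_frame_times, request_time)
    if idx == len(independent_frame_times):
        return None
    return independent_frame_times[idx] - request_time


# Hypothetical Channel-B stream with an independent frame every 2 seconds.
channel_b_arrivals = [0.0, 2.0, 4.0, 6.0]

# A request at 2.5 s must wait for the independent frame arriving at 4.0 s.
print(channel_change_delay(2.5, channel_b_arrivals))  # 1.5
```

In this simplified model the blank or no-program interval corresponds to the returned value; a real decoder would add buffering and decode time on top of it.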
FIG. 3 shows an exemplary decoder transition delay time chart 300 for a traditional video delivery system. In this example, a decoder would incur a best-case delay time ΔT1 302 when transitioning at a transition time 304 to a best channel-b 306 having a first independent data frame 308 arrive at a best arrival time 310. Similarly, a decoder would incur an average-case delay time ΔT2 312 when transitioning at a transition time 304 to an average channel-b 314 having a first independent data frame 316 arrive at an average arrival time 318. Finally, a decoder would incur a worst-case delay time ΔT3 320 when transitioning at a transition time 304 to a worst channel-b 322 having a first independent data frame 324 arrive at a worst arrival time 326. Some prior attempts to address the blank or no-program output send a copy of an original reference frame or a dummy reference frame to the STB first in order to speed up the channel change. However, one problem with sending the original I-frame to the STB is that there is a catch-up time between the cached stream and the original stream. The catch-up time could be up to an entire Group of Pictures (GOP) or frame-set time. To catch up with the original stream, the network would need to burst the cached stream. This bursting requires additional bandwidth in the “last mile” connection to the user STB, where access bandwidth is typically constrained, as in a Digital Subscriber Line (DSL) network.
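The bandwidth cost of bursting can be estimated with a rough Python sketch. It assumes a constant-bit-rate stream and a simple model that is not stated in the prior approaches themselves: if the cached stream starts a given lag behind the original stream and must catch up within a chosen window, then the window plus the lag worth of content must be delivered during the window. The function name, the rates, and the durations are illustrative assumptions.

```python
def burst_rate(stream_rate_mbps, lag_seconds, catch_up_seconds):
    """Estimate the delivery rate needed for a cached stream to catch up
    with the original stream.

    Assumed model: (catch_up_seconds + lag_seconds) of content must be sent
    within catch_up_seconds of wall-clock time.
    """
    if catch_up_seconds <= 0:
        raise ValueError("catch_up_seconds must be positive")
    return stream_rate_mbps * (catch_up_seconds + lag_seconds) / catch_up_seconds


# Hypothetical 4 Mbps stream that starts 1 s behind and catches up in 4 s.
print(burst_rate(4.0, 1.0, 4.0))  # 5.0
```

Under this model a shorter catch-up window or a longer lag (up to a full GOP time) raises the required burst rate, which is the constraint that matters on a bandwidth-limited access link.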
The video stream can include video programming delivered according to a current or future video standard document, such as one of the family of standards promulgated by the Moving Picture Experts Group (MPEG), including MPEG-1, MPEG-2, and MPEG-4, or a standard promulgated by the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), and/or the International Telecommunication Union (ITU). In some cases, a particular standard may be published or reprinted by another standards body. For example, the publication ITU-T H.222.0 states, in pertinent part on page i, “The ITU-T Recommendation H.222.0 was approved on 27 May 1999. The identical text is also published as ISO/IEC International Standard 13818-1.” Further, the publication International Standard ISO/IEC 13818-2 states, in pertinent part on page v, “International Standard ISO/IEC 13818-2 was prepared by Joint Technical Committee ISO/IEC JTC 1, Information technology, Subcommittee SC 29, Coding of audio, picture, multimedia and hypermedia information, in collaboration with ITU-T. The identical text is published as ITU-T Rec. H.262.” Therefore, the referenced documents ITU-T Rec. H.222.0 and ISO/IEC 13818-1 are identical to each other, and the referenced documents ITU-T Rec. H.262 and ISO/IEC 13818-2 are identical to each other. The described standard reference documents H.222.0 and H.262 are hereby incorporated herein by reference.
Video streams typically include a sequence of frames each having a particular type, and typically including two or more pictures per frame. The frame types may include independent or Intra-coded frames (I-frames), Predictive-coded frames (P-frames), and Bidirectionally predictive-coded frames (B-frames). An I-frame is coded using information only from itself, and is the only frame type that contains enough information for a decoder to reconstruct a complete image. A P-frame is one where the pictures are coded using motion compensated prediction from a past reference frame or past reference field. A B-frame is one where the pictures are coded using motion predicted from either a past or a future reference frame. For both a P-frame and a B-frame, another frame of reference is needed to construct a complete and current decoded image. Hence, an I-frame is considered an independent frame, while P/B-frames are considered dependent frames.
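The distinction between independent and dependent frames can be sketched in Python as follows. This is a simplified illustration, not a decoder: it assumes that every dependent frame received after the first I-frame has its reference frames available, which holds for closed groups but not necessarily for B-frames that reference a preceding group. The function name and the sample sequence are hypothetical.

```python
def split_on_join(frame_types):
    """Split the frames received after tuning to a stream into those that
    must be discarded and those that can be decoded.

    frame_types is a sequence of 'I', 'P', or 'B' in arrival order.
    Frames before the first I-frame are discarded because their reference
    frames were never received.
    """
    for index, frame_type in enumerate(frame_types):
        if frame_type == "I":
            return list(frame_types[:index]), list(frame_types[index:])
    # No independent frame received yet: nothing can be decoded.
    return list(frame_types), []


# Hypothetical sequence received after joining mid-group.
discarded, decodable = split_on_join("BBPBBIBBP")
print(discarded)  # ['B', 'B', 'P', 'B', 'B']
print(decodable)  # ['I', 'B', 'B', 'P']
```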
In reference briefly to FIG. 1, to change the received video stream to a different stream, a user may operate TV 106 to select a new channel. In this example, STB 104 may receive a change request message from a user operating an STB remote control (e.g. an infrared remote) or from TV 106 via link 114. STB 104 processes the request and may send an Internet Group Management Protocol (IGMP) leave request in upstream direction 110 to the old multicast source in order to unsubscribe STB 104 from membership in the old multicast group associated with the previously received video stream. IGMP is defined in an Internet Engineering Task Force (IETF) document Request For Comments (RFC) 1112, commonly referred to as IETF-RFC 1112. Once the IGMP leave message is sent, an IGMP join message is sent to the associated new multicast source corresponding to the newly selected video channel in order to subscribe STB 104 for membership in the new multicast group. Once the IGMP join message is sent, STB 104 is reset and waits for the video traffic associated with the newly selected video channel to arrive. STB 104 buffers the video data from the newly selected video channel and then decodes and displays the newly selected video stream by sending the decoded video stream to TV 106. In this manner, the currently decoded video channel may be changed in a traditional digital video delivery network. According to the MPEG-2 standard, a video stream includes a sequence of frames that are encoded to provide significant compression. When switching channels, the first I-frame arrival time from the newly selected channel may be considered a random event since the previously selected channel and the newly selected channel may be asynchronous to each other.
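On a host with a standard IP stack, the leave and join steps described above can be sketched with Python's socket module. The sketch assumes IPv4 multicast and relies on the operating system to emit the corresponding IGMP messages when group membership changes; the function names and the group addresses are hypothetical, and the sketch omits the buffering and decoding steps.

```python
import socket


def build_mreq(group_ip, interface_ip="0.0.0.0"):
    """Pack the membership request used by the multicast socket options:
    the 4-byte group address followed by the 4-byte local interface address.
    """
    return socket.inet_aton(group_ip) + socket.inet_aton(interface_ip)


def change_channel(sock, old_group, new_group):
    """Leave the multicast group of the old channel, then join the group of
    the newly selected channel on an existing UDP socket.
    """
    # Dropping membership prompts the host stack to signal a leave.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_DROP_MEMBERSHIP,
                    build_mreq(old_group))
    # Adding membership prompts the host stack to signal a join.
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    build_mreq(new_group))


# Hypothetical group address for a channel; no network traffic is sent here.
print(build_mreq("239.1.1.1"))
```

A caller would typically create a UDP socket bound to the stream port, join the first channel's group, and then invoke change_channel with the old and new group addresses on each channel change request.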
The delay before the arrival of the next independent frame for a newly selected channel can be substantial, and may contribute to a poor user experience for a subscriber to an Internet Protocol Television (IPTV) network. For example, the delay can be from about 0.5 to 5 seconds, depending on the encoding, and the interval between independent frames may be fixed or variable. Accordingly, there is a need in the art for a method and system for reducing the channel selection transition delay in a digital network.
Embodiments of the present invention and their advantages are best understood by referring to the detailed description that follows. It should be appreciated that like reference numerals are used to identify like elements illustrated in the figures.