1. Technical Field
The disclosed invention relates to program navigation in compressed digital video delivery systems such as cable television (CATV), satellite television, Internet protocol television (IPTV), and Internet-based video distribution systems. In particular, it relates to the use of a low-delay, layered codec and the corresponding low-delay transport, typically used for videoconferencing systems, in connection with digital video delivery systems to enhance the navigation capabilities and user interface of Electronic Programming Guides (EPGs)/Interactive Programming Guides (IPGs) by showing audio-visual content of the actual program in addition to meta-information about the audio-visual content, such as, for example, the program's title, description, genre, channel, or time.
2. Background Art
Subject matter related to the present application can be found in co-pending U.S. patent application Ser. Nos. 12/765,815, filed Apr. 22, 2010 and entitled “An Efficient Video Skimmer”; 12/765,793, filed Apr. 22, 2010 and entitled “Systems, Methods and Computer Readable Media for Instant Multi-Channel Video Content Browsing in Digital Video Distribution Systems”; 12/765,767, filed Apr. 22, 2010 and entitled “Systems, Methods and Computer Readable Media for Instant Multi-Channel Video Content Browsing in Digital Video Distribution Systems”; 61/264,466, filed Nov. 25, 2009 and entitled “IPTV Presence And Interaction Protocol”; 61/289,249, filed Dec. 22, 2009 and entitled “System And Method For Interactive Synchronized Video Watching”; 12/015,956, filed Jan. 17, 2008 and entitled “System and Method for Scalable and Low-Delay Videoconferencing Using Scalable Video Coding”; 11/608,776, filed Dec. 8, 2006 and entitled “Systems and Methods for Error Resilience and Random Access in Video Communication Systems”; and 11/682,263, filed Mar. 5, 2007 and entitled “System and Method for Providing Error Resilience, Random Access and Rate Control in Scalable Video Communications”; and in U.S. Pat. No. 7,593,032, filed Jan. 17, 2008 and entitled “System and Method for a Conference Server Architecture for Low Delay and Distributed Conferencing Applications”. All of the aforementioned related applications and patents are hereby incorporated by reference herein in their entireties.
An Electronic Programming Guide (EPG), alternatively known as an Interactive Programming Guide (IPG) or Electronic Service Guide (ESG), is an application that can be used, for example, in connection with CATV, satellite television, digital video recorders, and TV set-top boxes to view a list of the current and scheduled programs that are or will be available on each TV channel. An EPG can be viewed as an electronic equivalent of a traditional print or paper TV guide.
An EPG is displayed on a video display, for example, a TV or computer monitor, and can be controlled through an input device, such as a TV remote control, computer mouse, keyboard or other input device. Menus are provided to allow the TV user, alternatively known as a TV viewer, to navigate, select, and discover data about video content (alternatively known as video programs/programming, programs, or audio-visual content) including, for example, the program's channel, time, title or genre, and even to view a list of programs scheduled for the next few hours or the next several days. A typical EPG can include options to, for example, set parental controls, order pay-per-view programs, search for programs based on theme or category, or set recordings for the future when combined with a recording system. An EPG can be used for scheduled broadcast TV as well as Pay per View (PPV) and Video on Demand (VoD) services.
By navigating through an EPG, users can see more information about the current programs on TV channels and about future and past programs. When EPGs are connected to a recording system, such as personal video recorders (PVRs), they enable a viewer to plan his or her viewing and record broadcast programs to a hard disk or network based storage for later viewing.
The on-screen information presented by the EPG can be delivered by a dedicated channel or assembled by the receiving equipment from information sent along with content by each individual program channel.
In the analog domain, EPG standards were developed by the European Telecommunications Standards Institute (ETSI) using a protocol infrastructure based on Teletext (ETS 300 707/708). In traditional Digital Video Broadcast (DVB), EPG information supports programming metadata as defined in the Specification for Service Information in DVB Systems (ETSI ETS 300 468) as a pure push service. The service discovery, selection and description for DVB services over IP can be a push or a pull service. Recently, ETSI has been actively developing IPTV (DVB-IPTV) standards which also include metadata definitions for a broadband content guide (DVB-BCG, ETSI TS 102 539).
An EPG can include a graphical user interface (GUI), which enables the display of channel names and program start times and titles, as well as additional descriptive information such as a program's synopsis, actors, directors, year of production, etc., which is particularly useful in VoD and PPV services.
EPG information can be displayed as a timetable that displays program titles on a two-dimensional grid, where the two axes represent channel name and time (e.g., hour/minute). The EPG can provide the user with the following exemplary options:
(1) select more information on each program by clicking on the program title using an input device,
(2) directly switch to a program on a channel,
(3) switch to a future program when it starts, or
(4) set a recorder to record the selected program.
The user can scroll up and down within the EPG to see information about programs available on additional channels (assuming channel is the vertical axis) or can scroll sideways to see information about programs available at different periods of time (assuming time is the horizontal axis). The EPG can allow the viewer to browse program summaries, search by genre or channel, and have immediate access to the selected program. More advanced EPGs can include alphanumeric search capabilities.
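The two-dimensional timetable described above can be illustrated with a minimal sketch, assuming channel is the vertical axis and time is the horizontal axis. The `Program` record, the `build_grid` helper, the 30-minute slot length, and the sample listings are hypothetical names and values chosen for illustration only; they are not part of any EPG standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Program:
    title: str
    channel: str
    start_min: int  # start time, in minutes since midnight
    end_min: int    # end time (exclusive), in minutes since midnight


def build_grid(programs, channels, slot_starts, slot_len=30):
    """Return one row per channel, with one title cell per time slot."""
    grid = []
    for channel in channels:
        row = []
        for slot in slot_starts:
            # A program fills a cell if it overlaps [slot, slot + slot_len).
            title = next(
                (p.title for p in programs
                 if p.channel == channel
                 and p.start_min < slot + slot_len
                 and p.end_min > slot),
                "",  # empty cell when nothing is scheduled
            )
            row.append(title)
        grid.append(row)
    return grid


# Hypothetical listings for two channels between 18:00 and 19:30.
programs = [
    Program("News", "CH1", 18 * 60, 19 * 60),
    Program("Movie", "CH1", 19 * 60, 21 * 60),
    Program("Sports", "CH2", 18 * 60 + 30, 20 * 60),
]
grid = build_grid(programs, ["CH1", "CH2"], [18 * 60, 18 * 60 + 30, 19 * 60])
for channel, row in zip(["CH1", "CH2"], grid):
    print(channel, row)
```

In this sketch, scrolling vertically corresponds to passing a different window of `channels`, and scrolling sideways corresponds to passing a different window of `slot_starts`.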
The EPG for VoD or PPV services can have a GUI that is different than the two dimensional grid used to display information for broadcast channels, such as classification by type or genre (e.g., comedy, drama, action, series, etc.), or alphabetical ordering of the available video content by title.
The latest evolution in EPGs is a personalized EPG, which uses semantics to recommend video programs to one or more users based on their interests. A personal EPG can allow a user to use or create custom skins (like a desktop image on a personal computer) and can learn what a user likes to watch based on previous choices and relevant feedback collected from the user. It can also facilitate the recording of these programs so that the user no longer has to depend on a broadcaster's time schedule, but can watch a program at the time of the user's choice, a practice known as time shifting.
EPG data can be sent within the broadcast transport stream, or alongside in a special data channel. Many EPG systems, however, rely upon third party “metadata aggregators” to provide good quality data content. Newer media centers (e.g., personal computer based multi-channel TV recorders) and DVRs can use the Internet to provide a feed for the EPG data. This enables two-way interactivity such that the user can request media download via the EPG or a related link.
EPGs are widely used in satellite and CATV systems to guide users to TV channels without the use of paper. EPGs have become more sophisticated by providing significant amounts of metadata and menu options to display different levels of information about video content (e.g., the channel, time, title and genre of video content, a list of programs scheduled in the future, as well as the ability to, for example, set parental controls, order pay-per-view programs, search for programs based on theme or category, or set recordings for the future when combined with a recording system). Nevertheless, most of this information is in textual format and, in some cases, disjointed from the actual audio-visual content being shown. Stated differently, most EPGs currently do not carry the actual audio-visual content but only separate meta-information about the content, which many users perceive as inadequate to describe the details of the program. The program listing is prepared, provided, and often downloaded to the user's infrastructure well in advance by the channel/content owner; hence, real-time changes in actual programming may not be reflected in the program metadata.
In developing the EPG software, manufacturers include functions to address the growing volumes of increasingly complex data associated with TV programming. This data can include, for example, program descriptions, schedules, and ratings, and user configuration information, such as favorite channel lists, multimedia content, and parental controls. To meet this need, some set-top box software designs incorporate a “database layer” that utilizes either proprietary functions or a commercial, off-the-shelf embedded database for sorting, storing, and retrieving programming data.
Incorporating the “content surfing” experience into an EPG by carrying the video content along with the EPG requires sending lower resolution versions of the video content to fit into Mini Browsing Windows (MBWs). Briefly put, an MBW is a small window in a GUI which can show motion video of a program. While the size, number, and location of MBWs on the screen can be user-defined, this invention envisions the simultaneous presentation of multiple MBWs. As the number of broadcasting channels and the amount of available video content grow to very large numbers, it becomes impossible to deliver to every user the entire set of programs that may be of interest, as is done in the digital video distribution architecture of CATV.
Traditional video codecs used in CATV or IPTV systems (e.g., MPEG-2 main profile) are designed with single layer coding, which provides only a single bitstream at a given bitrate. If a lower spatial resolution is required (such as for the smaller frame size of an MBW), the full resolution signal is first received and decoded at the receiving end, followed by a sub-sampling operation to produce a lower resolution version, thus wasting significant bandwidth and computational resources. In particular, if an Active Video EPG (AVEPG) were to enable the preview of n channels using traditional, full resolution video, the network capacity to the receiver running the AVEPG would have to be n times as high. Some prior art systems mix down-scaled video content at the sender site and convey a single full resolution video signal to the receiver, so that the displayed result looks somewhat like an AVEPG. However, unless an unreasonably high amount of resources is provided at the sender side, this technique does not allow for a per-user configuration of the AVEPG; that is, user preferences cannot be accommodated, or can only be accommodated to a limited extent. The disclosed invention overcomes both limitations.
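The cost of previewing n channels with single layer, full resolution video can be made concrete with a short arithmetic sketch. The function names, the 4 Mbit/s per-channel rate, and the frame sizes below are assumptions chosen purely for illustration and do not describe any particular system.

```python
def preview_capacity_mbps(n_channels, full_rate_mbps):
    """Capacity needed when every preview arrives as a full resolution stream."""
    return n_channels * full_rate_mbps


def discarded_pixel_fraction(full_size, mbw_size):
    """Fraction of decoded pixels thrown away when sub-sampling to an MBW."""
    full_w, full_h = full_size
    mbw_w, mbw_h = mbw_size
    return 1 - (mbw_w * mbw_h) / (full_w * full_h)


# Assumed example: six previews at 4 Mbit/s each, shown in 320x180 windows
# after being decoded at 1280x720.
capacity = preview_capacity_mbps(6, 4.0)
waste = discarded_pixel_fraction((1280, 720), (320, 180))
print(f"capacity needed: {capacity} Mbit/s")
print(f"decoded pixels discarded: {waste:.2%}")
```

Under these assumed numbers, the receiver needs six times the single-channel capacity, and all but one sixteenth of the decoded pixels are discarded by the sub-sampling step.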
Increases in video-program-switching (alternatively known as channel surfing) times also make program changing within an AVEPG more difficult. Digital video codecs, alternatively known as digital video coding/decoding techniques (e.g., MPEG-2 and H-series codecs such as H.263 and H.264 baseline or main profile), together with packet network delivery, have increased program-switching times primarily for the following two reasons:
(1) Transport Delays: These delays result from buffering at the decoder at the receiving end, which is necessary to alleviate the effects of: (a) delay jitter caused by varying queuing delays in transport network switches; (b) packet losses in the network; and/or (c) bandwidth changes in the transport network (such as variable link bandwidths experienced in wireless networks).
(2) Encoding Delays: To display a video, the decoder at the receiver must receive an I-frame, alternatively known as an intra-coded frame, from the encoder before a video can be decoded (techniques avoiding I-frames, such as gradual decoder refresh, imply even longer delays). In broadcast environments, the time distance between I-frames in a bitstream is fixed (for example, 0.5 sec or more) to improve coding efficiency. Therefore, when a user changes a program, it can take as long as 0.5 seconds or more before the receiver can start decoding the audio-visual content. Furthermore, the encoders used in TV systems use “future frames” as well as “previous frames” as references to efficiently compress the current frame. As such, the decoder must wait for both the I-frame and the future reference frames to arrive so that the frames are generated in the correct sequence, causing inherent delays in the instant display of a new program.
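The two delay sources listed above can be combined in a simplified model. The sketch below assumes that I-frames arrive at exact multiples of a fixed interval and that the receiver adds a fixed buffering delay; it ignores the transmission time of the I-frame itself and the additional wait for future reference frames. The function names and numeric values are hypothetical.

```python
def iframe_wait_s(switch_time_s, iframe_interval_s):
    """Time from the program switch until the next I-frame starts arriving.

    Assumes I-frames are sent at t = 0, interval, 2 * interval, ...
    """
    elapsed = switch_time_s % iframe_interval_s
    return 0.0 if elapsed == 0 else iframe_interval_s - elapsed


def switch_delay_s(switch_time_s, iframe_interval_s, buffer_delay_s):
    """Simplified program-switching delay: I-frame wait plus receiver buffering."""
    return iframe_wait_s(switch_time_s, iframe_interval_s) + buffer_delay_s


# Assumed example: I-frames every 0.5 s, 1.0 s of receiver buffering,
# and a user who switches programs 1.25 s into the stream.
wait = iframe_wait_s(1.25, 0.5)
total = switch_delay_s(1.25, 0.5, 1.0)
print(f"wait for I-frame: {wait} s, total delay: {total} s")
```

In this model the I-frame wait is always less than one I-frame interval and approaches that interval when the user switches just after an I-frame has passed, which is consistent with the delay of as long as 0.5 seconds or more described above.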
Unlike satellite and CATV systems, IPTV and other packet network-based video distribution systems also incur transport delays, which can be significant, on the order of seconds to tens of seconds. In the evolving IPTV environment, the channel-change time has become significantly longer, particularly when channels are delivered over a best effort network, such as the Internet, where the network conditions are completely unpredictable.
Layered coding, or scalable coding, is a video compression technique that has been developed explicitly for heterogeneous environments. In such codecs, two or more layers are generated for a given source audio-visual signal: a base layer and at least one enhancement layer. The base layer offers a basic representation of the source signal at a reduced quality, which can be achieved, for example, by reducing the Signal-to-Noise Ratio (SNR) through coarse quantization, by using a reduced spatial and/or temporal resolution, or by a combination of these techniques. The base layer can be transmitted using a reliable channel, i.e., a channel with guaranteed or enhanced quality of service (QoS). Each enhancement layer increases the quality by increasing the SNR, spatial resolution, or temporal resolution, and can be transmitted with reduced or no QoS. In effect, a user is guaranteed to receive a signal with at least the minimum level of quality provided by the base layer signal.
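The relationship between the base layer and the enhancement layers can be sketched as a simple selection rule: the base layer is always delivered, and each enhancement layer is added in order only while the bandwidth budget allows. The `Layer` record, the `select_layers` helper, and the bitrates and resolutions below are hypothetical and serve only to illustrate the principle; they do not represent the behavior of any specific codec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    rate_kbps: int  # bitrate added by this layer alone
    width: int
    height: int


def select_layers(layers, budget_kbps):
    """Pick the base layer plus as many consecutive enhancement layers as fit.

    `layers` is ordered base layer first; each enhancement layer is assumed
    to depend on every layer before it.
    """
    chosen = []
    used_kbps = 0
    for index, layer in enumerate(layers):
        # The base layer (index 0) is always kept; enhancement layers are
        # added only while the cumulative rate stays within the budget.
        if index > 0 and used_kbps + layer.rate_kbps > budget_kbps:
            break
        chosen.append(layer)
        used_kbps += layer.rate_kbps
    return chosen


# Assumed three-layer stream with spatial scalability.
stream = [
    Layer("base", 300, 320, 180),
    Layer("enh1", 900, 640, 360),
    Layer("enh2", 2500, 1280, 720),
]
medium = [layer.name for layer in select_layers(stream, 1500)]
minimum = [layer.name for layer in select_layers(stream, 100)]
print(medium, minimum)
```

In this sketch, a small preview window could be served from the base layer alone, while a full-screen view would request all three layers from the same encoded stream.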
Accordingly, there exists a need for techniques for transmitting audio-visual signals using a low-delay and layered codec and the corresponding low-delay transport to enable AVEPG displays with specific requirements, i.e., multiple resolutions and rapid program switching.