1. Field of the Invention
The present invention relates generally to interactive video delivery media such as interactive television. More particularly, the present invention relates to a system and method for the simultaneous transmission and rendition of multiple MPEG-encoded digital video signal streams in an interactive television application.
2. Description of Related Art
Interactive television is an interactive audio/video delivery medium which provides broadcast audiovisual content to a number of subscribers. Interactive television provides broadcast video and audio to users and may also provide a return path for the user to interact with the content, e.g., to make selections or order desired products, etc.
In a television broadcast by a television network, such as a broadcast of a bicycle race, the television network may generate multiple video feeds to the network from various angles of the race or from the various bicyclists, for example. The network may select one or more feeds from the multiple video feeds and broadcast the selected video feed(s) to the viewing audience at any given point in time. Because the selection is made by the network, an individual viewer does not have the option to select which video feeds are to be rendered simultaneously for viewing.
A point-to-point network can enable each viewer to select the video feeds to be rendered simultaneously from a set of available video feeds. In a point-to-point network, such as in an on-line environment, each viewer may send a request to the head-end server selecting which video feeds the viewer wishes to view. The server may then recompose the screen for each viewer at the head-end and then send it to the specific viewer. However, such a point-to-point network or on-line environment requires a significant amount of bandwidth as well as a return path from the viewer site to the head-end server in order for the viewer to send its video selections to the head-end server. Further, such a point-to-point network or on-line environment also requires additional hardware in the head-end server for picture re-composition for each active client.
Another system which can enable each viewer to select the video feeds that the viewer wishes to view from the set of available video feeds is a system having as many decoders in a receiver at the viewer site as individual videos to be rendered simultaneously. For example, if six individual videos are to be rendered simultaneously, the receiver at the viewer site must provide six decoders. However, such a system would require significant processing power in the receiver and increase the cost of the receiver. In addition, the number of videos that can be rendered simultaneously would be limited to the number of decoders provided in the receiver.
Thus, it would be greatly desirable to provide a relatively simple and cost-effective system and method for the simultaneous transmission and rendition of multiple encoded digital video signal streams in an interactive television application such that each viewer may select its own set of one or more video feeds from a number of video feeds. Ideally, such a system and method would not require a significant amount of bandwidth or a return path from the viewer site to the head-end server.
MPEG Background
Background on MPEG (Moving Picture Experts Group) compression is presented here in order to facilitate discussion and understanding of the present invention. MPEG compression is a set of methods for compression and decompression of full-motion video images which uses interframe and intraframe compression techniques. MPEG compression uses both motion compensation and discrete cosine transform (DCT) processes, among others, and can yield compression ratios of more than 200:1.
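By way of illustration only, the effect of a 200:1 compression ratio on bit rate can be worked through as simple arithmetic. The picture size (720 by 480), frame rate (30 frames per second), and 12 bits per pixel (8-bit samples with 4:2:0 chroma subsampling) used below are assumptions chosen for the example and are not taken from the description above; the function names are likewise hypothetical.

```python
def raw_bitrate(width, height, fps, bits_per_pixel=12):
    """Uncompressed bit rate in bits per second.

    bits_per_pixel=12 assumes 8-bit samples with 4:2:0 chroma
    subsampling (an assumption made for this example).
    """
    return width * height * bits_per_pixel * fps


def compressed_bitrate(raw_bps, ratio):
    """Bit rate after compression at the given ratio (e.g. 200 for 200:1)."""
    return raw_bps / ratio


raw = raw_bitrate(720, 480, 30)          # assumed picture size and frame rate
compressed = compressed_bitrate(raw, 200)
print(f"raw: {raw / 1e6:.1f} Mbit/s, at 200:1: {compressed / 1e6:.2f} Mbit/s")
```

Under these assumptions, an uncompressed stream of roughly 124 Mbit/s would be reduced to well under 1 Mbit/s at a 200:1 ratio.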
The two predominant MPEG standards are referred to as MPEG-1 and MPEG-2. The MPEG-1 standard generally concerns inter-field data reduction using block-based motion compensation prediction (MCP), which typically uses temporal differential pulse code modulation (DPCM). The MPEG-2 standard is similar to the MPEG-1 standard but includes extensions to cover a wider range of applications. As used herein, the term “MPEG” refers to MPEG-1, MPEG-2, and/or any other suitable MPEG-standard compression and decompression techniques.
An MPEG stream includes three types of pictures or frames, referred to as the Intra (I) frame, the Predicted (P) frame, and the Bi-directional Interpolated (B) frame. The I or Intra frames contain the video data for the entire frame of video and are typically placed every 10 to 15 frames. Intra frames provide entry points into the file for random access, and are generally only moderately compressed. Predicted frames are encoded with reference to a past frame, i.e., a prior Intra frame or Predicted frame. Thus P frames only include changes relative to prior I or P frames. In general, Predicted frames receive a fairly high amount of compression and are used as references for future Predicted frames. Thus, both I and P frames are used as references for subsequent frames. Bi-directional pictures include the greatest amount of compression and require both a past and a future reference in order to be encoded. Bi-directional frames are not used as references for other frames.
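The reference relationships among the three frame types described above can be sketched as follows. This is a minimal illustration, not an encoder or decoder: the function name `reference_frames` and the representation of a frame sequence as a string of the letters I, P, and B in display order are assumptions made for the example, and the sketch considers only a single self-contained sequence that ends on an I or P frame.

```python
def reference_frames(pattern):
    """Map each frame index (display order) to the frames it references.

    pattern is a hypothetical string such as "IBBPBBP". I frames reference
    nothing, P frames reference the nearest prior I or P frame, and B frames
    reference the nearest prior and nearest following I or P frame.
    """
    # Only I and P frames may serve as references; B frames never do.
    anchors = [i for i, kind in enumerate(pattern) if kind in "IP"]
    refs = {}
    for i, kind in enumerate(pattern):
        past = [a for a in anchors if a < i]
        future = [a for a in anchors if a > i]
        if kind == "I":
            refs[i] = []
        elif kind == "P":
            if not past:
                raise ValueError(f"P frame {i} has no past reference")
            refs[i] = [past[-1]]
        elif kind == "B":
            if not past or not future:
                raise ValueError(f"B frame {i} needs a past and a future reference")
            refs[i] = [past[-1], future[0]]
        else:
            raise ValueError(f"unknown frame type {kind!r}")
    return refs


for index, used in reference_frames("IBBPBBP").items():
    print(index, "IBBPBBP"[index], "references", used)
```

Because a B frame depends on a future reference, that reference must be available to the decoder before the B frame itself can be reconstructed.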
An MPEG encoder divides respective frames into a grid of 16 by 16 pixel squares called macroblocks. The respective frames are divided into macroblocks in order to perform motion estimation/compensation. Each picture comprises a plurality of slices. The MPEG standard defines a slice as a contiguous sequence of 2 or more macroblocks (16×16 pixel blocks) that begins and ends on the same row of macroblocks. A slice begins with a slice start code and includes information indicating the horizontal and vertical location where the slice begins in the picture.
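The macroblock grid and the same-row constraint on slices can be illustrated with a short sketch. The function names, the raster-order numbering of macroblocks starting at zero, and the example picture sizes are assumptions made for illustration; the sketch does not parse an actual bitstream or slice start code.

```python
MB_SIZE = 16  # macroblocks are 16 by 16 pixels


def macroblock_grid(width, height):
    """Return (columns, rows) of macroblocks needed to cover the picture."""
    cols = -(-width // MB_SIZE)   # ceiling division
    rows = -(-height // MB_SIZE)
    return cols, rows


def slice_position(first_mb, cols):
    """Return (row, column) of the macroblock where a slice begins.

    first_mb is a hypothetical raster-order macroblock index, starting at 0.
    """
    return divmod(first_mb, cols)


def slice_on_one_row(first_mb, last_mb, cols):
    """True if macroblocks first_mb..last_mb begin and end on the same row."""
    return first_mb <= last_mb and first_mb // cols == last_mb // cols


cols, rows = macroblock_grid(720, 480)   # assumed picture size
print(cols, rows)
print(slice_position(47, cols))
print(slice_on_one_row(0, cols - 1, cols), slice_on_one_row(40, 50, cols))
```

For the assumed 720 by 480 picture, the grid is 45 macroblocks wide and 30 macroblocks high, so a slice covering macroblocks 40 through 50 would cross a row boundary and would not satisfy the same-row constraint.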