Multimedia data such as video and music can be transmitted by multiplexing several programs together on a single channel, as is done in, e.g., broadcast digital multimedia using MPEG standards.
Specifically, multimedia can be formatted in accordance with Moving Picture Experts Group (MPEG) standards such as MPEG-1, MPEG-2 (the format used for DVD video), and MPEG-4. Essentially, for individual video frames these multimedia standards use compression of the kind defined by the Joint Photographic Experts Group (JPEG). In JPEG, the image of a single frame is divided into small blocks of pixels that are processed by a discrete cosine transform (DCT) function to transform the spatial intensity values represented by the pixels into spatial frequency values, arranged in a block roughly from lowest frequency to highest. Then, the DCT values are quantized, i.e., the information is reduced by grouping values into coarser steps by, e.g., dividing every value by 10 and rounding to the nearest integer. Because the DCT tends to concentrate larger values near the top left corner of a block and smaller values near the lower right corner, a special zigzag ordering of the values can be applied that facilitates further compression by run-length coding (essentially, storing a count of the number of, e.g., zero values that appear consecutively, instead of storing all the zero values). If desired, the resulting numbers may be used to look up symbols from a table developed using Huffman coding, which assigns shorter symbols to the most common numbers, an operation commonly referred to as “variable length coding”. In any case, a JPEG-encoded stream represents horizontal lines of a picture, in much the same way as the underlying pixel data is arranged in a matrix of horizontal rows.
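The JPEG-style steps described above (DCT, quantization, zigzag ordering, run-length coding) can be sketched as follows. This is only an illustrative sketch, not the encoder of any standard: the 8 x 8 block size, the function names, and the use of a single uniform step of 10 (taken from the example in the text) are assumptions, and the Huffman stage is omitted.

```python
import math

N = 8  # block size; an assumption for illustration ("small blocks of pixels")

def dct_2d(block):
    """Naive 2-D DCT-II of an N x N block (O(N^4); written for clarity, not speed)."""
    def alpha(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = alpha(u) * alpha(v) * total
    return out

def quantize(coeffs, step=10):
    """Divide every coefficient by `step` and round to the nearest integer."""
    return [[int(round(c / step)) for c in row] for row in coeffs]

def zigzag(block):
    """Read the block along anti-diagonals, starting at the top left corner."""
    order = sorted(
        ((r, c) for r in range(N) for c in range(N)),
        # Odd diagonals run top-to-bottom, even diagonals bottom-to-top.
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )
    return [block[r][c] for r, c in order]

def run_length(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    pairs = []
    for v in values:
        if pairs and pairs[-1][0] == v:
            pairs[-1][1] += 1
        else:
            pairs.append([v, 1])
    return [tuple(p) for p in pairs]

# A flat block has only a lowest-frequency (top left) component.
flat = [[80] * N for _ in range(N)]
encoded = run_length(zigzag(quantize(dct_2d(flat))))
print(encoded)
```

For the flat block in the example, all 63 higher-frequency values quantize to zero, so the zigzag sequence collapses to two run-length pairs. That is the effect the zigzag ordering is meant to encourage.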
It will be appreciated that JPEG compression results in lost information. However, owing to the characteristics of human perception and the way that the above process works, JPEG compression can reduce data representing a picture to about one-fifth of its original size with virtually no discernible visual difference and to one-tenth of its original size with only slight visual degradation.
Motion pictures add a temporal dimension to the spatial dimension of single pictures. Typical motion pictures have around twenty-four frames, i.e., twenty-four still pictures, per second of viewing time. MPEG is essentially a compression technique that uses motion estimation to further compress a video stream.
MPEG encoding breaks each picture into blocks called “macroblocks”, and then searches neighboring pictures for similar blocks. If a match is found, instead of storing the entire block, the system stores a much smaller vector that describes the movement (or lack of movement) of the block between pictures. In this way, efficient compression is achieved.
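The block search described above can be illustrated with a minimal exhaustive search. This sketch assumes a sum-of-absolute-differences cost, a 4 x 4 block, and a search window of two pixels in each direction; these choices and the function names are hypothetical and are not prescribed by the MPEG standards, which leave the search strategy to the encoder.

```python
def sad(ref, cur, ry, rx, cy, cx, size):
    """Sum of absolute differences between a block of `ref` and a block of `cur`."""
    return sum(abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
               for j in range(size) for i in range(size))

def find_motion_vector(ref, cur, cy, cx, size=4, search=2):
    """Return (dy, dx, cost) of the best match in `ref` for the block of
    `cur` whose top left corner is at row cy, column cx."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            ry, rx = cy + dy, cx + dx
            # Skip candidate blocks that fall outside the reference picture.
            if ry < 0 or rx < 0 or ry + size > height or rx + size > width:
                continue
            cost = sad(ref, cur, ry, rx, cy, cx, size)
            if best is None or cost < best[2]:
                best = (dy, dx, cost)
    return best

# Reference picture with distinct values, and a current picture in which
# the content has moved one pixel to the right.
ref = [[3 * y * y + x * x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

vector = find_motion_vector(ref, cur, cy=2, cx=3)
print(vector)
```

Here the matching block lies one column to the left in the reference picture, so the three numbers in `vector` stand in for the sixteen pixel values of the block.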
MPEG compresses each frame of video in one of three ways. The first way is to generate a self-contained entity referred to as an “intraframe” (also referred to as a “reference frame” and an “information frame”), in which the entire frame is composed of compressed, quantized DCT values. This type of frame is required periodically and at a scene change. Most frames, however (typically 15 out of 16), are compressed by encoding only the differences between the image in the frame and the nearest intraframe, resulting in frame representations that use much less data than is required for an intraframe. In MPEG parlance these frames are called “predicted” frames and “bidirectional” frames, herein collectively referred to as “interframes”.
Predicted frames are those frames that contain motion vector references to the preceding intraframe or to a preceding predicted frame, in accordance with the discussion above. If a block has changed slightly in intensity or color, then the difference between the two frames is also encoded in a predicted frame. Moreover, if something entirely new appears that does not match any previous blocks, then a new block can be stored in the predicted frame in the same way as in an intraframe.
In contrast, a bidirectional frame is used as follows. The MPEG system searches forward and backward through the video stream to match blocks. Bidirectional frames are used to record when something new appears, so that it can be matched to a block in the next full intraframe or predicted frame, with bidirectional frames being able to refer to both preceding and subsequent intraframes or predicted frames. Experience has shown that placing two bidirectional frames between each intraframe or predicted frame works well, so that a typical group of frames associated with a single intraframe might be: the full intraframe, followed by a predicted frame, followed by two bidirectional frames, another predicted frame, two more bidirectional frames, a predicted frame, two more bidirectional frames, a predicted frame, and finally two more bidirectional frames, at which point a new full intraframe might be placed in the stream to refresh the stream.
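The typical group of frames listed above can be written out compactly. In this sketch the letters I, P, and B stand for intraframe, predicted frame, and bidirectional frame, and the function name and its parameters are hypothetical; the frame order simply follows the sequence given in the text.

```python
def group_of_frames(predicted_count=4, bidirectional_per_predicted=2):
    """Build one group: an intraframe, then each predicted frame followed
    by its bidirectional frames, in the order listed in the text."""
    frames = ["I"]
    for _ in range(predicted_count):
        frames.append("P")
        frames.extend(["B"] * bidirectional_per_predicted)
    return frames

group = group_of_frames()
print("".join(group), len(group))
```

With the default values the group holds thirteen frames, only one of which is a full intraframe.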
The present invention, in contemplating the above principles, recognizes that several programs might be conveyed in a single channel of finite bandwidth using principles of multiplexing, and that it might happen that a large intraframe of one program might temporally coincide with those of one or more other programs, consuming a large amount of bandwidth for that instant.
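The bandwidth concern can be made concrete with a small calculation. The relative frame sizes below are invented purely for illustration (they are not taken from the text or from any standard), as are the names `SIZES`, `PATTERN`, and `channel_load`. The sketch adds up the data that three multiplexed programs present to the channel at each frame instant, first when their intraframes coincide and then when they happen to fall at different instants.

```python
# Hypothetical relative frame sizes; illustrative values only.
SIZES = {"I": 100, "P": 40, "B": 15}

# One group of frames, repeated cyclically by every program.
PATTERN = "IPBBPBBPBBPBB"

def channel_load(offsets, length=len(PATTERN)):
    """Total data on the channel at each frame instant, where each program
    starts its group `offset` frames into the pattern."""
    load = []
    for t in range(length):
        load.append(sum(SIZES[PATTERN[(t + off) % len(PATTERN)]]
                        for off in offsets))
    return load

coinciding = channel_load([0, 0, 0])  # all three intraframes at the same instant
offset = channel_load([0, 1, 2])      # intraframes at different instants

print(max(coinciding), sum(coinciding))
print(max(offset), sum(offset))
```

Under these assumed sizes the total data over the group is the same in both cases, but the peak demand at a single instant is roughly twice as high when the intraframes coincide.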