1. Field of the Invention
The present invention relates to video decoding. More particularly, the present invention relates to decoding video data encoded under one of the MPEG standards.
2. Discussion of Related Art
The Moving Picture Experts Group (MPEG) has promulgated two encoding standards for full-motion digital video and audio, popularly referred to as “MPEG-1” and “MPEG-2”, which provide efficient data transmission. MPEG encoding techniques can be used in digital video applications such as high definition television (HDTV). A publication describing MPEG-1 and MPEG-2 encoding and decoding techniques, Mitchell, J., Pennebaker, W., Fogg, C., and LeGall, D., MPEG Video Compression Standard, Chapman and Hall, New York, N.Y. (1996), is incorporated herein by reference. The detailed description below is applicable to both MPEG-1 and MPEG-2 standards, unless otherwise provided. To simplify the description, where the description is applicable to both MPEG-1 and MPEG-2 standards, the term “MPEG” refers to both standards.
Under either MPEG standard, a video sequence is organized as a series of “pictures”. Each picture can be one of three types: predicted pictures (P-pictures), intra-coded pictures (I-pictures), and bidirectionally coded pictures (B-pictures). I-pictures are encoded without respect to other pictures. Each P-picture or B-picture is encoded as a set of differences with respect to one or more reference pictures, which can be I-pictures or P-pictures.
Each picture is further divided into data sections known as “slices”, each consisting of a number of “macroblocks,” which are each organized as eight or twelve 8-pixel by 8-pixel (8×8) blocks. Under one level of color precision, a macroblock includes four 8×8 blocks of brightness (luminance) samples, two 8×8 blocks of “red” (“red-chrominance”) samples, and two 8×8 blocks of “blue” (“blue-chrominance”) samples. Under this level of color precision, red-chrominance and blue-chrominance samples are sampled only half as often as the luminance samples. Under another level of color precision, a macroblock includes four 8×8 luminance blocks, four 8×8 red-chrominance blocks, and four 8×8 blue-chrominance blocks. Information regarding each macroblock is provided by a macroblock header which identifies (a) the position of the macroblock relative to the position of the most recently coded macroblock, (b) which of the 8×8 blocks within the macroblock are encoded as intra-blocks (i.e., without reference to blocks from other pictures), and (c) whether a new set of quantization constants is to be used.
The first step in encoding the 8×8 blocks is to transform each block into the frequency domain using a 2-dimensional discrete cosine transform (DCT). The applicable 2-dimensional DCT consists of a “horizontal” and a “vertical” spatial DCT, as is known in the art. The DCT represents the luminance or chrominance values of a block as a set of coefficients in a sum of cosine functions. Next, each coefficient of the block in frequency space is “quantized.” For I-pictures, quantization is intended to reduce the coefficients of the higher frequencies to zero. For P-pictures and B-pictures, which represent changes in the luminance or chrominance values over time, quantization also reduces many of the coefficients to zero. The quantized coefficients can be obtained by dividing each coefficient of a block by a corresponding integer quantization constant, and then rounding the result to the nearest integer.
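By way of illustration only, the transform and quantization steps described above can be sketched as follows. The sketch assumes an orthonormal DCT-II applied first along rows and then along columns, and a single quantization constant per coefficient position. The function names dct_1d, dct_2d, and quantize are hypothetical and are not taken from either MPEG standard.

```python
import math

N = 8  # block dimension

def dct_1d(v):
    """Orthonormal 1-D DCT-II of a length-N sequence (illustrative, not optimized)."""
    out = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        s = sum(v[n] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for n in range(N))
        out.append(scale * s)
    return out

def dct_2d(block):
    """Apply the 1-D DCT to every row ("horizontal"), then to every column ("vertical")."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d([rows[y][x] for y in range(N)]) for x in range(N)]
    # cols[x][v] is the coefficient at vertical frequency v and horizontal frequency x.
    return [[cols[x][v] for x in range(N)] for v in range(N)]

def quantize(coeffs, q):
    """Divide each coefficient by its quantization constant and round.

    Note: Python's round() resolves exact .5 ties to the even integer; an
    actual encoder may use a different tie rule (assumption left open here).
    """
    return [[round(coeffs[v][u] / q[v][u]) for u in range(N)] for v in range(N)]

# A uniform block has energy only in the zero-frequency (DC) coefficient.
flat = [[100] * N for _ in range(N)]
quantized = quantize(dct_2d(flat), [[16] * N for _ in range(N)])
```

For the uniform block above, only the DC position of the quantized block is non-zero, which is the behavior the quantization step relies upon to produce long runs of zeros.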
The 2-dimensional blocks are then read as a linear list of values by scanning the values of the 8×8 block under a “zigzag scanning order.” MPEG-2 specifies two zigzag scanning orders, which are depicted in FIG. 1. Under either of these zigzag scanning orders, zero coefficients tend to congregate or “run” next to each other, allowing a compact representation (a “run-level” pair, as described below). An end-of-block symbol is used to indicate that all remaining coefficients in the zigzag scanned list are zero.
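As a non-limiting sketch, a zigzag scanning order may be generated by walking the anti-diagonals of the block in alternating directions. The sketch below assumes the conventional diagonal zigzag order only; the second MPEG-2 scanning order of FIG. 1 is not reproduced. The names zigzag_order and zigzag_scan are hypothetical.

```python
def zigzag_order(n=8):
    """Return (row, column) positions of an n-by-n block in diagonal zigzag order."""
    order = []
    for d in range(2 * n - 1):            # d = row + column indexes one anti-diagonal
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)         # even diagonals run from bottom-left to top-right
        order.extend((r, d - r) for r in rows)
    return order

def zigzag_scan(block):
    """Read a 2-dimensional block as a linear list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

Because low-frequency positions are visited first, the non-zero coefficients of a quantized block tend to appear early in the resulting list and the zero coefficients tend to appear together afterwards.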
All non-zero coefficients, other than the DC-coefficient, defined below, are then represented using a “run-level” coding. “Level” is the amplitude of a non-zero coefficient. “Run” is the number of zero-amplitude coefficients between the most recent non-zero coefficient and the present non-zero coefficient. For I-pictures, the DC-coefficient, which is the zero-frequency coefficient, is represented as a difference from the DC-coefficient of the most recent reference block of the same block type (i.e., luminance, red-chrominance, or blue-chrominance). Next, the “run-level” encoded lists are transformed into variable-length codes using a Huffman coding technique. Huffman coding assigns shorter codes to more frequently occurring values. (The macroblock header is also encoded.)
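The run-level representation and the differential DC representation described above can be sketched as follows. The sketch is illustrative only: the end-of-block marker EOB, the function names, and the initial predictor value of zero are assumptions made for the example and are not values prescribed by the MPEG standards.

```python
EOB = "EOB"  # hypothetical end-of-block marker used only in this sketch

def run_level_encode(ac_coeffs):
    """Convert zigzag-scanned coefficients (DC excluded) into (run, level) pairs."""
    pairs, run = [], 0
    for c in ac_coeffs:
        if c == 0:
            run += 1                 # count zeros since the last non-zero coefficient
        else:
            pairs.append((run, c))   # level is the amplitude of the non-zero coefficient
            run = 0
    pairs.append(EOB)                # all remaining coefficients are zero
    return pairs

def dc_differences(dc_values, predictor=0):
    """Represent each DC-coefficient as a difference from the preceding one
    of the same block type. The initial predictor of 0 is an assumption."""
    out = []
    for dc in dc_values:
        out.append(dc - predictor)
        predictor = dc
    return out

pairs = run_level_encode([5, 0, 0, -3, 0, 0, 0, 1] + [0] * 55)
```

In this example, sixty-three coefficients are reduced to three run-level pairs followed by the end-of-block marker, which is the compaction that the subsequent variable-length coding exploits.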
A conventional decoding process 200 of an MPEG block is depicted schematically in FIG. 2. An MPEG decoder receives an input encoded video data stream (“bitstream”) from a video data source, such as a satellite transmitter, a disk, or a DVD ROM. The bitstream consists of variable-length codes obtained using an encoding process described above. As shown in FIG. 2, a bitstream fetch operation 202 captures the bitstream. A decode operation 204 then recovers the run, level, and length of each variable-length code, according to the encoding standard used and the picture type. Typically, the variable-length codes are decoded using a table look-up technique. To recover the current DC-coefficient, the DC-coefficient of the most recent I-picture encoded block of the same block type is added to the present DC-coefficient.
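A table look-up technique of the kind mentioned above can be sketched as follows. The code table TOY_VLC is a hypothetical prefix code invented for this example and is not one of the MPEG variable-length code tables; the function names and the initial DC predictor of zero are likewise assumptions. The look-up table is indexed by a fixed-width window of upcoming bits, so that one look-up yields both the decoded symbol and the length of its code.

```python
# Hypothetical toy prefix code: bit pattern -> "EOB" or a (run, level) pair.
TOY_VLC = {
    "10": "EOB",
    "11": (0, 1),
    "011": (1, 1),
    "0100": (0, 2),
    "0101": (2, 1),
}

def build_lookup(codes):
    """Expand each code to every window of maximum code width that begins with it."""
    width = max(len(c) for c in codes)
    table = {}
    for code, symbol in codes.items():
        pad = width - len(code)
        for tail in range(1 << pad):
            suffix = format(tail, "b").zfill(pad) if pad else ""
            table[code + suffix] = (symbol, len(code))
    return width, table

def decode_block(bits, codes=TOY_VLC):
    """Decode a string of '0'/'1' characters until the end-of-block symbol."""
    width, table = build_lookup(codes)
    pos, out = 0, []
    while pos < len(bits):
        window = bits[pos:pos + width].ljust(width, "0")  # pad the final window
        symbol, length = table[window]  # KeyError for patterns the toy table lacks
        out.append((symbol, length))
        pos += length                   # advance by the decoded code length
        if symbol == "EOB":
            break
    return out

def recover_dc(differences, predictor=0):
    """Add each decoded difference to the preceding DC-coefficient."""
    out = []
    for d in differences:
        predictor += d
        out.append(predictor)
    return out

decoded = decode_block("11" + "0101" + "011" + "10")
```

Indexing by a fixed-width window trades memory for speed: short codes occupy several table entries, but no bit-by-bit tree traversal is required.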
The next step of decoding process 200 is depicted in FIG. 2 as inverse scan 206. Inverse scan 206 assigns the coefficients from the variable length decode operation 204 into 8×8 blocks. Next, an inverse quantization step 208 multiplies each coefficient in an 8×8 block obtained from inverse scan 206 by the same corresponding quantization constant used in the quantization procedure during encoding, and rounds the result to the nearest integer. In addition, to compensate for precision losses during encoding and decoding, an “oddification” step (MPEG-1) or a “mismatch control” step (MPEG-2) is applied during inverse quantization procedure 208.
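The inverse scan and inverse quantization steps can be sketched as follows. The sketch is simplified and rests on stated assumptions: it uses the conventional diagonal zigzag order only, it multiplies each coefficient by a single quantization constant, and it omits the quantizer scale factor, the special handling of the intra DC-coefficient, and the saturation that an actual decoder applies. The mismatch control shown follows the MPEG-2 approach of adjusting the last coefficient when the sum of all coefficients is even. All function names are hypothetical.

```python
def zigzag_order(n=8):
    """(row, column) positions of an n-by-n block in diagonal zigzag order."""
    order = []
    for d in range(2 * n - 1):
        rows = range(max(0, d - n + 1), min(d, n - 1) + 1)
        if d % 2 == 0:
            rows = reversed(rows)
        order.extend((r, d - r) for r in rows)
    return order

def inverse_scan(coeff_list, n=8):
    """Place a linear coefficient list back into an n-by-n block."""
    block = [[0] * n for _ in range(n)]
    for value, (r, c) in zip(coeff_list, zigzag_order(n)):
        block[r][c] = value
    return block

def inverse_quantize(block, q):
    """Multiply each coefficient by its quantization constant (simplified)."""
    n = len(block)
    return [[block[r][c] * q[r][c] for c in range(n)] for r in range(n)]

def mismatch_control(block):
    """If the coefficient sum is even, toggle the parity of the last coefficient."""
    total = sum(sum(row) for row in block)
    if total % 2 == 0:
        last = len(block) - 1
        block[last][last] += -1 if block[last][last] % 2 else 1
    return block

scanned = [50, 3, 0, -2] + [0] * 60
dequantized = inverse_quantize(inverse_scan(scanned), [[16] * 8 for _ in range(8)])
```

Because the coefficients are integers and the quantization constants are integers, the product in this simplified sketch is already an integer and no separate rounding step appears.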
Next, an inverse discrete cosine transform (IDCT) 210, such as described by Mitchell, J., Pennebaker, W., Fogg, C., and LeGall, D., MPEG Video Compression Standard, Chapman and Hall, New York, N.Y. (1996), is applied to the 8×8 blocks to return the blocks to a time domain representation, which is also known as a spatial domain representation.
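For illustration, a direct (unoptimized) separable IDCT can be sketched as follows. This is not the method of the cited publication, and practical decoders use fast factorizations instead; the sketch only shows the arithmetic being performed. The function names idct_1d and idct_2d are hypothetical.

```python
import math

N = 8  # block dimension

def idct_1d(coeffs):
    """Inverse of the orthonormal 1-D DCT-II for a length-N sequence."""
    out = []
    for n in range(N):
        s = 0.0
        for k in range(N):
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            s += scale * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(s)
    return out

def idct_2d(block):
    """Apply the 1-D IDCT to every row, then to every column."""
    rows = [idct_1d(r) for r in block]
    cols = [idct_1d([rows[v][x] for v in range(N)]) for x in range(N)]
    # cols[x][y] is the reconstructed sample at row y, column x.
    return [[cols[x][y] for x in range(N)] for y in range(N)]

# A block holding only a DC-coefficient reconstructs to a uniform block.
dc_only = [[0.0] * N for _ in range(N)]
dc_only[0][0] = 800.0
samples = idct_2d(dc_only)
```

In this example every reconstructed sample equals one eighth of the DC-coefficient, up to floating-point error, since a block with no higher-frequency content has no spatial variation.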
In the prior art, the decoding process 200 described thus far, i.e., from the bitstream fetch operation 202 to the IDCT 210, is already too computationally demanding for decoding using a typical conventional microprocessor. For example, a DVD player using only an Intel x86 CPU to decode MPEG data cannot perform the above decoding process fast enough. Even at 200 MHz, an x86 CPU must dedicate all of its resources to process video. Even then, some frames would be lost.
A DVD player with a separate MPEG decoder and an x86 CPU achieves better results. With a separate MPEG decoder, the demand on the x86 CPU is significantly diminished. However, there are several drawbacks to a separate MPEG decoder. First, partitioning the decoding tasks between the processors is complex, especially when the processors execute different instruction sets. Second, a separate MPEG decoder results in higher costs for MPEG decoding. Third, even then, MPEG decoding for replay on a HO-type HDTV is still not quick enough to avoid frame loss.
Therefore, what is needed is an MPEG decoder which decodes variable-length codes for replay on a HO-type HDTV quickly enough to avoid frame loss but without the expense and complexity of a dedicated MPEG decoder.