MPEG-2 (Moving Picture Experts Group-2) and DV (Digital Video) are two popular formats for digital video production used in the broadcasting industry. In both formats, a transform, such as a two-dimensional discrete cosine transform (DCT), is applied to blocks (e.g., four 8×8 blocks per macroblock) of image data (either the pixels themselves or interframe pixel differences corresponding to those pixels). The resulting transform coefficients are then quantized at a selected quantization level, at which many of the coefficients are typically quantized to a zero value. The quantized coefficients are then run-length encoded to generate part of the compressed video bitstream. In general, greater quantization levels result in more DCT coefficients being quantized to zero and fewer bits being required to represent the image data after performing run-length encoding.
The DCT transforms a block of image data (for example, a block of 8×8 pixels, as shown in FIG. 1) into a new block of transform coefficients (for example, a block of 8×8 DCT coefficients, as shown in FIG. 2). The transform is applied to each block until the entire image has been transformed. At the decoder, the inverse transform is applied to each block to reconstruct the image; absent quantization, the reconstruction matches the original image up to arithmetic precision.
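The forward and inverse transforms described above may be illustrated by the following minimal sketch. It assumes the orthonormal form of the two-dimensional DCT-II and uses a direct (unoptimized) summation; the function names `dct2` and `idct2` are hypothetical and are not taken from any standard or library.

```python
import math

N = 8  # block size assumed here: 8x8

def _scale(k):
    # Orthonormal scaling factor for frequency index k.
    return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

def _basis(pos, freq):
    # One-dimensional DCT-II cosine basis sample.
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))

def dct2(block):
    """Forward 2-D DCT of an N x N block (rows x columns).

    coeffs[k][l]: k is the vertical frequency index (top to bottom),
    l is the horizontal frequency index (left to right).
    """
    coeffs = [[0.0] * N for _ in range(N)]
    for k in range(N):
        for l in range(N):
            total = 0.0
            for r in range(N):
                for c in range(N):
                    total += block[r][c] * _basis(r, k) * _basis(c, l)
            coeffs[k][l] = _scale(k) * _scale(l) * total
    return coeffs

def idct2(coeffs):
    """Inverse 2-D DCT: reconstructs the N x N block from its coefficients."""
    block = [[0.0] * N for _ in range(N)]
    for r in range(N):
        for c in range(N):
            total = 0.0
            for k in range(N):
                for l in range(N):
                    total += (_scale(k) * _scale(l) * coeffs[k][l]
                              * _basis(r, k) * _basis(c, l))
            block[r][c] = total
    return block
```

With this scaling, a block in which every pixel equals 100 transforms to a single non-zero (DC) coefficient of 800, and applying `idct2` to the output of `dct2` returns the input block to within floating-point error.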
For typical images, a large proportion of the signal energy is compacted into a small number of transform coefficients. For example, the first coefficient in FIG. 2 is larger in magnitude than the remaining coefficients. The first coefficient is typically much larger than the other coefficients because it represents the DC energy, while the other coefficients represent AC energy in different spatial frequency bands. The remaining coefficients represent energy levels at increasing horizontal frequencies, proceeding from left to right, and at increasing vertical frequencies, proceeding from top to bottom. The coefficients at the bottom right corner represent energy levels at diagonal frequencies. Generally, these coefficients tend to be small because images rarely contain significant amounts of diagonal information.
In a typical encoding scheme, the transform coefficients corresponding to those blocks of image data in the more-important regions are less severely quantized than those coefficients corresponding to the less-important regions. In this way, relatively more data (i.e., information) is preserved for the more-important regions than for the less-important regions. Quantization is performed by limiting each DCT coefficient to a fixed number of bits. A coefficient is limited by shifting it from left to right, so that its least significant bits spill off the end of the register. In this way, the amplitude of the coefficient is also reduced. The number of bits remaining is pre-assigned individually for each of the 8×8 coefficients in the DCT block. The number of bits may be further reduced or increased, as necessary, to maintain a constant bit rate.
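The bit-limiting operation described above may be sketched as follows. The sketch assumes integer coefficients, applies the right shift to the magnitude so that the sign is preserved (an assumption; the text does not say how negative values are handled), and uses a hypothetical per-position shift table `shifts` together with a hypothetical global adjustment `extra` standing in for the bit-rate control.

```python
def shift_quantize(coeff, shift):
    """Drop `shift` least significant bits from the magnitude of `coeff`.

    The sign is handled separately (assumption) so that positive and
    negative coefficients are reduced symmetrically.
    """
    magnitude = abs(coeff) >> shift
    return -magnitude if coeff < 0 else magnitude

def shift_quantize_block(block, shifts, extra=0):
    """Apply a pre-assigned shift to every coefficient of a block.

    `shifts` has the same shape as `block`; `extra` raises or lowers
    all shifts together (never below zero), e.g. to hold a bit rate.
    """
    return [
        [shift_quantize(c, max(0, s + extra)) for c, s in zip(row, shift_row)]
        for row, shift_row in zip(block, shifts)
    ]
```

For example, `shift_quantize(45, 2)` discards the two least significant bits of 45 (binary 101101) and yields 11 (binary 1011), while `shift_quantize(3, 2)` yields 0.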
The effect of quantization on the image may be seen in the block of quantized coefficients shown in FIG. 3. These quantized coefficients are the result of quantizing the DCT coefficients of FIG. 2 to the nearest integer. Many of the coefficients have been quantized to a value of zero. Some of the coefficients have been quantized to a value of +1 or −1.
When quantizing transform coefficients, the differing human perceptual importance of the various coefficients may be exploited by varying the relative step-sizes of the quantizers for the different coefficients. The perceptually important coefficients may be quantized with a finer step size than the others. For example, low spatial frequency coefficients may be quantized finely, while the less important high frequency coefficients may be quantized more coarsely. A simple method to achieve different step-sizes is to normalize or weight each coefficient based on its visual importance. All of the normalized coefficients may then be quantized in the same manner, such as rounding to the nearest integer (uniform quantization). Normalization or weighting effectively scales the quantizer from one coefficient to another.
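The normalize-then-round approach described above may be sketched as follows. The weight values passed in are purely illustrative and are not taken from any standard quantization matrix; the helper names are hypothetical. Ties are rounded away from zero, which is one possible convention and an assumption of this sketch.

```python
import math

def round_nearest(x):
    """Round to the nearest integer, with ties rounded away from zero."""
    magnitude = math.floor(abs(x) + 0.5)
    return -magnitude if x < 0 else magnitude

def weighted_quantize(coeffs, weights):
    """Divide each coefficient by its weight, then quantize uniformly.

    A larger weight gives a coarser effective step size for that
    coefficient position.
    """
    return [
        [round_nearest(c / w) for c, w in zip(row, weight_row)]
        for row, weight_row in zip(coeffs, weights)
    ]

def weighted_dequantize(levels, weights):
    """Approximate inverse: scale each quantized level by its weight."""
    return [
        [q * w for q, w in zip(row, weight_row)]
        for row, weight_row in zip(levels, weights)
    ]
```

For instance, with weights `[[1, 4], [4, 8]]`, the coefficients `[[100.0, 7.0], [-9.0, 3.0]]` quantize to `[[100, 2], [-2, 0]]`: the low-frequency value is kept exactly, while the value weighted by 8 falls to zero.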
As shown in FIG. 3, many of the transform coefficients are frequently quantized to zero. There may be a few non-zero low-frequency coefficients and a sparse scattering of non-zero high-frequency coefficients, but the majority of coefficients may be quantized to zero. To exploit this phenomenon, the two-dimensional array of transform coefficients is reformatted and prioritized into a one-dimensional sequence through a zigzag scanning process, as shown in FIG. 4. An alternate scanning process is shown in FIG. 5.
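The zigzag reordering may be sketched as follows, assuming that FIG. 4 depicts the conventional zigzag pattern, which starts at the top-left (DC) coefficient and traverses successive anti-diagonals in alternating directions. The alternate scan of FIG. 5 is not reproduced here. The function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, column) positions of an n x n block in zigzag order.

    Positions are grouped by anti-diagonal (row + column). Odd diagonals
    are walked top to bottom, even diagonals bottom to top.
    """
    def key(position):
        row, col = position
        diagonal = row + col
        return (diagonal, row if diagonal % 2 else -row)

    return sorted(((r, c) for r in range(n) for c in range(n)), key=key)

def zigzag_scan(block):
    """Flatten a square block into a one-dimensional zigzag sequence."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

For a 3×3 block `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, `zigzag_scan` returns `[1, 2, 4, 7, 5, 3, 6, 8, 9]`, so the low-frequency positions near the top-left corner appear first.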
The zigzag or alternate scan ordering of coefficients results in most of the important non-zero coefficients (in terms of energy and visual perception) being grouped together early in the sequence. These are typically followed by long runs of coefficients that are quantized to zero. These zero-valued coefficients may be efficiently represented through run-length encoding. In run-length encoding, the number (run) of consecutive zero coefficients before a non-zero coefficient is encoded, followed by the non-zero coefficient value.
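The run-length step described above may be sketched as follows. Each non-zero coefficient is emitted as a (run, value) pair, where the run is the count of zeros preceding it. As an assumption of this sketch, trailing zeros are not coded at all and are restored from the known sequence length, standing in for whatever end-of-block signalling a real bitstream would use. The function names are hypothetical.

```python
def run_length_encode(sequence):
    """Encode a scanned coefficient sequence as (zero_run, value) pairs.

    Trailing zeros after the last non-zero value are dropped (assumption).
    """
    pairs = []
    run = 0
    for value in sequence:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs

def run_length_decode(pairs, length):
    """Rebuild the sequence, padding with zeros up to `length`."""
    sequence = []
    for run, value in pairs:
        sequence.extend([0] * run)
        sequence.append(value)
    sequence.extend([0] * (length - len(sequence)))
    return sequence
```

A 64-entry sequence beginning `12, 0, 0, -3, 1, 0, 0, 0, 2` and followed only by zeros encodes to the four pairs `(0, 12), (2, -3), (0, 1), (3, 2)`, and decoding those pairs with a length of 64 restores the original sequence.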
Processing 8×8 DCT coefficients is computationally intensive and is desirably performed quickly and efficiently. This invention addresses such a need.