1. Field of the Invention
The present invention relates to devices and methods for coding video signals and, more specifically, to coding using adaptive quantization with multiple quantizers and variable length code (VLC) tables. These devices and data compression methods can be readily adapted for the H.263+ and MPEG-4 video standards.
2. Description of Related Art
Video signals generally include data corresponding to one or more video frames, where each video frame is composed of an array of picture elements (pels). A color video frame can have any of a variety of resolutions, one of which is the quarter common intermediate format (QCIF) resolution represented in FIG. 1, where a video frame is composed of over twenty-five thousand picture elements, arranged in an array of 144 pels × 176 pels, and divided into 8×8 pel blocks. Since each pel has to be characterized by color (or hue) and luminance characteristics, these data may be represented with groups of four luminance pel blocks and two chrominance pel blocks called macroblocks. Thus, digital signals representing a sequence of video frame data, usually containing many video frames, have a large number of bits. However, the available storage space and the bandwidth for transmitting such signals are limited. Therefore, compression (coding) processes are used to transmit or store video data more efficiently.
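The frame arithmetic above can be checked with a short sketch. The 16×16 macroblock footprint used here follows from grouping four 8×8 luminance blocks; the function and constant names are illustrative only and do not come from either standard.

```python
# QCIF luminance frame dimensions, as given in the text.
ROWS, COLS = 144, 176
BLOCK = 8          # 8x8 pel block
MACROBLOCK = 16    # 16x16 pel luminance area (four 8x8 blocks); illustrative constant

def frame_layout(rows=ROWS, cols=COLS):
    """Return (pels, 8x8 blocks, macroblocks) for one luminance frame."""
    pels = rows * cols
    blocks = (rows // BLOCK) * (cols // BLOCK)
    macroblocks = (rows // MACROBLOCK) * (cols // MACROBLOCK)
    return pels, blocks, macroblocks

pels, blocks, macroblocks = frame_layout()
print(pels, blocks, macroblocks)   # 25344 396 99
```

The 25,344 pels agree with the "over twenty-five thousand picture elements" stated for the QCIF frame.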
Compression of digital video signals for transmission or for storage has become widely practiced in a variety of contexts, especially in multimedia environments for video conferencing, video games, Internet image transmissions, digital TV and the like. Coding and decoding are accomplished with coding processors, which may be general-purpose computers, special hardware, multimedia boards or other suitable processing devices. Standards for compression processes have been developed by the International Telecommunication Union (ITU), which has developed the H series standards used for real-time communications such as in videophones, and by the International Organization for Standardization (ISO), which has developed the Moving Picture Experts Group (MPEG) series standards, such as MPEG-1, MPEG-2, MPEG-4 and MPEG-7.
Coding operations for video signals are typically performed on smaller samples of the video frame, such as the 16×16 pel sample composed of four 8×8 pel blocks shown in broken lines in FIG. 1. Compression processes typically involve quantization, in which sampled video signal data values, like color and luminance, are represented by, or are mapped onto, a fixed number of predefined quantizer values. The quantized signal is composed of quantizer values that are, in fact, approximations of the sampled video signal values. Therefore, the encoding of the video signal data onto a limited number of quantizer values necessarily produces some loss in accuracy and some distortion of the signal after decoding.
Improvements in accuracy and distortion may be made by designing the quantizer set to include those quantizer values that have been found by experimentation to have the highest probability of being selected. However, the quantizer values that have the highest selection probability in one portion of a video frame are typically different from the quantizer values that have the highest selection probability in another portion of the frame, and, in a moving video context, such values differ from frame to frame as well.
A compression process for an input signal I, according to the H.263 and MPEG-4 standards, is generally represented by FIG. 2 for Intra blocks and by FIG. 3 for Inter blocks, and involves a discrete cosine transform (DCT) step 12, a quantization step 16, and a variable length coding step 18. The compressed and coded video data signal is then transmitted over a fixed-bandwidth communication link to a receiver (not shown). For Inter blocks, the process shown in FIG. 3 includes an additional motion compensation step 11, carried out prior to the DCT step 12. The motion compensation step 11 involves a summing operation in which a displaced block from a previous frame, located using a motion vector obtained from motion estimation, is subtracted from the input video data signal I for a block, to provide a motion compensated difference block as the input to the DCT step 12.
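The subtraction performed in the motion compensation step can be sketched as follows. This is a minimal illustration, assuming frames stored as lists of rows, a motion vector given as a (dy, dx) pair, and a displaced block that lies entirely inside the previous frame. The function names are hypothetical, and motion estimation itself is not shown.

```python
def displaced_block(prev_frame, top, left, mv, size=8):
    """Fetch the size x size block of prev_frame displaced by mv = (dy, dx).

    Assumes the displaced block lies entirely inside prev_frame.
    """
    dy, dx = mv
    return [row[left + dx:left + dx + size]
            for row in prev_frame[top + dy:top + dy + size]]

def difference_block(current_block, predicted_block):
    """Subtract the prediction from the current block, pel by pel."""
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current_block, predicted_block)]

# Toy data: the current block equals a shifted copy of the previous frame plus 3.
prev = [[r * 16 + c for c in range(16)] for r in range(16)]
cur = [[prev[4 + 1 + r][4 + 2 + c] + 3 for c in range(8)] for r in range(8)]

pred = displaced_block(prev, top=4, left=4, mv=(1, 2))
diff = difference_block(cur, pred)
print(diff[0])   # [3, 3, 3, 3, 3, 3, 3, 3]
```

When the motion vector matches the true displacement, the difference block holds only small residual values, which is what makes it cheaper to code than the original block.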
In FIGS. 2 and 3, the DCT step 12 is performed on the signal components associated with the luminance and color values for each processing sample block and provides an output for each block, representing components of each pel of a block as values within a defined range. The range of an output block may be, for example, (-2048, +2048), with the higher values typically positioned toward the top left corner of each block and the values decreasing in the direction of arrow 14 toward the lower right corner of each block.
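For illustration, an 8×8 DCT can be written directly from the definition of the two-dimensional DCT-II. The orthonormal scaling used below is an assumption, since the text does not state a normalization, and the quadruple loop favors clarity over speed.

```python
import math

N = 8

def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct, unoptimized form)."""
    def c(k):
        # Normalization factor; the orthonormal choice is an assumption.
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

# A flat block has all of its energy in the top-left (DC) coefficient.
flat = [[100] * N for _ in range(N)]
coeffs = dct_2d(flat)
print(round(coeffs[0][0], 6))   # 800.0
```

With this scaling, the top-left coefficient of a flat block is eight times the pel value, and every other coefficient is zero up to rounding error.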
In the quantization step 16, each DCT value in the block is mapped to a value within a fixed set of quantizer values, based on a quantization scheme. As discussed in more detail below, a preferred quantization scheme in accordance with H.263+ and MPEG-4 video standards involves a step of calculating new values for luminance and color components of each pel of a block. Following the calculation step, a selecting step is carried out, in which approximate values are selected from predefined quantizer values closest to the original input signal values.
In most low bit rate video coding schemes, the step of calculating new values Y' in the quantization process first determines a level for the Y data value of each pel in the block, using the following equation:

$$\text{level} = \mathrm{INT}\left[\frac{Y + \text{offset}}{2\,QP}\right]$$

where INT denotes the integer part of the bracketed expression.
Y could be a value of luminance or color, offset is preferably $-QP/2$, and QP is a quantization coefficient representing the quantization step size, preferably selected from the range $1 \le QP \le 31$. The QP value is determined using well-known linear bit rate control techniques, and it depends on the number of bits allowed by the bit constraints of the system and on the desired image quality.
From the calculated level for each pel, a Y' value is determined, using the following equation:

$$Y' = (2\,QP \times \text{level}) + QP$$
In this manner, a Y' value is calculated for each Y value in the block and each Y' value is a multiple of QP, based on the corresponding Y value.
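The two equations can be sketched in a few lines. This is a minimal illustration, not a standard-conformant quantizer: it assumes INT truncates toward zero, uses the preferred offset of -QP/2, and covers only non-negative Y values of at least QP/2, because the text does not spell out how negative inputs or a level of zero are handled. The function names are illustrative.

```python
def quantize_level(y, qp):
    """level = INT[(Y + offset) / (2*QP)], with offset = -QP/2.

    Assumes y >= qp / 2; sign and zero handling are outside this sketch.
    """
    offset = -qp / 2
    return int((y + offset) / (2 * qp))   # int() truncates toward zero

def reconstruct(level, qp):
    """Y' = (2*QP * level) + QP."""
    return 2 * qp * level + qp

for qp in (20, 2):
    level = quantize_level(110, qp)
    print(qp, level, reconstruct(level, qp))
# 20 2 100
# 2 27 110
```

For Y = 110, the sketch yields Y' = 100 at QP = 20 and Y' = 110 at QP = 2, so the smaller step size reproduces the input exactly in this case.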
In typical quantization systems, in accordance with the H.263 and MPEG-4 standards, the Y' value is used for selecting a quantizer value from a set of quantizer values represented by the standard quantizer $R_0$, where each quantizer value is a multiple of QP. The standard quantizer $R_0$ is defined as follows:

$$R_0 \in \{0, \pm 3QP, \pm 5QP, \pm 7QP, \ldots\}$$
Each Y' value is mapped onto the closest quantizer value selected from the set $R_0$ and, thus, each block of Y values is mapped to a corresponding block of quantizer values.
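The nearest-value mapping onto the standard quantizer set can be sketched as below. The upper bound of 255 on the QP multiple and the tie-breaking rule (prefer the smaller magnitude) are assumptions made for this illustration, and the function names are hypothetical.

```python
def r0_values(qp, max_multiple=255):
    """Quantizer values {0, +-3QP, +-5QP, ...} up to +-max_multiple*QP."""
    positives = [m * qp for m in range(3, max_multiple + 1, 2)]
    return sorted([-v for v in positives] + [0] + positives)

def nearest_quantizer_value(y_prime, qp, max_multiple=255):
    """Return the member of the set closest to y_prime.

    Ties go to the value of smaller magnitude (an assumed rule).
    """
    return min(r0_values(qp, max_multiple),
               key=lambda v: (abs(v - y_prime), abs(v)))

print(nearest_quantizer_value(100, 20))   # 100
print(nearest_quantizer_value(20, 20))    # 0
```

A Y' value that is already an odd multiple of QP of at least 3QP maps to itself, while a value of QP falls back to 0, the nearest member of the set.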
For example, if the Y value for a particular pel is +110 and the selected QP value is 20, then the above equations would render the following values:

$$\text{level} = \mathrm{INT}\left[\frac{110 - 20/2}{2(20)}\right] = \mathrm{INT}[2.5] = 2$$

$$Y' = 2(20) \times 2 + 20 = 100 = 5QP$$

If the value of Y is chosen to be within the range $-2048 \le Y \le 2048$, then the quantizer $R_0$ of estimated Y' values must have a QP-multiple which represents the lowest possible value (-2048) as well as the highest possible value (2048). Using the standard QP-multiples set $R_0$, the quantizer $R_0$ would have to extend to $\pm 101QP$ and is represented as follows:

$$R_0 \in \{0, \pm 3QP, \pm 5QP, \pm 7QP, \ldots, \pm 101QP, \ldots, \pm 255QP\}$$
The calculated Y' value of 5QP happens to be identical to a QP-multiple value within the set $\{0, \pm 3QP, \pm 5QP, \pm 7QP, \ldots, \pm 101QP, \ldots, \pm 255QP\}$. In this example the values above $\pm 101QP$ are not used, and the Y value 110 of the pel was mapped to the Y' value 5QP. However, the accuracy of the coded information suffers to some extent, as illustrated by the fact that the Y' value of 100 differs from the original Y value of 110 by almost 10% of the Y value.
In the second example, with the same Y value of 110 as above, but with the QP value of 2:

$$\text{level} = \mathrm{INT}\left[\frac{110 - 2/2}{2(2)}\right] = \mathrm{INT}[27.25] = 27$$

and

$$Y' = 2(2) \times 27 + 2 = 110 = 55QP.$$
Thus, in the second example, the Y value 110 of the pel was mapped to the Y' value of 55QP and the calculated Y' value of 110 is equal to the original Y value.
With the relatively low QP value of 2, the number of possible QP-multiples is significantly greater than in the first example. In particular, with $-2048 \le Y \le +2048$ and with QP=2, the standard quantizer $R_0$ of possible Y' values uses the values up to $\pm 255QP$, as follows: $\{0, \pm 3QP, \pm 5QP, \pm 7QP, \ldots, \pm 255QP\}$. Thus, with the QP value of 2, the number of possible Y' values is increased with respect to the first example, in which the QP value was 20. Accordingly, the accuracy of the Y' value is greater in the second example than in the first example.
The number of bits required to code the possible Y' value is a function of the absolute value of the input signal I: the higher the magnitude of the signal, the more bits it takes to represent the possible Y' values. Because the level of 27 in the second example is significantly greater than the level of 2 in the first example, the number of bits needed to code the possible Y' value is greater in the second example than in the first. Therefore, the QP value should be selected based on the number of bits that can be transmitted over the transmission channel and the desired image quality, because more accurate coding (mapping) of pel values occurs with lower QP values. Thus, lower QP values may be used in certain portions of the video frame in which higher accuracy is desired, such as portions of the frame in which movement occurs, than in the other portions of the frame. Also, different QP values could be used for luminance and chrominance blocks.
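As a rough proxy for this trade-off, the sketch below counts the bits needed to write a level magnitude in plain binary. This is only an assumption made for illustration: actual code lengths depend on the variable length code table in use, which this sketch does not model.

```python
def magnitude_bits(level):
    """Bits needed to write |level| in plain binary (at least one bit)."""
    return max(1, abs(level).bit_length())

print(magnitude_bits(2))    # 2
print(magnitude_bits(27))   # 5
```

Under this proxy, the level obtained with QP = 2 costs more than twice as many magnitude bits as the level obtained with QP = 20 for the same input value of 110.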
As a result of the DCT step 12, the Y' values toward the lower right corner of the block will tend to be small and, if QP is high enough, the Y' values for many of the pels toward that corner will be zero, as shown in FIG. 4, where an "x" represents a value other than zero and "0" represents a value of zero.
After the quantization step 16, the pel values in the block are coded in a variable length coding step 18 with a suitable entropy coding scheme, such as Huffman coding. Huffman coding is preferred because of its low cost, high accuracy and efficiency. With this coding, the pel values can be coded with a relatively short binary code sequence, and the strings of zeroes (called Runs), which tend to occur toward the lower right corner of the quantized block, are coded together with the code for the next pel in one binary string. The efficiency can be further enhanced by coding the data for each pel in the zig-zag sequence represented by the arrow 20 in FIG. 4.
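The zig-zag scan and the grouping of zero runs with the next non-zero value can be sketched as follows. The scan order generated here is the conventional zig-zag pattern, assumed to correspond to arrow 20 in FIG. 4. The (run, value) pairs are a simplification of the symbols that a real coder would pass to the entropy coding stage, and the function names are illustrative.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zig-zag order."""
    order = []
    for s in range(2 * n - 1):                       # s = row + col (anti-diagonal)
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        diagonal = [(r, s - r) for r in rows]        # top-right to bottom-left
        # Even diagonals are walked bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diagonal) if s % 2 == 0 else diagonal)
    return order

def run_level_pairs(block):
    """Scan the block in zig-zag order and emit (run_of_zeros, value) pairs."""
    pairs, run = [], 0
    for r, c in zigzag_order(len(block)):
        value = block[r][c]
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs                                     # trailing zeros are dropped here

block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 5, 3, -1
print(run_level_pairs(block))   # [(0, 5), (0, 3), (1, -1)]
```

Because the non-zero values cluster toward the top left corner, the zig-zag order visits them early and leaves one long run of zeros at the end, which this sketch simply omits.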
At the receiver end, the Huffman-coded transmission is decoded to obtain the reconstructed values for the coded block of Y' values. Afterwards, calculations reversing the above-discussed calculations are performed on the Y' values to obtain representations of the original Y values. A VLC table used for the encoding and decoding in the H.263+ standard is given in FIGS. 5a and 5b.
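The encode and decode round trip can be illustrated with a toy Huffman code built from symbol frequencies. This is a sketch only: it does not reproduce the VLC table of FIGS. 5a and 5b, the symbols are hypothetical (run, value) pairs, and the table is derived from the message itself rather than fixed in advance as it would be in a standardized coder.

```python
import heapq
from collections import Counter

def build_huffman_table(symbols):
    """Build a prefix-code table {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:                        # degenerate case: one distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tie-breaker, {symbol: partial code})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)
        w2, _, t2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in t1.items()}
        merged.update({s: "1" + c for s, c in t2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, table):
    """Concatenate the code words of the symbols into one bit string."""
    return "".join(table[s] for s in symbols)

def decode(bits, table):
    """Recover the symbol sequence from a bit string using the same table."""
    inverse = {code: s for s, code in table.items()}
    out, current = [], ""
    for b in bits:
        current += b
        if current in inverse:                # prefix property: first match is a symbol
            out.append(inverse[current])
            current = ""
    return out

events = [(0, 5), (0, 3), (1, -1), (0, 3), (0, 3)]
table = build_huffman_table(events)
bits = encode(events, table)
print(decode(bits, table) == events)   # True
```

The most frequent symbol receives the shortest code word, which is the property that lets a variable length code shorten the transmitted bit stream.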