1. Field of the Invention
The invention relates to a quantization matrix adjusting method, and more particularly to a quantization matrix adjusting method for avoiding underflow of a video buffer verifier (VBV) by scaling up the quantization matrix.
2. Description of the Related Art
Compression methods are typically used to reduce the amount of picture data, for example by coding pictures with MPEG. The basic unit of coding within a picture is the macroblock. If the sampling is in 4:2:0 format, there are six blocks within a macroblock: four Y blocks, one Cb block and one Cr block. Each block is first transformed by a DCT (discrete cosine transform), and the DCT coefficients are then quantized into integers. A zigzag scan or an alternate scanning method is used to arrange the two-dimensional quantized coefficients into a one-dimensional sequence. Finally, variable-length coding (VLC) is employed for entropy coding.
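The zigzag arrangement described above can be sketched as follows. This is a minimal illustration only, assuming the conventional zigzag order for a square block; the function names zigzag_order and zigzag_scan are hypothetical and do not come from the MPEG specification, and the alternate scan is not covered.

```python
def zigzag_order(n=8):
    """Return the (v, u) index pairs of an n x n block in zigzag order.

    Assumption: v is the row (vertical) index and u the column index.
    """
    def key(cell):
        v, u = cell
        diagonal = v + u  # cells on one anti-diagonal share v + u
        # Odd diagonals are walked with v increasing, even ones with u increasing.
        return (diagonal, v if diagonal % 2 else u)

    cells = [(v, u) for v in range(n) for u in range(n)]
    return sorted(cells, key=key)


def zigzag_scan(block):
    """Flatten a square two-dimensional block into a one-dimensional list."""
    return [block[v][u] for v, u in zigzag_order(len(block))]
```

Because low-frequency positions come first in this order, the trailing part of the scanned sequence tends to hold the zeros produced by quantization, which suits the subsequent variable-length coding.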
The quantization step is where the compression happens. Generally, the quantization of a DCT coefficient F[v][u] can be represented as:
QF[v][u]=16*F[v][u]/(Q*W[v][u])  (1)
where v and u are the indexes of the two-dimensional matrix, each ranging from 0 to 7. Q is the quantizer scale for the blocks within a macroblock, which can vary from macroblock to macroblock. W[v][u] is a quantization matrix defined for the whole picture, which assigns a weighting factor to each DCT coefficient. FIG. 1 shows the default quantization matrix for intra blocks defined in MPEG-2. Referring to FIG. 1, W[v][u] becomes larger as the v and u indexes increase, so the quantized coefficients of higher frequencies more easily become zero. This is based on studies of the human visual system, which show that humans are more sensitive to lower-frequency signals and less sensitive to higher-frequency signals.
The inverse-quantization step is used to recover the DCT coefficients. Generally, the inverse-quantization of a quantized DCT coefficient QF[v][u] is defined as:
F′[v][u]=QF[v][u]*Q*W[v][u]/16  (2)
where F′[v][u] is the recovered DCT coefficient. The difference between the original DCT coefficient F[v][u] and the inverse-quantized DCT coefficient F′[v][u] is called quantization error, which is defined as:
E[v][u]=F[v][u]−F′[v][u]  (3)
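Equations (1) to (3) can be illustrated for a single coefficient with the short sketch below. The text states only that the coefficients are quantized into integers, so truncation toward zero is an assumption made here for simplicity; the function names are hypothetical, and the recovered coefficient is kept as a floating-point value rather than being rounded as a real decoder would.

```python
def quantize(f, q, w):
    """Equation (1): QF = 16*F/(Q*W), truncated toward zero (assumed rounding)."""
    return int(16 * f / (q * w))


def dequantize(qf, q, w):
    """Equation (2): F' = QF*Q*W/16."""
    return qf * q * w / 16


def quantization_error(f, q, w):
    """Equation (3): E = F - F'."""
    return f - dequantize(quantize(f, q, w), q, w)
```

For example, with F=100, Q=4 and W=27 this sketch gives QF=14, F′=94.5 and E=5.5. With F=10, Q=8 and W=83 the quantized value is zero, so the whole coefficient becomes quantization error.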
The adaptive quantization adopted in MPEG-1 and MPEG-2 can freely change Q from one macroblock to the next, whereas MPEG-4 allows only a slight adjustment of Q between adjacent macroblocks. MPEG-2 has a non-linear quantizer scale that maps the quantizer scale code (from 1 to 31) to a real quantizer scale (from 0.5 to 56).
Adjusting the quantizer scale Q controls the bit consumption and the coded quality of a macroblock. A larger Q makes the quantized DCT coefficients smaller, and more of them become zero, so the coded bit stream after VLC becomes shorter. The consequence, however, is that the quantization error becomes larger and the decoded image quality becomes worse. If a better image quality is desired, a smaller Q can be set to reduce the quantization error, but the coded bit stream becomes longer.
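The trade-off described above can be made concrete with a small numerical sketch. The coefficient and weight values used in the note after the code are invented for illustration, the function names are hypothetical, and truncation toward zero is again an assumed rounding rule. The count of nonzero quantized values stands in for bit consumption, since the actual VLC tables are not modeled.

```python
def quantize_block(coeffs, weights, q):
    """Apply equation (1) to each coefficient, truncating toward zero (assumed)."""
    return [int(16 * f / (q * w)) for f, w in zip(coeffs, weights)]


def reconstruct_block(levels, weights, q):
    """Apply equation (2) to each quantized value."""
    return [level * q * w / 16 for level, w in zip(levels, weights)]


def tradeoff(coeffs, weights, q):
    """Return (number of nonzero quantized values, summed absolute error)."""
    levels = quantize_block(coeffs, weights, q)
    recon = reconstruct_block(levels, weights, q)
    nonzero = sum(1 for level in levels if level != 0)
    total_error = sum(abs(f - r) for f, r in zip(coeffs, recon))
    return nonzero, total_error
```

With the illustrative coefficients [400, -120, 60, 25, -10, 4] and weights [8, 16, 19, 22, 26, 27], raising Q from 2 to 31 reduces the nonzero count from 6 to 3 while the summed absolute error grows from 1.75 to 101.6875.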
In typical applications, the bit rate of an MPEG bit stream is constrained. For example, the DVD standard specifies that the bit rate of an MPEG-2 video stream cannot exceed 9.8 Mb/sec, so the encoder has to control the bit-rate consumption to satisfy the constraint.
The picture content varies from picture to picture in a video sequence, and the coding complexity of one portion of a video sequence may differ from that of other portions. The video buffer verifier (hereinafter called VBV) employed in MPEG-1 offers some flexibility in bit-rate consumption for different pictures, but the overall bit rate over the long term should be a constant bit rate (hereinafter called CBR). MPEG-2 introduces the variable bit rate (VBR) operation mode to provide more flexibility in the variation of bit-rate consumption from picture to picture. The VBV buffer emulates the input buffer of the MPEG decoder, and the bit stream produced by an MPEG encoder must not violate the constraints on the VBV buffer; otherwise the bit stream cannot be properly decoded. The CBR operation of the VBV is shown by example in FIG. 2. The figure depicts the fullness of the decoder buffer over time. The sloped line segments show the compressed data entering the buffer at a constant bit rate. The vertical line segments show the instantaneous removal from the buffer of the data associated with the decoded picture.
For the bit stream to satisfy the MPEG rate control requirement, the data for a picture must already be available in the buffer when the decoder has to decode the picture, and the decoder buffer must not overfill. Referring to FIG. 2, if a picture consumes too many bits, the VBV buffer may underflow; therefore the upper bound of the allocatable bits is UB. Similarly, if a picture uses too few bits, the VBV buffer may overflow, and thus the lower bound of the allocatable bits is LB. The VBR operation of the VBV is shown by example in FIG. 3. The difference between VBR and CBR is that the compressed bit stream enters the buffer at a specified maximum bit rate until the buffer is full, at which point no more bits are input. This translates to a bit rate entering the buffer that may be effectively variable, up to the maximum specified rate. As shown in FIG. 3, there is only a constraint on the maximum allocatable bits UB, and no constraint on the minimum.
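The CBR and VBR behavior described above can be sketched with a simplified buffer model. The sketch assumes that the channel delivers a fixed number of bits per picture interval, that a picture is removed instantaneously at its decoding time, and that the fullness is clamped after a violation so that the simulation can continue. All names and the clamping behavior are assumptions for illustration and are not taken from the MPEG specification.

```python
def allocatable_bounds(fullness, bits_per_interval, buffer_size, vbr=False):
    """Return (LB, UB) for the picture about to be removed from the buffer."""
    upper = fullness  # a picture needing more bits than are buffered underflows
    if vbr:
        lower = 0  # in VBR the input simply stops when the buffer is full
    else:
        # In CBR the picture must free enough room for the next interval's input.
        lower = max(0, fullness + bits_per_interval - buffer_size)
    return lower, upper


def simulate_vbv(picture_bits, bits_per_interval, buffer_size,
                 initial_fullness, vbr=False):
    """Return a list of (picture index, 'underflow' or 'overflow') violations."""
    fullness = initial_fullness
    events = []
    for index, bits in enumerate(picture_bits):
        lower, upper = allocatable_bounds(
            fullness, bits_per_interval, buffer_size, vbr)
        if bits > upper:
            events.append((index, "underflow"))
        elif bits < lower:
            events.append((index, "overflow"))
        # Remove the picture, then fill for one picture interval (clamped).
        fullness = min(buffer_size, max(0, fullness - bits) + bits_per_interval)
    return events
```

With a buffer of 1000 bits, 300 bits arriving per interval, an initial fullness of 600 bits and pictures of [200, 700, 400, 10, 10, 10] bits, this model reports an underflow at picture 2 in both modes and an overflow at picture 5 in CBR mode only.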
To satisfy the VBV constraint, the encoder has to allocate a bit budget for each picture and then try to keep the bits actually consumed close to the allocated budget. Typically, a virtual buffer mechanism is employed to control the bit consumption. A buffer occupancy lower than zero maps to the quantizer scale 1, and a buffer occupancy higher than a threshold R maps to the quantizer scale 31. R is called the reaction factor. Before coding a picture, the virtual buffer has an initial occupancy D=d0=R*q0/31, and this occupancy corresponds to a quantizer scale Q=q0. This Q is used to quantize and encode the first macroblock. If the consumed bits exceed the average bit budget for a macroblock, the virtual buffer occupancy D increases; otherwise D decreases. If the buffer occupancy D rises above d0+R/31, the quantizer scale Q becomes q0+1. This means that the excess bit consumption is too large, so the quantizer scale is increased to reduce the rate of bit consumption. If the buffer occupancy D falls below d0−R/31, the quantizer scale Q becomes q0−1. This means that the bit consumption is lower than expected, so the quantizer scale is decreased to increase the rate of bit consumption.
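The virtual buffer mechanism described above can be sketched as follows. This is one possible reading of the text, assuming a linear mapping in which Q changes by one step for every R/31 of occupancy away from d0 and is clamped to the range 1 to 31. The class and method names are hypothetical, and a practical encoder may use a different mapping or rounding rule.

```python
class VirtualBuffer:
    """Simplified virtual buffer for per-macroblock quantizer scale control."""

    def __init__(self, reaction, q0):
        self.reaction = reaction       # reaction factor R
        self.q0 = q0                   # quantizer scale at the start of the picture
        self.d0 = reaction * q0 / 31   # initial occupancy d0 = R*q0/31
        self.occupancy = self.d0       # current occupancy D

    def quantizer_scale(self):
        # Whole steps of R/31 away from d0, truncated toward zero (assumed).
        steps = int((self.occupancy - self.d0) * 31 / self.reaction)
        return max(1, min(31, self.q0 + steps))

    def update(self, consumed_bits, budget_bits):
        """Account for one coded macroblock and return Q for the next one."""
        self.occupancy += consumed_bits - budget_bits
        return self.quantizer_scale()
```

For example, with R=31000 and q0=10 the initial occupancy is 10000. Two macroblocks that each exceed their budget by 500 bits raise D to 11000, which is d0+R/31, and the quantizer scale returned for the next macroblock is 11.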
The problem with using the virtual buffer mechanism to control the bit consumption is how to assign the initial buffer occupancy d0. This initial occupancy can be treated as an estimate of the coding complexity of the current picture. With a fixed bit budget, if the coding complexity of the current picture is relatively high, a higher initial occupancy should be selected so that the quantizer scale for each macroblock is higher. However, the coding complexity of the current picture is not known until the picture is actually coded. One solution is to inherit the buffer occupancy of the previously coded picture. This simple technique handles most cases without serious problems, but can cause VBV underflow when the image content of a video sequence varies rapidly, such as at a scene change.
The quantizer scale for a macroblock can be selected from 1 to 31, and this dynamic range can handle most cases of video coding. However, if a video sequence is very complex and the target bit rate is low, VBV underflow may be unavoidable: even if the quantizer scale for each macroblock is set to 31, too many bits are still consumed by each macroblock. U.S. Pat. No. 5,801,779 introduces a “Panic Mode” to overcome this situation. The basic idea is that the encoder monitors the VBV buffer occupancy after coding each macroblock. If too many bits have been consumed and VBV underflow would occur after coding the whole picture, the encoder enters the Panic Mode. In this mode, the encoder prefers to encode a macroblock in inter-mode, and no residuals are coded. If a macroblock is coded in intra-mode (the macroblocks of an I-picture must be intra-mode), only a few coefficients are coded and the other coefficients are set to zero to reduce the bits. The picture quality degrades significantly, but VBV underflow can be avoided.