Most video applications seek the highest possible perceptual quality for a given set of bit rate constraints. For instance, in low bit rate applications, such as a videophone system, a video encoder may provide higher quality by eliminating the strong visual artifacts at the regions of interest that are visually more important. On the other hand, in higher bit rate applications, visually lossless quality is expected everywhere in the pictures and a video encoder needs to also achieve transparent visual quality. One challenge in obtaining transparent visual quality in high bit rate applications is to preserve details, especially at smooth regions where the loss of details is more visible than at the non-smooth regions because of the texture masking property of the human visual system.
Increasing the bit rate is one of the most straightforward approaches for improving quality. When the bit rate is given, an encoder manipulates its bit allocation module to spend the available bits where the most visual quality improvement can be obtained. In non-real-time applications such as DVD authoring, the video encoder can use a variable-bit-rate (VBR) design to produce video with a constant quality over time for both difficult and easy video content. In such applications, the available bits are appropriately distributed over the different video segments to obtain constant quality. In contrast, a constant-bit-rate (CBR) system assigns the same number of bits to an interval of one or more pictures regardless of their encoding difficulty and produces visual quality that varies with the video content. For both VBR and CBR encoding systems, an encoder can allocate bits within a picture according to perceptual models. One characteristic of human perception is texture masking, which explains why human eyes are more sensitive to loss of quality in smooth regions than in textured regions. This property can be utilized to increase the number of bits allocated to smooth regions in order to obtain a high visual quality.
The quantization process in a video encoder controls the number of encoded bits and the quality. It is common to adjust the quality by adjusting the quantization parameters (QPs). The quantization parameters may include a quantization step size, a rounding offset, and a scaling matrix. In the current prior art and existing standards, the quantization parameter values are sent explicitly in the bitstream. The encoder has the flexibility to tune quantization parameters and signal the quantization parameters to the decoder. However, the quantization parameter signaling disadvantageously incurs an overhead cost.
One important aspect in improving perceptual quality is to preserve the fine details, such as film grain and computer-generated noise. This is especially important in smooth areas, where the loss of fine details is highly noticeable. A common approach in existing algorithms is to encode these smooth regions, or the video segments that include smooth regions, at finer quantization step sizes. Although this approach is common to the current state of the art across many standards, the following description uses the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-2 (MPEG-2) Standard reference software Test Model, Version 5 (hereinafter referred to as “TM5”) to illustrate how higher quality is obtained for smooth regions within a picture.
In TM5, a spatial activity measure is computed for macroblock (MB) j from four 8×8 luminance frame-organized sub-blocks (n=1, . . . , 4) and four luminance field-organized sub-blocks (n=5, . . . , 8) using the original pixel values as follows:
$$\mathrm{act}_j = 1 + \min(\mathrm{vblk}_1, \mathrm{vblk}_2, \ldots, \mathrm{vblk}_8), \tag{1}$$

where

$$\mathrm{vblk}_n = \frac{1}{64} \times \sum_{k=1}^{64} \left(P_k^n - P_{\mathrm{mean}}^n\right)^2, \tag{2}$$

and

$$P_{\mathrm{mean}}^n = \frac{1}{64} \times \sum_{k=1}^{64} P_k^n, \tag{3}$$

where $P_k^n$ represents the sample values in the nth original 8×8 block. $\mathrm{act}_j$ is then normalized as follows:
$$\mathrm{N\_act}_j = \frac{2 \times \mathrm{act}_j + \mathrm{avg\_act}}{\mathrm{act}_j + 2 \times \mathrm{avg\_act}}, \tag{4}$$

where avg_act is the average value of $\mathrm{act}_j$ of the previous encoded picture. On the first picture, avg_act is set to 400. TM5 then obtains $\mathrm{mquant}_j$ as follows:

$$\mathrm{mquant}_j = Q_j \times \mathrm{N\_act}_j, \tag{5}$$

where $Q_j$ is a reference quantization parameter. The final value of $\mathrm{mquant}_j$ is clipped to the range [1 . . . 31] and is used to indicate the quantization step size during encoding.
Therefore, in a TM5 quantization scheme, a smooth macroblock with a smaller variance has a smaller value of the spatial activity measure $\mathrm{act}_j$ and a smaller value of $\mathrm{N\_act}_j$, as well as a finer quantization step size indexed by $\mathrm{mquant}_j$. With finer quantization for a smooth macroblock, finer details can be preserved and a higher perceptual quality is obtained. The index $\mathrm{mquant}_j$ is sent in the bitstream to the decoder.
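The TM5 computation of Equations (1) through (5) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the TM5 reference code: the function names are hypothetical, the macroblock is assumed to be a list of 16 rows of 16 luminance samples, the field-organized sub-blocks are assumed to be cut from the even and odd lines of the macroblock, and the rounding of mquant to an integer (round half up) is an assumption, since the text above specifies only the clipping range.

```python
def block_variance(block):
    """Variance of a flat list of 64 samples, per Eqs. (2) and (3)."""
    mean = sum(block) / 64.0
    return sum((p - mean) ** 2 for p in block) / 64.0


def sub_blocks(mb):
    """Split a 16x16 macroblock (16 rows of 16 samples) into four
    frame-organized and four field-organized 8x8 blocks (flattened)."""
    def cut(rows):
        # rows holds 8 lines of 16 samples: return left and right 8x8 blocks
        left = [p for r in rows for p in r[:8]]
        right = [p for r in rows for p in r[8:]]
        return [left, right]

    frame = cut(mb[:8]) + cut(mb[8:])      # n = 1..4
    field = cut(mb[0::2]) + cut(mb[1::2])  # n = 5..8 (even lines, odd lines)
    return frame + field


def spatial_activity(mb):
    """act_j of Eq. (1): one plus the smallest sub-block variance."""
    return 1.0 + min(block_variance(b) for b in sub_blocks(mb))


def normalized_activity(act, avg_act=400.0):
    """N_act_j of Eq. (4); avg_act defaults to the first-picture value."""
    return (2.0 * act + avg_act) / (act + 2.0 * avg_act)


def mquant(q_ref, n_act):
    """mquant_j of Eq. (5), clipped to [1, 31].
    Rounding half up to an integer is an assumption of this sketch."""
    return max(1, min(31, int(q_ref * n_act + 0.5)))
```

For a perfectly flat macroblock every sub-block variance is zero, so act_j is 1 and, with avg_act = 400, N_act_j is 402/801, or roughly 0.5. The quantization step index is therefore about half the reference value, which is the finer quantization for smooth regions that the text describes.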
The syntax in the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) Moving Picture Experts Group-4 (MPEG-4) Part 10 Advanced Video Coding (AVC) Standard/International Telecommunication Union, Telecommunication Sector (ITU-T) H.264 Recommendation (hereinafter the “MPEG-4 AVC Standard”) also allows quantization parameters to be different for each picture and macroblock. The value of a quantization parameter is an integer in the range of 0 to 51. The initial value for each slice can be derived from the syntax element pic_init_qp_minus26. The initial value is modified at the slice layer when a non-zero value of slice_qp_delta is decoded, and is modified further when a non-zero value of mb_qp_delta is decoded at the macroblock layer.
Mathematically, the initial quantization parameters for the slice are computed as follows:

$$\mathrm{SliceQP}_Y = 26 + \mathrm{pic\_init\_qp\_minus26} + \mathrm{slice\_qp\_delta} \tag{6}$$
At the macroblock layer, the value of the quantization parameter is derived as follows:

$$\mathrm{QP}_Y = \mathrm{QP}_{Y,\mathrm{PREV}} + \mathrm{mb\_qp\_delta} \tag{7}$$

where $\mathrm{QP}_{Y,\mathrm{PREV}}$ is the quantization parameter of the previous macroblock in decoding order in the current slice.
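As an illustration, Equations (6) and (7) can be applied in sequence to recover the quantization parameter of every macroblock in a slice. The sketch below follows the two equations as written; the function names are hypothetical, the predictor for the first macroblock of the slice is assumed to be the slice-level value, and out-of-range results are simply rejected rather than handled as the full standard would.

```python
def slice_qp(pic_init_qp_minus26, slice_qp_delta):
    """SliceQP_Y of Eq. (6)."""
    return 26 + pic_init_qp_minus26 + slice_qp_delta


def macroblock_qps(slice_qp_y, mb_qp_deltas):
    """Apply Eq. (7) to each macroblock in decoding order.

    Assumption: the first macroblock of the slice uses SliceQP_Y
    as its predictor QP_Y,PREV."""
    qps = []
    prev = slice_qp_y
    for delta in mb_qp_deltas:
        qp = prev + delta
        if not 0 <= qp <= 51:
            # Simplification of this sketch: reject values outside 0..51.
            raise ValueError("quantization parameter out of range: %d" % qp)
        qps.append(qp)
        prev = qp  # becomes QP_Y,PREV for the next macroblock
    return qps
```

For example, with pic_init_qp_minus26 = 0 and slice_qp_delta = 2 the slice starts at 28, and the delta sequence 0, -2, 3 yields macroblock quantization parameters 28, 26, and 29.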
Turning to FIG. 1, a typical quantization adjustment method for improving the perceptual quality in a video encoder is indicated generally by the reference numeral 100. The method 100 includes a start block 105 that passes control to a function block 110. The function block 110 analyzes the input video content, and passes control to a loop limit block 115. The loop limit block 115 begins a loop over each macroblock in a picture using a variable i having a range from 1 to the number of macroblocks (MBs), and passes control to a function block 120. The function block 120 adjusts a quantization parameter for a current macroblock i, and passes control to a function block 125. The function block 125 encodes the quantization parameter and the macroblock i, and passes control to a loop limit block 130. The loop limit block 130 ends the loop over each macroblock, and passes control to an end block 199. Hence, in method 100, the quantization parameter adjustment is explicitly signaled. Regarding function block 120, the quantization parameter for the macroblock i is adjusted based on its content and/or the previous encoding results. For example, the quantization parameter is lowered for a smooth macroblock to improve the perceptual quality. In another example, if the previous macroblocks used more bits than were assigned to them, the quantization parameter of the current macroblock is increased so that it consumes fewer bits than originally assigned. The method 100 ends after all macroblocks in the picture are encoded.
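The two adjustment examples given for function block 120 can be sketched as a single function. Everything specific in this sketch is hypothetical and chosen only for illustration: the name adjust_qp, the variance threshold that decides whether a macroblock counts as smooth, and the sizes of the two offsets. The 0 to 51 clipping range is that of the MPEG-4 AVC Standard described above.

```python
def adjust_qp(base_qp, variance, bits_used, bits_assigned,
              smooth_threshold=16.0, smooth_offset=2, budget_offset=1):
    """Illustrative version of function block 120.

    base_qp        -- starting quantization parameter for the macroblock
    variance       -- content measure for the current macroblock
    bits_used      -- bits consumed by the previously encoded macroblocks
    bits_assigned  -- bits that were budgeted for those macroblocks
    The threshold and offsets are hypothetical tuning values.
    """
    qp = base_qp
    if variance < smooth_threshold:
        # Smooth content: quantize more finely to preserve detail.
        qp -= smooth_offset
    if bits_used > bits_assigned:
        # Over budget so far: quantize more coarsely to save bits.
        qp += budget_offset
    return max(0, min(51, qp))
```

The two rules are independent, so a smooth macroblock encoded while the picture is over budget receives both offsets. A real encoder would typically scale the offsets with the degree of smoothness and the size of the overshoot instead of using fixed steps.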
Turning to FIG. 2, a typical method for decoding a quantization parameter and macroblock in a video decoder is indicated generally by the reference numeral 200. The method 200 includes a start block 205 that passes control to a loop limit block 210. The loop limit block 210 begins a loop over each macroblock in a picture using a variable i having a range from 1 to the number of macroblocks (MBs), and passes control to a function block 215. The function block 215 decodes the quantization parameter and a current macroblock i, and passes control to a loop limit block 220. The loop limit block 220 ends the loop over each macroblock, and passes control to an end block 299.
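The pairing of methods 100 and 200 can be illustrated with a toy round trip in which the encoder writes a differential quantization parameter alongside each macroblock and the decoder reconstructs it. This is a sketch under simplifying assumptions: the "bitstream" is a plain Python list, the macroblock payload is passed through untouched in place of real transform and entropy coding, and the function names are hypothetical.

```python
def encode_picture(macroblocks, qps, start_qp):
    """Loop of method 100: emit one (qp_delta, payload) pair per macroblock.

    qps holds the already adjusted quantization parameter of each
    macroblock; start_qp is the predictor for the first macroblock."""
    stream = []
    prev = start_qp
    for mb, qp in zip(macroblocks, qps):
        stream.append((qp - prev, mb))  # explicit, differential signaling
        prev = qp
    return stream


def decode_picture(stream, start_qp):
    """Loop of method 200: recover the quantization parameter of each
    macroblock from the signaled deltas."""
    decoded = []
    prev = start_qp
    for delta, mb in stream:
        qp = prev + delta
        decoded.append((qp, mb))
        prev = qp
    return decoded


def signaled_changes(stream):
    """Number of macroblocks that carry a non-zero delta."""
    return sum(1 for delta, _ in stream if delta != 0)
```

Every macroblock entry carries a delta field, and each non-zero delta costs additional bits in a real bitstream. The more often the encoder adapts the quantization parameter to the content, the larger this signaling cost becomes.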
In summary, and as previously described, the existing standards support adjusting picture-level and macroblock-level quantization parameters in the encoder to achieve high perceptual quality. The quantization parameter values are absolutely or differentially encoded and are thus explicitly sent in the bitstream. The encoder has the flexibility to tune quantization parameters and signal the quantization parameters to the decoder. However, the explicit quantization parameter signaling disadvantageously incurs an overhead cost.