The present invention relates to digital image and video signal processing, and more particularly to block transformation and/or quantization plus inverse quantization and/or inverse transformation.
Various applications for digital video communication and storage exist, and corresponding international standards have been and are continuing to be developed. Low bit rate communications, such as video telephony and conferencing, plus large video file compression, such as motion pictures, led to various video compression standards: H.261, H.263, MPEG-1, MPEG-2, AVS, and so forth. These compression methods rely upon the discrete cosine transform (DCT) or an analogous transform plus quantization of transform coefficients to reduce the number of bits required to encode.
DCT-based compression methods decompose a picture into macroblocks, where each macroblock contains four 8×8 luminance blocks plus two 8×8 chrominance blocks, although other block sizes and transform variants could be used. FIG. 2a depicts the functional blocks of DCT-based video encoding. In order to reduce the bit rate, an 8×8 DCT is used to convert the 8×8 blocks (luminance and chrominance) into the frequency domain. Then the 8×8 blocks of DCT coefficients are quantized, scanned into a 1-D sequence, and coded using variable length coding (VLC). For predictive coding in which motion compensation (MC) is involved, inverse quantization and an IDCT are needed for the feedback loop. Except for MC, all the function blocks in FIG. 2a operate on an 8×8 block basis. The rate-control unit in FIG. 2a is responsible for generating the quantization step (qp), within an allowed range and according to the target bit rate and buffer fullness, to control the DCT-coefficient quantization unit. Indeed, a larger quantization step implies more vanishing and/or smaller quantized coefficients, which means fewer and/or shorter codewords and consequently smaller bit rates and files.
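By way of illustration only, the scanning of a quantized 8×8 block into a 1-D sequence may be sketched as follows. The text above does not state which scan pattern is used; the zigzag order shown here is an assumption, chosen because it is commonly paired with 8×8 DCT coding, and the names `zigzag_order` and `scan_block` are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order.

    The zigzag pattern is an assumed scan; the text only says the block
    is "scanned into a 1-D sequence".
    """
    order = []
    for s in range(2 * n - 1):  # s = row + col selects one anti-diagonal
        rows = range(max(0, s - n + 1), min(s, n - 1) + 1)
        # Even anti-diagonals are walked from bottom-left to top-right,
        # odd anti-diagonals from top-right to bottom-left.
        if s % 2 == 0:
            rows = reversed(rows)
        order.extend((r, s - r) for r in rows)
    return order


def scan_block(block):
    """Flatten a square block of quantized coefficients into a 1-D list."""
    return [block[r][c] for r, c in zigzag_order(len(block))]
```

With such an ordering, the low-frequency coefficients come first and the coefficients zeroed by a large quantization step tend to cluster at the end of the sequence, which is what makes short variable length codewords possible.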
There are two kinds of coded macroblocks. An INTRA-coded macroblock is coded independently of previous reference frames. In an INTER-coded macroblock, the motion compensated prediction block from the previous reference frame is first generated for each block (of the current macroblock), and then the prediction error block (i.e., the difference block between the current block and the prediction block) is encoded.
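As a minimal sketch, the prediction error block of an INTER-coded block is the element-wise difference between the current block and its prediction. How the motion compensated prediction is obtained is outside this sketch, and the function names below are hypothetical.

```python
def prediction_error_block(current, prediction):
    """Element-wise difference between the current block and its prediction."""
    return [[cur - pred for cur, pred in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


def reconstruct_block(prediction, error):
    """Decoder-side counterpart: add the (decoded) error back to the prediction."""
    return [[pred + err for pred, err in zip(pred_row, err_row)]
            for pred_row, err_row in zip(prediction, error)]
```

Without quantization, `reconstruct_block(prediction, prediction_error_block(current, prediction))` returns the current block exactly; with quantization of the error block the reconstruction is only approximate.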
For INTRA-coded macroblocks, the first (0,0) coefficient in an INTRA-coded 8×8 DCT block is called the DC coefficient, and the remaining 63 DCT coefficients in the block are AC coefficients; for INTER-coded macroblocks, all 64 DCT coefficients of an INTER-coded 8×8 DCT block are treated as AC coefficients. The DC coefficients may be quantized with a fixed value of the quantization step, whereas the AC coefficients have quantization steps adjusted according to the bit rate control, which compares the bits used so far in the encoding of a picture to the allocated number of bits to be used. Further, a quantization matrix (e.g., as in MPEG-4) allows for varying quantization steps among the DCT coefficients.
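The different treatment of DC and AC coefficients may be sketched as follows. This is an illustration under stated assumptions: the fixed DC step of 8, the division-and-round quantizer, and the name `quantize_block` are all hypothetical and are not taken from any particular standard; a quantization matrix is omitted for brevity.

```python
def quantize_block(F, ac_step, intra, dc_step=8):
    """Quantize a block of transform coefficients.

    ac_step : step chosen by rate control, applied to all AC coefficients
    intra   : True for an INTRA-coded block, whose (0,0) entry is the DC term
    dc_step : fixed DC step (hypothetical value, for illustration only)
    """
    # Every coefficient is first treated as AC (the INTER case).
    q = [[round(value / ac_step) for value in row] for row in F]
    if intra:
        # Only INTRA blocks single out the (0,0) coefficient as DC.
        q[0][0] = round(F[0][0] / dc_step)
    return q
```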
In particular, the 8×8 two-dimensional DCT is defined as:
$$F(u,v) = \frac{1}{4}\, C(u)\, C(v) \sum_{x=0}^{7} \sum_{y=0}^{7} f(x,y) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
where f(x,y) is the input 8×8 sample block and F(u,v) is the output 8×8 transformed block, with u, v, x, y = 0, 1, . . . , 7; and
$$C(u),\, C(v) = \begin{cases} \dfrac{1}{\sqrt{2}} & \text{for } u, v = 0 \\ 1 & \text{otherwise} \end{cases}$$
Note that this transforming has the form of 8×8 matrix multiplications, $F = D^{t} \times f \times D$, where “×” denotes 8×8 matrix multiplication and D is the 8×8 matrix with u,x element equal to
$$C(u) \cos\frac{(2x+1)u\pi}{16}.$$
The transform is performed in double precision, and the final transform coefficients are rounded to integer values.
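For concreteness, the 8×8 forward DCT defined above can be evaluated directly from its double sum. The following is a minimal, unoptimized sketch using double precision floats with a final rounding to integers; the names `c` and `dct_8x8` are hypothetical, and Python's built-in `round` is assumed as the rounding rule because the text does not specify one.

```python
import math


def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, and 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def dct_8x8(f):
    """Direct evaluation of the 8x8 2-D DCT; f[x][y] is the input sample block."""
    F = [[0] * 8 for _ in range(8)]
    for u in range(8):
        for v in range(8):
            total = 0.0
            for x in range(8):
                for y in range(8):
                    total += (f[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            # Double precision result rounded to an integer coefficient.
            F[u][v] = round(0.25 * c(u) * c(v) * total)
    return F
```

For a constant input block with every sample equal to 100, only the DC coefficient is nonzero, and it equals (1/4)(1/2)(64)(100) = 800.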
Next, define the quantization of the transform coefficients as
$$QF(u,v) = \frac{F(u,v)}{QP}$$
where QP is the quantization factor computed in double precision from the quantization step, qp, as an exponential such as $QP = 2^{qp/6}$. The quantized coefficients are rounded to integer values and are encoded.
The corresponding inverse quantization becomes:
$$F'(u,v) = QF(u,v) \cdot QP$$
with double precision values rounded to integer values.
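The quantization and inverse quantization just defined may be sketched as below, taking the example factor QP = 2^(qp/6) from the text. The function names are hypothetical, and Python's built-in `round` is an assumed rounding rule, since the text only says that values are rounded to integers.

```python
def quant_factor(qp):
    """Quantization factor QP computed in double precision as 2**(qp/6)."""
    return 2.0 ** (qp / 6.0)


def quantize(F, qp):
    """QF(u,v) = F(u,v) / QP, rounded to integers."""
    QP = quant_factor(qp)
    return [[round(value / QP) for value in row] for row in F]


def dequantize(QF, qp):
    """F'(u,v) = QF(u,v) * QP, rounded to integers."""
    QP = quant_factor(qp)
    return [[round(value * QP) for value in row] for row in QF]
```

With qp = 12 the factor is 4, so the coefficients 800, -13, 5 quantize to 200, -3, 1 and are restored as 800, -12, 4; the differences from the original values are the quantization error. Raising qp to 24 (factor 16) drives the coefficient 5 to zero, which illustrates how a larger step yields more vanishing coefficients.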
Lastly, the inverse transformation (reconstructed sample block) is:
$$f'(x,y) = \frac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} C(u)\, C(v)\, F'(u,v) \cos\frac{(2x+1)u\pi}{16} \cos\frac{(2y+1)v\pi}{16}$$
again with double precision values rounded to integer values.
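The inverse transformation can likewise be evaluated directly from its double sum. The sketch below is unoptimized, and its names (`c`, `idct_8x8`) and its use of Python's built-in `round` are assumptions made for illustration.

```python
import math


def c(k):
    """Normalization factor C(k): 1/sqrt(2) for k = 0, and 1 otherwise."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def idct_8x8(Fp):
    """Direct evaluation of the 8x8 inverse DCT; Fp[u][v] holds F'(u,v)."""
    out = [[0] * 8 for _ in range(8)]
    for x in range(8):
        for y in range(8):
            total = 0.0
            for u in range(8):
                for v in range(8):
                    total += (c(u) * c(v) * Fp[u][v]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            # Double precision result rounded to an integer sample.
            out[x][y] = round(0.25 * total)
    return out
```

A coefficient block whose only nonzero entry is F'(0,0) = 800 reconstructs to a constant block with every sample equal to (1/4)(1/2)(800) = 100.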
Various more recent video compression methods, such as the H.264 and AVS standards, simplify the double precision DCT method by using integer transforms in place of the DCT and/or different size blocks. Indeed, define an n×n integer transform matrix, $T_{n\times n}$, with elements analogous to the 8×8 DCT transform coefficients matrix D. Then, with $f_{n\times n}$ and $F_{n\times n}$ denoting the input n×n sample data matrix (block of pixels or residuals) and the output n×n transform-coefficients block, respectively, define the forward n×n integer transform as:
$$F_{n\times n} = T^{t}_{n\times n} \times f_{n\times n} \times T_{n\times n}$$
where “×” denotes n×n matrix multiplication, and the n×n matrix $T^{t}_{n\times n}$ is the transpose of the n×n matrix $T_{n\times n}$.
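As a small worked example of the forward integer transform, the sketch below applies it with n = 4. The particular matrix `T4` is an assumption chosen for illustration: its entries are those of the well-known H.264 4×4 core transform, arranged so that the transform takes the form given above; the helper names are hypothetical.

```python
def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*m)]


# Illustrative 4x4 integer transform matrix (assumed, not from the text).
T4 = [[1,  2,  1,  1],
      [1,  1, -1, -2],
      [1, -1, -1,  2],
      [1, -2,  1, -1]]


def forward_integer_transform(f, T=T4):
    """F = T^t x f x T, computed entirely in integer arithmetic."""
    return matmul(matmul(transpose(T), f), T)
```

No floating point operations or rounding are involved. Because the columns of `T4` are orthogonal but do not have unit norm, the product of its transpose with itself is the diagonal matrix diag(4, 10, 4, 10) rather than the identity, so the missing normalization has to be accounted for elsewhere, typically in the quantization step.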
For example, as in other existing video standards, in H.264 the smallest coding unit is a macroblock which contains four 8×8 luminance blocks plus two 8×8 chrominance blocks from the two chrominance components. However, as shown in FIG. 3, in H.264 the 8×8 blocks are further divided into 4×4 blocks for transform plus quantization, which leads to a total of twenty-four 4×4 blocks for a macroblock. After the integer transform, the four DC values from each of two chrominance components are pulled together to form two chrominance DC blocks, on which an additional 2×2 transform plus quantization is performed. Similarly, if a macroblock is coded in INTRA 16×16 mode, the sixteen DC values of the sixteen 4×4 luminance blocks are put together to create a 4×4 luminance DC block, on which 4×4 luminance DC transform plus quantization is carried out.
Therefore, in H.264 there are three kinds of transform plus quantization, namely, 4×4 transform plus quantization for the twenty-four luminance/chrominance blocks; 2×2 transform plus quantization for the two chrominance DC blocks; and 4×4 transform plus quantization for the luminance DC block if the macroblock is coded in INTRA 16×16 mode.
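The block bookkeeping and the additional 2×2 chrominance DC transform may be sketched as follows. The text does not give the 2×2 transform matrix; the 2×2 Hadamard matrix `H2` used here is an assumption, and all function names are hypothetical.

```python
def blocks_per_macroblock():
    """Count the 4x4 blocks in one macroblock of four luminance and two
    chrominance 8x8 blocks, each 8x8 block splitting into four 4x4 blocks."""
    luminance = 4 * 4
    chrominance = 2 * 4
    return luminance + chrominance


def matmul(a, b):
    """Plain integer matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


# Assumed 2x2 transform matrix (Hadamard); symmetric, so it equals its transpose.
H2 = [[1, 1],
      [1, -1]]


def chroma_dc_transform(dc):
    """Transform the 2x2 block formed from the DC values of the four 4x4
    blocks of one chrominance component: H2 x dc x H2."""
    return matmul(matmul(H2, dc), H2)
```

For example, DC values [[1, 2], [3, 4]] transform to [[10, -2], [-4, 0]]; the (0,0) output is simply the sum of the four DC values.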
The quantization of the transformed coefficients may use exponentials of the quantization step as above or may use lookup tables with integer entries. The inverse quantization mirrors the quantization. The inverse transform also uses $T_{n\times n}$ and its transpose, analogous to the DCT using D and its transpose for both the forward and inverse transforms.
However, these alternative methods still have computational complexity which could be reduced while maintaining performance.