It is known from the state of the art to use a sequence of DCT and quantization for compressing digital data, for instance in order to enable an efficient transmission of this digital data. In particular a compression of digital image data is commonly achieved by using a DCT followed by a quantization of the DCT coefficients obtained by the DCT.
In a DCT of one-dimensional digital data, a respective sequence of source values of a predetermined number is transformed into transform coefficients. In video coding, the source values can be for instance pixel values or prediction error values. Each of the resulting transform coefficients represents a certain frequency range present in the source data. DCT of values f( ) into coefficients F( ) is defined as:

$$F(i) = \sqrt{\frac{2}{N}}\, C(i) \sum_{x=0}^{N-1} f(x) \cos\left(\frac{(2x+1)\, i\, \pi}{2N}\right), \quad i = 0, 1, \ldots, N-1$$

$$C(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\ 1, & k \neq 0 \end{cases}$$
In this equation, N is the predetermined number of source values in one sequence of source values.
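The one-dimensional definition above can be sketched directly in code. The following is a minimal illustration only, assuming the orthonormal scaling factor sqrt(2/N) and C(0) = 1/sqrt(2); the function name dct_1d and the sample input are hypothetical and do not come from any cited document.

```python
import math

def dct_1d(f):
    """Direct evaluation of the 1-D DCT sum for a sequence f of N source values.

    Assumes the orthonormal scaling sqrt(2/N) * C(i), with C(0) = 1/sqrt(2).
    """
    n = len(f)
    coeffs = []
    for i in range(n):
        c = 1 / math.sqrt(2) if i == 0 else 1.0
        total = sum(f[x] * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                    for x in range(n))
        coeffs.append(math.sqrt(2 / n) * c * total)
    return coeffs

# A constant sequence contains only the lowest frequency, so all of its
# energy ends up in the first coefficient.
coeffs = dct_1d([10, 10, 10, 10])
```

Evaluated this way, each of the N coefficients needs N multiplications, which is the cost that the fast algorithms discussed later try to reduce.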
For compression, image data is usually provided in blocks of two-dimensional digital data. For such data, the DCT is defined as:

$$F(i,j) = \frac{2}{N}\, C(i)\, C(j) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left(\frac{(2x+1)\, i\, \pi}{2N}\right) \cos\left(\frac{(2y+1)\, j\, \pi}{2N}\right), \quad i, j = 0, 1, \ldots, N-1$$

$$C(k) = \begin{cases} \dfrac{1}{\sqrt{2}}, & k = 0 \\ 1, & k \neq 0 \end{cases}$$
DCT is a separable operation. That means that a two-dimensional DCT can be calculated with two consecutive one-dimensional DCT operations. Using the one-dimensional DCT operation is preferred because the complexity of a one-dimensional DCT is proportional to $N$, while the complexity of a two-dimensional DCT is proportional to $N^2$. For image data having a size of N×N, the total complexity of all the DCT operations is proportional to $N^3$, or to $N^2 \log(N)$ for a fast DCT. Thus large transforms, which also involve many non-trivial multiplications, are computationally very complex. Furthermore, the additionally required accuracy in bits may increase the word width. For complexity reasons, DCT is commonly performed only for a small block of values at a time, for example 4×4 or 8×8 values, which can be represented in the form of a matrix with values f( ). FIG. 1 illustrates a DCT of such a 4×4 matrix 1.
First, each row of the matrix 1 is transformed separately to form a once transformed matrix 2. In the depicted matrix 1, a separate transformation of each row is indicated by 2-headed arrows embracing all values of the respective row. Then, each column of the once transformed matrix 2 is transformed separately to form the final transformed matrix 3 comprising the transform coefficients F( ). In the depicted matrix 2, a separate transformation of each column is indicated by 2-headed arrows embracing all values of the respective column.
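The row-column procedure described above can be sketched as follows. This is an illustrative sketch only, assuming an orthonormal one-dimensional DCT; the names dct_1d and dct_2d_separable and the sample block are hypothetical.

```python
import math

def dct_1d(f):
    """Orthonormal 1-D DCT of a sequence f (assumed scaling sqrt(2/N) * C(i))."""
    n = len(f)
    return [math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
            * sum(f[x] * math.cos((2 * x + 1) * i * math.pi / (2 * n))
                  for x in range(n))
            for i in range(n)]

def dct_2d_separable(block):
    """2-D DCT computed as two passes of 1-D transforms."""
    # Pass 1: transform each row separately (the "once transformed" matrix).
    once = [dct_1d(row) for row in block]
    # Pass 2: transform each column of the once transformed matrix.
    columns = [dct_1d(list(col)) for col in zip(*once)]
    # columns[j] holds transformed column j; transpose back to row-major order.
    return [list(row) for row in zip(*columns)]

block = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
coeffs = dct_2d_separable(block)
```

For a 4×4 block this amounts to eight one-dimensional transforms of length four, four for the rows and four for the columns.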
The DCT defined by the above equation can also be written in matrix form. To this end, F(i) is first written in a more suitable form:

$$F(i) = \sum_{x=0}^{N-1} f(x)\, A(i,x), \quad i = 0, 1, \ldots, N-1$$

$$A(i,x) = \sqrt{\frac{2}{N}}\, C(i) \cos\left(\frac{(2x+1)\, i\, \pi}{2N}\right)$$
Matrix A is a matrix of DCT basis functions. A two-dimensional DCT can then be calculated with:

$$Y = A X A^{T},$$

where matrix X denotes a source value matrix, and where matrix Y denotes the transform coefficients resulting from the DCT. The superscript T of a matrix indicates that the transpose of the matrix is meant.
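The matrix formulation can be illustrated with a short sketch. It assumes the orthonormal basis matrix A(i,x) = sqrt(2/N) C(i) cos((2x+1)iπ/2N); the helper names dct_matrix, matmul, transpose and dct_2d are hypothetical, and plain nested lists are used so that no external library is needed.

```python
import math

def dct_matrix(n):
    """Matrix A of DCT basis functions, one basis function per row."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
             * math.cos((2 * x + 1) * i * math.pi / (2 * n))
             for x in range(n)]
            for i in range(n)]

def matmul(p, q):
    """Plain matrix product of two nested-list matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]

def transpose(m):
    return [list(row) for row in zip(*m)]

def dct_2d(x):
    """Two-dimensional DCT as Y = A X A^T."""
    a = dct_matrix(len(x))
    return matmul(matmul(a, x), transpose(a))

A = dct_matrix(4)
X = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
Y = dct_2d(X)
```

With this scaling the rows of A are orthonormal, so the product of A and its transpose is the identity matrix, which is what allows the transpose to serve as the inverse transform.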
After DCT, the actual compression is achieved by quantization of the DCT coefficients. Quantization is achieved by dividing the transform coefficients by quantization values that depend on a quantization parameter qp:

$$Y'(i,j) = Y(i,j) / Q(qp)(i,j),$$

where Q(qp) is a quantization matrix, and where Y′(i,j) constitute quantized coefficients. The simplest form of quantization is uniform quantization, where the quantization matrix is populated with one constant, for example: Q(qp)(i,j) = qp.
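Uniform quantization as described above can be sketched in a few lines. Rounding to the nearest integer is an assumption made for this illustration, since the text only specifies the division; the names quantize and uniform_q and the sample coefficients are hypothetical.

```python
def uniform_q(n, qp):
    """Quantization matrix populated with the single constant qp."""
    return [[qp] * n for _ in range(n)]

def quantize(y, q):
    """Y'(i,j) = Y(i,j) / Q(i,j), rounded to an integer (rounding is assumed)."""
    return [[round(y[i][j] / q[i][j]) for j in range(len(y[i]))]
            for i in range(len(y))]

# Hypothetical transform coefficients of a 2x2 block.
Y = [[35.0, -4.46],
     [-17.8, 0.3]]
Yq = quantize(Y, uniform_q(2, 4))
```

Small coefficients are mapped to zero, which is where most of the compression gain comes from once the quantized values are entropy coded.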
The quantized coefficients constitute compressed digital data which has for example, after the encoding and possible further processing steps, a convenient form for transmission of said data.
When the compressed data is to be presented again after storing and/or transmission, it has first to be decompressed again.
Decompression is performed by reversing the operations done during compression. Thus, the quantized coefficients Y′(i,j) are inverse quantized in a first step by multiplying the quantized coefficients by the values of the quantization matrix:

$$Y(i,j) = Y'(i,j)\, Q(qp)(i,j)$$
Next, the dequantized but still transformed coefficients Y(i,j) are inverse transformed in a second step by an inverse discrete cosine transform (IDCT):

$$X = A^{T} Y A,$$

where matrix Y denotes, as in the DCT, the transformed coefficients, and where matrix X denotes the regained source value matrix.
If infinite precision is used for all calculations, X will contain exactly the original pixel values. In practice, however, the coefficients are converted to integer values at least after quantization and inverse transform. As a result, the original pixels cannot be exactly reconstructed. The more compression is achieved, the greater the deviation from the original pixels.
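The complete chain of transform, quantization, dequantization and inverse transform can be sketched as follows, to illustrate how the reconstruction error grows with coarser quantization. This is a minimal sketch assuming an orthonormal DCT matrix, uniform quantization and rounding to the nearest integer; all function names and the sample block are hypothetical.

```python
import math

def dct_matrix(n):
    """Matrix A of DCT basis functions (orthonormal scaling assumed)."""
    return [[math.sqrt(2 / n) * (1 / math.sqrt(2) if i == 0 else 1.0)
             * math.cos((2 * x + 1) * i * math.pi / (2 * n))
             for x in range(n)]
            for i in range(n)]

def matmul(p, q):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*q)]
            for row in p]

def transpose(m):
    return [list(row) for row in zip(*m)]

def roundtrip(x, qp=None):
    """DCT, optional uniform quantization/dequantization, then IDCT."""
    a = dct_matrix(len(x))
    at = transpose(a)
    y = matmul(matmul(a, x), at)              # forward: Y = A X A^T
    if qp is not None:
        # Quantize (divide and round), then dequantize (multiply back).
        y = [[round(v / qp) * qp for v in row] for row in y]
    return matmul(matmul(at, y), a)           # inverse: X = A^T Y A

def mse(x, r):
    """Mean squared deviation between original and reconstructed block."""
    n = len(x)
    return sum((x[i][j] - r[i][j]) ** 2
               for i in range(n) for j in range(n)) / (n * n)

X = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
exact = roundtrip(X)                          # no quantization
errors = {qp: mse(X, roundtrip(X, qp)) for qp in (1, 8, 16)}
```

Without quantization the block is recovered up to floating-point precision, whereas for this sample block the error at qp = 16 is clearly larger than at qp = 1.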
If the above described DCT and IDCT are implemented in a straightforward manner, each conversion requires several multiplications, additions and/or subtractions. These operations, however, require on the one hand a significant amount of processor time, and on the other hand, multiplications are quite expensive operations with respect to circuit area in some architectures. In order to be able to transmit for example high quality motion displays, it is thus desirable to have a conversion process which requires fewer multiplication steps without reducing the quality of the data regained in decompression.
Since the DCT is also a central operation in many image coding standards, it has been widely used, and a variety of solutions for the stated problem has been described in literature. These solutions generally feature the “butterfly operation” and/or combine some calculations from the operator matrix to the quantization step at the end of the DCT process.
U.S. Pat. No. 5,523,847 describes, for example, a digital image processor for color image compression. In order to reduce the number of non-trivial multiplications in DCT, it is proposed in this document to factor the transform matrix in a way that decreases the number of non-trivial multiplications, non-trivial multiplications being multiplications or divisions by a factor other than a power of two. The trivial multiplications can be realized by bit-shifting, hence the name ‘trivial’. More specifically, the transform matrix is factored into a diagonal factor and a scaled factor such that the diagonal factor can be absorbed into a later quantization step and the scaled factor can be multiplied by a data vector with a minimum of non-trivial multiplications. In addition, it is proposed that remaining non-trivial multiplications are approximated by multiplications by rational numbers, since the computation can then be achieved with only additions, subtractions and shift operations. This leads to a problem in IDCT, however, since with the approximation, an exact inverse transform may no longer exist for the transform. Therefore, repeating the DCT-IDCT process may result in severe deterioration in image quality. This may happen, e.g., when the image is transmitted several times over a communications link where DCT compression is utilized.
Another approach is described in a document by Gisle Bjontegaard: “H.26L Test Model Long Term Number 7 (TML-7) draft0”, ITU Video Coding Experts Group, 13th Meeting, Austin, Tex., USA, 2-4 Apr., 2001. This document describes a DCT solution constituting the current test model for a compression method for ITU-T recommendation H.26L.
According to this document, instead of DCT, an integer transform can be used which has basically the same coding property as a 4×4 DCT. In the integer transform, four transform coefficients are obtained from four source data pixels respectively by four linear equations summing the pixels with predetermined weights. The transform is followed or preceded by a quantization/dequantization process, which performs a normal quantization/dequantization. Moreover, a normalization is required, since it is more efficient to transmit data having a normalized distribution than to transmit random data. Since the transform does not contain a normalization, the quantization/dequantization additionally carries out the normalization, which is completed by a final shift after the inverse transform. The quantization/dequantization uses 32 different quality parameter (QP) values, which are arranged in such a way that there is an increase in the step size of about 12% from one QP to the next. A disadvantage of this approach is that it requires 32-bit arithmetic and a rather high number of operations.
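The effect of a step size increase of about 12% per QP can be made concrete with a small calculation. The sketch below assumes an idealized geometric progression with a ratio of exactly 1.12, which only approximates whatever table the test model actually uses; the names RATIO and relative_step are hypothetical.

```python
RATIO = 1.12  # assumed idealization of "about 12%" per QP step

def relative_step(qp):
    """Step size at a given QP relative to the step size at QP 0."""
    return RATIO ** qp

# Growth of the step size over six QP increments and over the full
# range of 32 values (QP 0 to QP 31).
doubling = relative_step(6)
span = relative_step(31)
```

Under this assumption the step size roughly doubles every six QP values, so the 32 values cover a range of step sizes of somewhat more than thirty to one.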
Another document proceeds from the cited TML-7 document: “A 16-bit architecture for H.26L, treating DCT transforms and quantization”, Document VCEG-M16, Video Coding Experts Group (VCEG) 13th meeting, Austin, USA, 2-4 Apr., 2001, by Jie Liang, Trac Tran, and Pankaj Topiwala. This VCEG-M16 document mainly addresses a 4×4 transform, and proposes a fast approximation of the 4-point DCT, named the binDCT, for the H.26L standard. This binDCT can be implemented with only addition and right-shift operations. The proposed solution can be implemented to be fully invertible for lossless coding.
The proposed binDCT is based on the known Chen-Wang plane rotation-based factorization of the DCT matrix. For a 16-bit implementation of the binDCT, a lifting scheme is used to obtain a fast approximation of the DCT. Each employed lifting step is a biorthogonal transform, and its inverse also has a simple lifting structure. This means that, to invert a lifting step, what was added in the forward transform is subtracted out. Hence, the original signal can still be perfectly reconstructed even if the floating-point multiplication results are rounded to integers in the lifting steps, as long as the same procedure is applied in both the forward and the inverse transform.
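The reason why rounding inside lifting steps does not prevent perfect reconstruction can be illustrated with a generic pair of lifting steps. This sketch is not the binDCT itself; the coefficients P and U are arbitrary hypothetical values, and the function names are likewise hypothetical.

```python
import math

P = 0.4142  # hypothetical lifting coefficient
U = 0.7071  # hypothetical lifting coefficient

def lift_forward(a, b):
    """Two lifting steps on integers, each with a rounded multiplication."""
    b = b - math.floor(P * a + 0.5)   # step 1 changes only b
    a = a + math.floor(U * b + 0.5)   # step 2 changes only a
    return a, b

def lift_inverse(a, b):
    """Undo the steps in reverse order by subtracting what was added."""
    a = a - math.floor(U * b + 0.5)   # b is unchanged, so the same value is removed
    b = b + math.floor(P * a + 0.5)   # a is restored, so the same value is re-added
    return a, b

pairs = [(a, b) for a in range(-20, 21) for b in range(-20, 21)]
recovered = [lift_inverse(*lift_forward(a, b)) for a, b in pairs]
```

Each step modifies one value using a rounded function of the other, unmodified value, so the inverse can recompute exactly the same rounded quantity and remove it again.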
To obtain a fast implementation, the floating-point lifting coefficients are further approximated by rational numbers in the format of $k/2^m$, where k and m are integers, which can be implemented by only shift and addition operations. To further reduce the complexity of the lifting-based fast DCT, a scaled lifting structure is used to represent the plane rotation. The scaling factors can be absorbed in the quantization stage.
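Multiplication by a rational number of the form k/2^m using only shifts and additions can be sketched as follows. The decomposition of k into its binary digits is one straightforward way of doing this and is not claimed to be the method of the cited document; the function name mul_dyadic and the example values are hypothetical.

```python
def mul_dyadic(x, k, m):
    """Compute floor(x * k / 2**m) with shifts and additions only (k >= 0)."""
    acc = 0
    bit = 0
    while k >> bit:
        if (k >> bit) & 1:
            acc += x << bit      # add x * 2**bit for every set bit of k
        bit += 1
    return acc >> m              # divide by 2**m with an arithmetic right shift

# Example: 13/32 = 0.40625 as a hypothetical coefficient approximation.
result = mul_dyadic(100, 13, 5)
```

A coefficient whose numerator k has few set bits needs correspondingly few additions, which is why such approximations are attractive for hardware and low-power implementations.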
The solution proposed in the VCEG-M16 document requires only 16-bit operations, assuming that the source values are 9-bit values, and fewer operations than the solution of the TML-7 document. More specifically, for a 1-D DCT of four data values it requires 10 additions and 5 shifts.
Further documents relating to image data compression are for instance the following, the contents of which are only addressed briefly:
U.S. Pat. No. 6,189,021, granted Feb. 13, 2001, proposes to employ a set of scaled weighting coefficients in the intrinsic multiplication stage of a six-stage DCT fast algorithm for one of two one-dimensional DCT operations so that a corresponding stage of the DCT fast algorithm for the other one of the one-dimensional DCT operations can be omitted.
U.S. Pat. No. 5,129,015, granted Jul. 7, 1992, makes use of a method similar to DCT but employing a simpler arithmetic for compressing still images without multiplication.
U.S. Pat. No. 5,572,236, granted Nov. 5, 1996, relates to a digital image processor for color image compression which minimizes the number of non-trivial multiplications in the DCT process by rearranging the DCT process such that non-trivial multiplications are combined in a single process step.
PCT application WO 01/31906, published May 3, 2001, relates to a transform-based image compression framework called local zerotree coding.