1. Field of the Invention
This invention relates in general to data processing, and more particularly to faster discrete cosine transforms using scaled terms.
2. Description of Related Art
Transforms, which take data from one domain (e.g., sampled data) to another (e.g., frequency space), are used in many signal and/or image processing applications. Such transforms are used for a variety of applications, including, but not limited to, data analysis, feature identification and/or extraction, signal correlation, data compression, and data embedding. Many of these transforms require efficient implementation for real-time and/or fast execution.
Data compression is desirable in many data handling processes, where too much data is present for practical applications using the data. Commonly, compression is used in communication links, to reduce transmission time or required bandwidth. Similarly, compression is preferred in data storage systems, including digital printers and copiers, where “pages” of a document to be printed may be stored temporarily in memory. Here the amount of media space can be substantially reduced with compression. Generally speaking, scanned images, i.e., electronic representations of hard copy documents, are often large, and thus make desirable candidates for compression.
It is well-known in the art to use a discrete cosine transform for data compression. In particular examples referred to herein, the terms images and image processing will be used. However, those skilled in the art will recognize that the present invention is not meant to be limited to processing images but is applicable to processing different data, such as audio data, scientific data, image data, etc.
In combination with other techniques, such as color subsampling, quantization, Huffman coding and run-length coding, the discrete cosine transform can compress a digital color image by a factor of approximately thirty to one with virtually no noticeable image degradation. Because of its usefulness in data compression, the discrete cosine transform is an integral part of several data compression standards used by international standards committees such as the International Standards Organization.
DCT (Discrete Cosine Transform), disseminated by the JPEG committee, is a lossy compression system which reduces data redundancies based on pixel-to-pixel correlations. Generally, an image does not change very much on a pixel-to-pixel basis and therefore has what is known as “natural spatial correlation”. In natural scenes, correlation is generalized, but not exact. Noise makes each pixel somewhat different from its neighbors. Moreover, signal and data processing frequently needs to convert the input data into transform coefficients for the purposes of analysis. Often only a quantized version of the coefficients is needed (e.g., JPEG/MPEG data compression or audio/voice compression). Many such applications, such as the generation of JPEG data for high-speed printers, must be performed quickly and in real time.
Generally, an example of a JPEG DCT compression and decompression system may be had by referencing the Encyclopedia of Graphics File Formats, by J. D. Murray and W. vanRyper, pp. 159–171 (1994, O'Reilly & Associates, Inc.). Further description of the draft JPEG standard may be found, for example, in “JPEG Still Image Data Compression Standard,” by W. Pennebaker and J. Mitchell, 1993 (Van Nostrand Reinhold, New York) or “Discrete Cosine Transform: Algorithms, Advantages and Applications,” by K. Rao and P. Yip, 1990 (Academic Press, San Diego).
The two-dimensional discrete cosine transform is a pair of mathematical equations that transforms one N1×N2 array of numbers to or from another N1×N2 array of numbers. The first array typically represents a square N×N array of spatially determined pixel values which form the digital image. The second array is an array of discrete cosine transform coefficients which represent the image in the frequency domain. This method of representing the image by the coefficients of its frequency components is a special case of the discrete Fourier transform. The discrete Fourier transform is the discrete version of the classic mathematical Fourier transform wherein any periodic waveform may be expressed as a sum of sine and cosine waves of different frequencies and amplitudes. The discrete cosine transform, like the Fourier transform, is thus a transform which transforms a signal from the time domain into the frequency domain and vice versa. With an input image, A, the coefficients for the output “image,” B, are:

$$B(k_1, k_2) = \sum_{i=0}^{N_1-1} \sum_{j=0}^{N_2-1} 4\, A(i, j)\, \cos\left[\frac{\pi k_1}{2 N_1}(2i+1)\right] \cos\left[\frac{\pi k_2}{2 N_2}(2j+1)\right]$$
For an image, the input is N2 pixels wide by N1 pixels high; A(i, j) is the intensity of the pixel in row i and column j; B(k1,k2) is the DCT coefficient in row k1 and column k2 of the DCT matrix. All DCT multiplications are real. This lowers the number of required multiplications, as compared to the discrete Fourier transform. For most images, much of the signal energy lies at low frequencies; these appear in the upper left corner of the DCT. The lower right values represent higher frequencies, and are often small enough to be neglected with little visible distortion.
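The double sum above can be evaluated term by term. The following is a minimal sketch in Python, assuming the image is held as a list of N1 rows of N2 values; the function name `dct2_direct` and the flat test image are illustrative assumptions and do not come from the cited references. It is a direct evaluation, not a fast algorithm.

```python
import math

def dct2_direct(A):
    """Evaluate B(k1, k2) directly from the double sum (unnormalized, factor 4)."""
    n1, n2 = len(A), len(A[0])  # N1 rows (height), N2 columns (width)
    B = [[0.0] * n2 for _ in range(n1)]
    for k1 in range(n1):
        for k2 in range(n2):
            total = 0.0
            for i in range(n1):
                ci = math.cos(math.pi * k1 * (2 * i + 1) / (2 * n1))
                for j in range(n2):
                    cj = math.cos(math.pi * k2 * (2 * j + 1) / (2 * n2))
                    total += 4 * A[i][j] * ci * cj
            B[k1][k2] = total
    return B

# Hypothetical input: a flat image 4 pixels high and 6 pixels wide.
flat = [[10] * 6 for _ in range(4)]
B = dct2_direct(flat)
```

For a flat image, all of the signal energy falls in the upper left coefficient B(0, 0) and every other coefficient is zero up to rounding error, which is the limiting case of the low-frequency concentration described above.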
There are two basic discrete cosine transform equations. The first basic equation is the forward discrete cosine transform which transforms the pixel values into the discrete cosine transform coefficients. The second basic equation is the inverse discrete cosine transform which transforms the discrete cosine transform coefficients back into pixel values. Most applications of the discrete cosine transform for images use eight-by-eight arrays wherein N therefore has a value of eight. Assuming then that N has the value of eight when performing the transforms, where f(x, y) are the values of the pixel array and F(u, v) are the values of the discrete cosine transform coefficients, the equations of the discrete cosine transforms are as follows.
The formula for the 2D discrete cosine transform is given by:

$$F(u, v) = \frac{C_u C_v}{4} \sum_{x=0}^{7} \sum_{y=0}^{7} f(x, y)\, \cos\left(\frac{(2x+1)\, u \pi}{16}\right) \cos\left(\frac{(2y+1)\, v \pi}{16}\right)$$

where x, y = spatial coordinates in the spatial domain (0, 1, 2, . . . 7); u, v = coordinates in the transform domain (0, 1, 2, . . . 7); $C_u = 1/\sqrt{2}$ for u = 0, otherwise 1; and $C_v = 1/\sqrt{2}$ for v = 0, otherwise 1.

The separable nature of the 2D DCT is exploited by performing a 1D DCT on the eight columns, and then a 1D DCT on the eight rows of the result. Several fast algorithms are available to calculate the 8-point 1D DCT.
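The separability can be checked numerically. The sketch below assumes the 8×8 block is stored as `f[x][y]` and uses a plain, non-fast 8-point 1D DCT with the factor C_u/2, so that two passes reproduce the factor C_u C_v/4 of the 2D formula. The names `dct1`, `dct2_separable` and `dct2_direct`, and the sample block, are illustrative assumptions.

```python
import math

N = 8

def c(u):
    """Normalization term: 1/sqrt(2) for index 0, otherwise 1."""
    return 1 / math.sqrt(2) if u == 0 else 1.0

def dct1(v):
    """Plain 8-point 1D DCT: F(u) = (C_u / 2) * sum_x v[x] * cos((2x+1) u pi / 16)."""
    return [c(u) / 2 * sum(v[x] * math.cos((2 * x + 1) * u * math.pi / 16)
                           for x in range(N))
            for u in range(N)]

def dct2_separable(f):
    """2D DCT as eight column transforms followed by eight row transforms."""
    # Pass 1: transform each column (x varies, y fixed); cols[y][u] holds the result.
    cols = [dct1([f[x][y] for x in range(N)]) for y in range(N)]
    g = [[cols[y][u] for y in range(N)] for u in range(N)]
    # Pass 2: transform each row of the intermediate result (y varies, u fixed).
    return [dct1(g[u]) for u in range(N)]

def dct2_direct(f):
    """Reference: evaluate the 2D formula F(u, v) term by term."""
    return [[c(u) * c(v) / 4 * sum(
                f[x][y]
                * math.cos((2 * x + 1) * u * math.pi / 16)
                * math.cos((2 * y + 1) * v * math.pi / 16)
                for x in range(N) for y in range(N))
             for v in range(N)]
            for u in range(N)]

# Hypothetical 8x8 block of sample values.
block = [[(7 * x + 3 * y * y) % 23 for y in range(N)] for x in range(N)]
max_diff = max(abs(a - b)
               for row_a, row_b in zip(dct2_separable(block), dct2_direct(block))
               for a, b in zip(row_a, row_b))
```

Both routes give the same coefficients up to floating-point rounding, but the separable form needs only sixteen 8-point 1D transforms per block, which is why a fast 1D DCT matters.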
As described above, a DCT compressor comprises mainly two parts. The first part transforms highly correlated image data into weakly correlated coefficients using a DCT, and the second part quantizes the coefficients to reduce the bit rate for transmission or storage. However, the computational burden of performing a DCT is demanding. For example, processing a one-dimensional DCT of length 8 requires 13 multiplications and 29 additions in currently known fast algorithms. The image is divided into square blocks of size 8 by 8 pixels, 16 by 16 pixels or 32 by 32 pixels. Each block is often processed by the one-dimensional DCT in row-by-row fashion followed by column-by-column. Different block sizes are selected for compression depending on the type of input and the quality required of the decompressed data.
In the article, “A Fast DCT-SQ Scheme for Images,” Trans. IEICE, Vol. E-71, No. 11, pp. 1095–1097, November 1988, Y. Arai, T. Agui, and M. Nakajima proposed that many of the DCT multiplications can be formulated as scaling multipliers to the DCT coefficients. The DCT after the multipliers are factored out is called the scaled DCT. The scaled DCT is still orthogonal but no longer normalized; the scaling factors may be restored in a subsequent quantization process. Arai, et al. have demonstrated in their article that only 5 multiplications and 29 additions are required in processing an 8-point scaled DCT.
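The idea of restoring scale factors during quantization can be illustrated with a small sketch. It does not implement the Arai, Agui, and Nakajima factorization and does not use their scale factors. As a stand-in, it treats the normalization terms C_u/2 of the plain 8-point DCT as the factored-out multipliers and assumes a hypothetical uniform quantization step of 10. All function names are illustrative.

```python
import math

N = 8
# Stand-in scale terms: the normalization C_u / 2 of the plain 8-point DCT.
# These are NOT the scale factors of the Arai et al. algorithm.
SCALE = [(1 / math.sqrt(2) if u == 0 else 1.0) / 2 for u in range(N)]

def raw_sums(x):
    """Unnormalized cosine sums: the transform with its scale terms factored out."""
    return [sum(x[n] * math.cos((2 * n + 1) * u * math.pi / 16) for n in range(N))
            for u in range(N)]

def quantize_reference(x, q):
    """Scale each coefficient first, then divide by the quantization step."""
    s = raw_sums(x)
    return [round(SCALE[u] * s[u] / q[u]) for u in range(N)]

def fold_scale(q):
    """Precompute one multiplier per coefficient combining scaling and quantization."""
    return [SCALE[u] / q[u] for u in range(N)]

def quantize_folded(x, m):
    """Apply the combined multipliers directly to the unscaled transform output."""
    s = raw_sums(x)
    return [round(s[u] * m[u]) for u in range(N)]

samples = [52, 55, 61, 66, 70, 61, 64, 73]  # hypothetical row of pixel values
steps = [10] * N                            # hypothetical quantization steps
reference = quantize_reference(samples, steps)
folded = quantize_folded(samples, fold_scale(steps))
```

Because the combined multipliers are computed once per quantization table, the per-coefficient scaling costs no additional multiplication at run time: the quantizer already performs one multiplication (or division) per coefficient.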
However, there is a need to further increase the speed of the encoder because more than half of the time in the JPEG encoder is spent in the Forward Discrete Cosine Transform (FDCT) code calculating the two-dimensional (2-D) 8×8 block of 8-bit or 12-bit samples. Currently, the 2-D FDCT is calculated by first calculating eight 1-D horizontal DCTs and then calculating another eight 1-D vertical DCTs using the currently fastest known 1-D DCT, which was first described by Arai, Agui, and Nakajima, as mentioned above. The current process takes 29 additions and 5 multiplications to calculate a scaled version of the 1-D FDCT. The scaling constants are the same for each vertical column and can finally be included in the quantization step. This prior solution saved eight multiplications per 1-D FDCT. However, as stated above, there is a need to provide a faster DCT transform.
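The operation counts quoted above can be tallied per 8×8 block. The short sketch below only multiplies the per-transform figures given in the text by the sixteen 1-D DCTs of the row-column method; the function name `ops_2d_8x8` is an illustrative assumption, and the tally ignores the quantization multiplications, which are needed in either case.

```python
def ops_2d_8x8(adds_1d, mults_1d, passes=16):
    """Total (additions, multiplications) for an 8x8 block built from 16 1-D DCTs."""
    return passes * adds_1d, passes * mults_1d

scaled = ops_2d_8x8(29, 5)         # scaled 1-D DCT: 29 additions, 5 multiplications
unscaled = ops_2d_8x8(29, 13)      # unscaled fast 1-D DCT: 29 additions, 13 multiplications
saved_mults = unscaled[1] - scaled[1]
```

Under these figures a block costs 464 additions and 80 multiplications with the scaled transform, against 464 additions and 208 multiplications without scaling, a saving of 128 multiplications per block.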
It can also be seen that there is a need to provide a method and apparatus for performing discrete cosine transforms with fewer addition and multiplication steps to increase throughput of an encoder.