1. Field of the Invention
This invention relates to a method and apparatus for efficiently scaling data which has been multi-dimensionally transformed from the real domain.
2. Description of the Related Art
Many types of data, such as radar data, oil well log data and digital image data, can consume a large amount of computer storage space. For example, computerized digital image files can require in excess of 1 MB. Therefore, several formats have been developed which manipulate the data in order to compress it. The discrete cosine transform (DCT) is a known technique for data compression and underlies a number of compression standards.
The mathematical function for a DCT in one dimension is:
$$\tilde{s}(k) = c(k) \sum_{n=0}^{N-1} s(n) \cos\frac{\pi(2n+1)k}{2N}, \tag{1}$$
where $s$ is the array of $N$ original values, $\tilde{s}$ is the array of $N$ transformed values and the constants $c$ are given by
$$c(k) = \sqrt{\frac{1}{N}} \ \text{for } k = 0; \qquad c(k) = \sqrt{\frac{2}{N}} \ \text{for } k > 0.$$
Taking the manipulation of image data as an example, blocks of data consisting of 8 rows by 8 columns of data samples are frequently operated upon during image resizing processes. Therefore, a two-dimensional DCT calculation is necessary. The equation for a two-dimensional DCT where N=8 is:
$$\tilde{s}(i,j) = c(i,j) \sum_{n=0}^{7} \sum_{m=0}^{7} s(m,n) \cos\frac{\pi(2m+1)i}{16} \cos\frac{\pi(2n+1)j}{16}, \tag{2}$$
where $s$ is an 8×8 matrix of 64 values, $\tilde{s}$ is an 8×8 matrix of 64 coefficients and the constants $c(i,j)$ are given by
$$c(i,j) = \frac{1}{8} \ \text{when } i = 0 \text{ and } j = 0; \qquad c(i,j) = \frac{1}{4\sqrt{2}} \ \text{if } i = 0 \text{ and } j > 0 \text{ or } i > 0 \text{ and } j = 0; \qquad c(i,j) = \frac{1}{4} \ \text{when } i, j > 0.$$
Because data is taken from the “real” or spatial image domain and transformed into the DCT domain by equations (1) and (2), these DCT operations are referred to as forward Discrete Cosine Transforms (FDCT), or forward transforms.
As previously mentioned, the DCT can be used as an image compression technique which underlies a number of compression standards. These include the well-known Joint Photographic Experts Group (JPEG) and the Moving Picture Experts Group (MPEG) standards. Comprehensive references on the JPEG and MPEG standards include JPEG Still Image Data Compression Standard by William B. Pennebaker and Joan L. Mitchell (©1993 Van Nostrand Reinhold), and MPEG Video Compression Standard by Joan L. Mitchell, William B. Pennebaker, et al. (©1997 Chapman & Hall).
Looking at the JPEG method, for example, there are five basic steps. Again taking the example of the manipulation of image data, the first step is to extract an 8×8 pixel block from the image. The second step is to calculate the FDCT for each block. Third, a quantizer rounds off the DCT coefficients according to the specified image quality. Fourth, the quantized, two-dimensional 8×8 block of DCT coefficients is reordered into a one-dimensional vector according to a zigzag scan order. Fifth, the coefficients are compressed using an entropy encoding scheme such as Huffman coding or arithmetic coding. The final compressed data is then written to the output file.
Returning to the first step, source image samples are grouped into 8×8 data matrices, or blocks. The initial image data is frequently converted from RGB color space to a luminance/chrominance color space, such as YUV. YUV is a color space scheme that stores an image's luminance (brightness) separately from its chrominance (color information). Because the human eye is more sensitive to luminance than to chrominance, more of an image's chrominance information can be discarded than of its luminance information.
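The color space conversion described above can be sketched as follows. This is only an illustration: it assumes the YCbCr variant of a luminance/chrominance space with the weights used by the JFIF file format, rather than any particular YUV definition, and the function name `rgb_to_ycbcr` is hypothetical.

```python
def rgb_to_ycbcr(r, g, b):
    """Convert one 8-bit RGB pixel to (Y, Cb, Cr).

    Assumption: JFIF-style full-range weights; other luminance/chrominance
    spaces (such as analog YUV) use different scale factors.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b                  # luminance
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128.0    # blue-difference chrominance
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128.0     # red-difference chrominance
    return y, cb, cr


# A neutral gray pixel carries no color information, so both chrominance
# values sit at the mid-point offset.
y, cb, cr = rgb_to_ycbcr(128, 128, 128)
```

Separating the two kinds of information in this way is what allows the chrominance components to be treated more coarsely than the luminance component in later steps.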
Once an 8×8 data block has been extracted from the original image and is in the desired color scheme, the DCT coefficients are computed. The 8×8 matrix is entered into the DCT algorithm, and transformed into 64 unique, two-dimensional spatial frequencies thereby determining the input block's spectrum.
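As a minimal sketch of this computation, the following Python evaluates the one-dimensional transform of equation (1) term by term and applies it to the rows and then the columns of a block to obtain the two-dimensional transform of equation (2). The row-column decomposition assumes that c(i,j) is the product c(i)·c(j), which agrees with the constants listed for N=8. The function names are illustrative, and the direct summation is chosen for clarity rather than speed.

```python
import math


def c1(k, n_points):
    """Normalization constant c(k) of the one-dimensional DCT."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n_points)


def fdct_1d(s):
    """Forward DCT of a sequence s, summed directly as in equation (1)."""
    n_points = len(s)
    return [
        c1(k, n_points) * sum(
            s[n] * math.cos(math.pi * (2 * n + 1) * k / (2 * n_points))
            for n in range(n_points)
        )
        for k in range(n_points)
    ]


def fdct_2d(block):
    """Forward DCT of a square block: transform each row, then each column."""
    rows = [fdct_1d(list(row)) for row in block]
    cols = [fdct_1d(list(col)) for col in zip(*rows)]
    # cols[j][i] holds coefficient (i, j); transpose back to [i][j] indexing.
    return [list(row) for row in zip(*cols)]


# Usage: a perfectly flat block has a single non-zero coefficient.
flat = [[10.0] * 8 for _ in range(8)]
coeffs = fdct_2d(flat)
```

For the flat block in the usage lines, only `coeffs[0][0]` is non-zero (approximately 80, up to floating-point rounding), which illustrates how a block with little spatial variation concentrates into very few coefficients.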
The ultimate goal of this FDCT step is to represent the image data in a different domain using cosine functions. This can be advantageous because, for images in which the image data changes only slightly as a function of space, most of the spatial-frequency coefficients will be zero or close to zero. The image blocks are transformed into a weighted sum of cosine curves of different frequencies. Later, when these curves are put back together through an inverse step, a close approximation to the original block is restored.
After the FDCT step, the 8×8 matrix contains “transformed data” (i.e., data which is in the DCT domain) comprising 64 DCT coefficients, in which the first coefficient, commonly referred to as the DC coefficient, is proportional to the average of the original 64 values in the block. The other coefficients are commonly referred to as AC coefficients.
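The relationship between the DC coefficient and the block average can be made explicit as a short worked check, assuming the constants c(i,j) given with equation (2). Setting i = j = 0 makes both cosine factors equal to 1, so that

$$\tilde{s}(0,0) = c(0,0) \sum_{n=0}^{7} \sum_{m=0}^{7} s(m,n) = \frac{1}{8} \sum_{n=0}^{7} \sum_{m=0}^{7} s(m,n) = 8\,\bar{s}, \qquad \bar{s} = \frac{1}{64} \sum_{n=0}^{7} \sum_{m=0}^{7} s(m,n).$$

Under this normalization the DC coefficient is therefore eight times the mean sample value of the block; for example, a block whose 64 samples all equal 10 has a DC coefficient of 80.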
Up to this point in the JPEG compression process, little actual image compression has occurred. The 8×8 pixel block has simply been converted into an 8×8 matrix of DCT coefficients. The third step involves preparing the matrix for further compression by quantizing each element in the matrix. The JPEG standard gives two exemplary tables of quantization constants, one for luminance and one for chrominance. These constants were derived from experiments on the human visual system. The 64 values used in the quantization matrix are stored in the JPEG compressed data as part of the header, making dequantization of the coefficients possible. The decoder must dequantize with the same constants that the encoder used to quantize the DCT coefficients.
Each DCT coefficient is divided by its corresponding constant in the quantization table and rounded off to the nearest integer. The result of quantizing the DCT coefficients is that smaller, unimportant coefficients will disappear and larger coefficients will lose unnecessary precision. As a result of this quantization step, some of the original image quality is lost. However, the actual image data lost is often not visible to the human eye at normal magnification.
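The divide-and-round operation can be sketched as below. The table `EXAMPLE_TABLE` is hypothetical and is not one of the exemplary tables from the JPEG standard; it only mimics the general tendency of larger step sizes at higher frequencies. The function names are likewise illustrative.

```python
import math


def quantize(coeffs, table):
    """Divide each coefficient by its table entry and round to the nearest integer."""
    return [
        [int(math.floor(c / q + 0.5)) for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(coeffs, table)
    ]


def dequantize(levels, table):
    """Approximate the original coefficients by multiplying the levels back."""
    return [
        [lv * q for lv, q in zip(l_row, q_row)]
        for l_row, q_row in zip(levels, table)
    ]


# Hypothetical table (an assumption for illustration only):
# the step size grows with the horizontal and vertical frequency index.
EXAMPLE_TABLE = [[8 + 4 * (i + j) for j in range(8)] for i in range(8)]

# Usage: a large low-frequency coefficient survives with reduced precision,
# while a small high-frequency coefficient is rounded to zero.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 103.0
coeffs[7][7] = 20.0
levels = quantize(coeffs, EXAMPLE_TABLE)
restored = dequantize(levels, EXAMPLE_TABLE)
```

The difference between `coeffs` and `restored` is the information that the quantization step discards and that cannot be recovered by the decoder.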
Quantizing produces a list of streamlined DCT coefficients that can now be very efficiently compressed using either a Huffman or arithmetic encoding scheme. Thus the final step in the JPEG compression algorithm is to encode the data using an entropy encoding scheme. Before the matrix is encoded, it is arranged in a one-dimensional vector in a zigzag order. The coefficients representing low frequencies are moved to the beginning of the vector and the coefficients representing higher frequencies are placed toward the end of the vector. By placing the higher frequencies (which are more likely to be zeros) at the end of the vector, an end-of-block code can be used to replace the long trailing run of zeros, which permits better overall compression.
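The zigzag reordering and the effect of an end-of-block code can be sketched as follows. The helper names are illustrative, and `strip_trailing_zeros` only models what an end-of-block code makes unnecessary to transmit; it is not an entropy coder.

```python
def zigzag_order(size=8):
    """Return the (row, column) positions of a size x size block in zigzag order."""
    def key(pos):
        row, col = pos
        diagonal = row + col
        # Odd anti-diagonals are walked with the row increasing,
        # even anti-diagonals with the row decreasing.
        return (diagonal, row if diagonal % 2 else -row)

    positions = [(r, c) for r in range(size) for c in range(size)]
    return sorted(positions, key=key)


def to_zigzag_vector(block):
    """Flatten a square block into a one-dimensional list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


def strip_trailing_zeros(vector):
    """Drop the final run of zeros, which an end-of-block code would stand for."""
    end = len(vector)
    while end > 0 and vector[end - 1] == 0:
        end -= 1
    return vector[:end]


# Usage: a quantized block with only two non-zero low-frequency entries.
block = [[0] * 8 for _ in range(8)]
block[0][0] = 5
block[1][0] = 3
vector = to_zigzag_vector(block)
short = strip_trailing_zeros(vector)
```

In this usage example the 64-entry vector shrinks to the three leading entries, since everything after the last non-zero coefficient is covered by the end-of-block code.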
Equations (1) and (2) describe the process for performing a FDCT, i.e., taking the data from the real domain into the DCT domain. When it is necessary to reverse this step, i.e., transform the data from the DCT domain to the real domain, a DCT operation known as an Inverse Discrete Cosine Transform (IDCT), or an inverse transform, can be performed. For a one-dimensional, inverse transform, the IDCT is defined as follows:
$$s(n) = \sum_{k=0}^{N-1} c(k)\, \tilde{s}(k) \cos\frac{\pi(2n+1)k}{2N},$$
where $s$ is the array of $N$ original values, $\tilde{s}$ is the array of $N$ transformed values and the constants $c$ are given by
$$c(k) = \sqrt{\frac{1}{N}} \ \text{for } k = 0; \qquad c(k) = \sqrt{\frac{2}{N}} \ \text{for } k > 0.$$
For an inverse transform in two dimensions where N=8, the IDCT is defined as:
$$s(m,n) = \sum_{i=0}^{7} \sum_{j=0}^{7} c(i,j)\, \tilde{s}(i,j) \cos\frac{\pi(2m+1)i}{16} \cos\frac{\pi(2n+1)j}{16},$$
where $s$ is an 8×8 matrix of 64 values, $\tilde{s}$ is an 8×8 matrix of 64 coefficients and the constants $c(i,j)$ are given by
$$c(i,j) = \frac{1}{8} \ \text{when } i = 0 \text{ and } j = 0; \qquad c(i,j) = \frac{1}{4\sqrt{2}} \ \text{if } i = 0 \text{ and } j > 0 \text{ or } i > 0 \text{ and } j = 0; \qquad c(i,j) = \frac{1}{4} \ \text{when } i, j > 0.$$
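The inverse transform can be sketched in the same direct style. The code below sums the one-dimensional IDCT term by term and applies it along rows and then columns, again assuming that c(i,j) factors as c(i)·c(j), which agrees with the constants listed for N=8. The names are illustrative and no attempt is made at a fast algorithm.

```python
import math


def c1(k, n_points):
    """Normalization constant c(k) of the one-dimensional DCT."""
    return math.sqrt((1.0 if k == 0 else 2.0) / n_points)


def idct_1d(t):
    """Inverse DCT of a coefficient sequence t, summed directly."""
    n_points = len(t)
    return [
        sum(
            c1(k, n_points) * t[k]
            * math.cos(math.pi * (2 * n + 1) * k / (2 * n_points))
            for k in range(n_points)
        )
        for n in range(n_points)
    ]


def idct_2d(coeffs):
    """Inverse DCT of a square coefficient block: rows first, then columns."""
    rows = [idct_1d(list(row)) for row in coeffs]
    cols = [idct_1d(list(col)) for col in zip(*rows)]
    # cols[n][m] holds sample (m, n); transpose back to [m][n] indexing.
    return [list(row) for row in zip(*cols)]


# Usage: a block containing only a DC coefficient reconstructs to a flat block.
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 80.0
samples = idct_2d(coeffs)
```

With only the DC coefficient set to 80, every reconstructed sample equals c(0,0)·80 = 10 (up to floating-point rounding), consistent with the DC coefficient being eight times the block average.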
As previously stated, digital images are often transmitted and stored in compressed data formats, such as the previously described JPEG standard. In this context, there often arises the need to scale (i.e., enlarge or reduce) the dimensions of an image that is provided in a compressed data format in order to achieve a suitable image size.
For example, where an image is to be sent in compressed data format to receivers of different computational and output capabilities, it may be necessary to scale the size of the image to match the capabilities of each receiver. Some printers, for instance, are designed to receive images of a certain size but must be able to scale the image up or down for printing, particularly when the original image was intended for lower resolution display output or for higher resolution output, as the case may be.
A known method for scaling up an image provided in a transformed data format is illustrated in FIG. 1a. First, a determination is made as to the amount of desired image enlargement B. (Block 101) Next, one 8×8 block of transformed data is retrieved and an IDCT is performed on all 64 coefficients in the block to transform the data into the real or spatial domain. (Blocks 102 & 103)
Once in the real domain, additional real domain pixel or pel values are created by known methods, such as interpolation. (Block 104) This results in the creation of B adjacent data blocks of 64 pixel or pel values per block in each dimension. If, for example, a scale up factor of two (2) was desired, then this step would result in the creation of 2 data blocks in each dimension for a total of four (4) blocks. Then a FDCT operation is performed on the data of the four adjacent 8×8 blocks to return the data to the DCT domain. (Block 105) The process is repeated for all remaining data in the input image. (Block 106)
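The real-domain enlargement step (Block 104) can be sketched as follows. This illustration assumes pixel replication (nearest-neighbor) as a simple stand-in for the interpolation method, which the text leaves open, and it omits the IDCT that precedes this step and the FDCT that follows it. The function name `enlarge_block` is hypothetical.

```python
def enlarge_block(block, factor):
    """Enlarge a square real-domain block by an integer factor in each dimension.

    Assumption: each sample is simply replicated; a practical system might use
    linear or higher-order interpolation instead.
    Returns factor * factor blocks of the original size, in row-major order.
    """
    size = len(block)
    big = [
        [block[r // factor][c // factor] for c in range(size * factor)]
        for r in range(size * factor)
    ]
    # Cut the enlarged array back into blocks of the original size.
    return [
        [row[bc * size:(bc + 1) * size] for row in big[br * size:(br + 1) * size]]
        for br in range(factor)
        for bc in range(factor)
    ]


# Usage: one 8x8 block scaled up by two yields four 8x8 blocks.
block = [[r * 8 + c for c in range(8)] for r in range(8)]
blocks = enlarge_block(block, 2)
```

Each of the resulting blocks would then need its own forward transform, which is where much of the computational cost of this approach arises.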
Thus given a portion of an image in a JPEG/DCT compressed data format consisting of one compressed 8×8 block of image data, scaling up the image by a factor of two in each dimension using a previously known method requires: (1) entropy decoding the data which is in one-dimensional vector format and placing the data in an 8×8 block; (2) de-quantizing the data; (3) performing an 8×8 IDCT operation to inverse transform the transformed block of image data; (4) additional interpolation or related operations to scale up the block of image data (in the real domain) into four 8×8 blocks of scaled image data; (5) four 8×8 FDCT operations to re-transform the four blocks of scaled image data; (6) quantizing the four 8×8 blocks of data; and (7) placing the four blocks of data in one-dimensional vectors and entropy encoding the data for storage or transmission. Given the mathematical complexity of the FDCT and IDCT operations, such a large number of operations can be computationally time consuming.
A known method for scaling down an image provided in a compressed data format is illustrated in FIG. 1b. First, a determination is made as to the amount of desired image reduction 1/B. (Block 110) If, for example, an image reduction by ⅓ is desired, then 3 adjacent 8×8 blocks of transformed data values (for a total of 64×3 or 192 values) are retrieved from the image. (Block 111) An IDCT is performed on each of the three blocks to transform the pixel or pel data into the real or spatial domain. (Block 112) Once in the real domain, the data is reduced from 3 adjacent 8×8 data blocks into a single 8×8 data block by any one of several, known filtering techniques. (Block 113) Then a FDCT operation is performed on the data of the single 8×8 data block to return the data to the DCT domain. (Block 114) The process is repeated for all remaining data in the input image. (Block 115)
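The real-domain reduction step (Block 113) can be sketched as follows. The averaging (box) filter used here is only one of the possible filtering techniques and is an assumption made for illustration, as is the function name `reduce_pixels`. The input is taken to be the already inverse-transformed samples of the adjacent blocks assembled into one square array, with the reduction applied in each dimension.

```python
def reduce_pixels(pixels, factor):
    """Shrink a square real-domain array by averaging factor x factor neighborhoods.

    Assumption: a plain box filter; the side length of `pixels` is a multiple
    of `factor`.
    """
    out_size = len(pixels) // factor
    return [
        [
            sum(
                pixels[r * factor + dr][c * factor + dc]
                for dr in range(factor)
                for dc in range(factor)
            ) / (factor * factor)
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


# Usage: a 24x24 array (three 8x8 blocks per side) reduced to one 8x8 block.
pixels = [[float(r) for _ in range(24)] for r in range(24)]
small = reduce_pixels(pixels, 3)
```

The single reduced block would then be forward transformed, quantized and entropy encoded as in the other known methods.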
Thus given a portion of an image in a JPEG/DCT compressed data format consisting of four compressed 8×8 blocks of image data, scaling down the image by a factor of two in each dimension using a previously known method requires: (1) entropy decoding the data which is in one-dimensional vector format and placing the data in 8×8 blocks; (2) de-quantizing the data; (3) performing four 8×8 IDCT operations to inverse transform the transformed blocks of image data; (4) additional filtering operations to scale down the blocks of image data (in the real domain) into one 8×8 block of scaled image data; (5) an 8×8 FDCT operation to re-transform the block of scaled image data; (6) quantizing the 8×8 block of data; and (7) placing the block of data in one-dimensional vectors and entropy encoding the data for storage or transmission. As was the case with scaling up an image, the mathematical complexity of the FDCT and IDCT operations used in scaling down an image can involve a large number of operations which are computationally time consuming.
What is needed is an efficient method and apparatus that operates directly on multi-dimensional transformed data to convert it into transformed scaled (i.e., scaled up or scaled down) data.