Still and moving images generally contain large amounts of information. For example, an RGB color image comprising 2000×2000 pixels requires 12 megabytes of data if the intensity of each RGB color component of each pixel is coded with 8 bits. Transmission and storage of this amount of data, even for single images, are generally impractical for most purposes. For example, if images of this size are to be transmitted at typical moving picture rates of thirty frames per second, then 360 megabytes of image data must be transmitted every second. This is clearly a very large rate of data transmission. Therefore, data representing an image is generally “compressed”. Compression significantly reduces the amount of data representing the image in comparison to the amount of data required to represent the image by coding RGB intensities (or intensities of a different set of color components used for defining color) for each pixel of the image. Hereinafter, the intensity of a color component used to define color of a pixel in an image is referred to as a “pixel value”.
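By way of illustration only, the arithmetic in the example above may be checked with the following short Python sketch. The constant and function names are assumptions introduced for the illustration and do not appear elsewhere in this description.

```python
# Raw (uncompressed) size of an RGB image and the resulting video data rate.
WIDTH, HEIGHT = 2000, 2000   # pixels, as in the example above
COMPONENTS = 3               # R, G and B
BITS_PER_COMPONENT = 8
FRAMES_PER_SECOND = 30


def raw_image_bytes(width, height, components, bits_per_component):
    """Bytes needed to code every pixel value without compression."""
    return width * height * components * bits_per_component // 8


image_bytes = raw_image_bytes(WIDTH, HEIGHT, COMPONENTS, BITS_PER_COMPONENT)
bytes_per_second = image_bytes * FRAMES_PER_SECOND

print(image_bytes)       # 12000000 bytes, i.e. 12 megabytes
print(bytes_per_second)  # 360000000 bytes per second, i.e. 360 megabytes per second
```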
Different techniques exist for compressing data and in particular for compressing data representing an image. In many of these techniques the image is first partitioned into a plurality of contiguous, non-overlapping, sub-images or tiles wherein each sub-image extends over a portion of the area of the full image. Variation of image intensity and color over the area of a sub-image is generally limited. For each color component used to define color in the image, pixel values of each sub-image are transformed into a set of values, hereinafter referred to as “transform coefficients”, in a transform space. Generally, the pixel values are transformed by operating on the pixel values with a unitary separable transform. The pixel values are recoverable from the transform coefficients by operating on the transform coefficients with an inverse of the transform. Among well known separable unitary transforms used in processing image data are the Fourier, cosine, sine and Hadamard transforms.
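The partitioning step described above may be sketched as follows. This is a minimal illustration, assuming a single color component stored as a list of rows and assuming image dimensions that are exact multiples of the tile size; the function name `split_into_tiles` is hypothetical.

```python
def split_into_tiles(image, tile_size):
    """Partition a 2-D list of pixel values into contiguous, non-overlapping
    square tiles, keyed by (tile_row, tile_column).

    Assumes both image dimensions are exact multiples of tile_size.
    """
    height, width = len(image), len(image[0])
    if height % tile_size or width % tile_size:
        raise ValueError("image dimensions must be multiples of tile_size")
    tiles = {}
    for top in range(0, height, tile_size):
        for left in range(0, width, tile_size):
            tiles[(top // tile_size, left // tile_size)] = [
                row[left:left + tile_size]
                for row in image[top:top + tile_size]
            ]
    return tiles


# A hypothetical 4x4 single-component image split into four 2x2 tiles.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
tiles = split_into_tiles(image, 2)
print(tiles[(0, 0)])  # [[0, 1], [4, 5]]
print(tiles[(1, 1)])  # [[10, 11], [14, 15]]
```

Each tile would then be transformed independently, once for each color component.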
Generally, the number of transform coefficients needed to effectively recover the pixel values is considerably smaller than the number of pixel values. The amount of data needed to code the transform coefficients is therefore generally significantly less than the amount of data needed to code the pixel values. Representing the image using the transform coefficients therefore requires considerably less data than representing the image by coding color component intensities for each pixel. The transform thus succeeds in compressing the image data.
Usually, the transform coefficients are quantized by dividing each transform coefficient by a “quantizer”. The quotient resulting from the division is rounded up or down to the nearest integer, and the transform coefficient is replaced by the rounded quotient. Different transform coefficients are generally quantized with quantizers of different magnitudes. Often the magnitude of a quantizer is determined as a function of an error margin for the value of the transform coefficient that the quantizer is used to quantize, which error margin is estimated from known error margins of the pixel values.
Transform space data representing the image after quantizing the transform coefficients comprises the quantized transform coefficients and the quantizer, or a way of determining the quantizer, for each quantized transform coefficient. When recovering the image, i.e. when recovering the pixel values, a transform coefficient, hereinafter referred to as a “recovered transform coefficient”, is recovered from each quantized transform coefficient by multiplying the quantized transform coefficient by its quantizer. The recovered transform coefficients are operated on by an inverse of the transform used to generate the transform coefficients, to determine the pixel values.
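Quantization and recovery of individual transform coefficients, as described above, may be sketched as follows. The coefficient and quantizer values are hypothetical, and rounding of exact halves away from zero is an assumption of the sketch.

```python
def quantize(coefficient, quantizer):
    """Divide by the quantizer and round the quotient to the nearest integer
    (exact halves are rounded away from zero)."""
    magnitude = int(abs(coefficient) / quantizer + 0.5)
    return magnitude if coefficient >= 0 else -magnitude


def recover(quantized, quantizer):
    """Recovered transform coefficient: quantized value times its quantizer."""
    return quantized * quantizer


# Hypothetical transform coefficients, each with its own quantizer.
coefficients = [153.0, -47.0, 12.0, -3.0]
quantizers = [16, 11, 10, 16]

quantized = [quantize(c, q) for c, q in zip(coefficients, quantizers)]
recovered = [recover(qc, q) for qc, q in zip(quantized, quantizers)]

print(quantized)  # [10, -4, 1, 0]
print(recovered)  # [160, -44, 10, 0]
```

In this sketch each recovered coefficient differs from the original coefficient by at most one half of its quantizer.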
Quantization increases the compression of the image data beyond that achieved with the transform. Quantization reduces the range of different numbers representing the transform coefficients, which reduces the number of bits required to code the coefficients. In addition, some of the coefficients (those whose magnitudes are less than one half their quantizer) are quantized to zero, which reduces the number of coefficients used to recover the original pixel values. Increasing quantizer magnitudes decreases the number of possible different values of the quantized transform coefficients and increases the number of transform coefficients that are quantized to zero. Increasing quantizer magnitudes therefore generally increases the extent to which image data is compressed.
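The effect of quantizer magnitude may be illustrated with the following sketch, which counts how many of a hypothetical set of transform coefficients are quantized to zero, and how many distinct quantized values remain, as a single quantizer is increased. Using one quantizer for all coefficients is a simplification made for the illustration.

```python
def quantize(coefficient, quantizer):
    """Round coefficient / quantizer to the nearest integer."""
    magnitude = int(abs(coefficient) / quantizer + 0.5)
    return magnitude if coefficient >= 0 else -magnitude


# Hypothetical transform coefficients of one tile.
coefficients = [153.0, -47.0, 12.0, -3.0, 7.0, -1.0, 20.0, 5.0]


def summarize(quantizer):
    """Return (number of zeros, number of distinct quantized values)."""
    quantized = [quantize(c, quantizer) for c in coefficients]
    return quantized.count(0), len(set(quantized))


for q in (4, 16, 64):
    print(q, summarize(q))
# 4 (1, 8)
# 16 (4, 4)
# 64 (6, 3)
```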
However, unlike compression resulting from a non-quantized unitary transform, compression from quantization is “lossy”. Information contained in the remainders discarded in the quantization process is lost and the larger the quantizer magnitudes the more lossy is the compression. The recovered transform coefficients are different from the transform coefficients by a quantization error. As a result, pixel values generated using quantized transform coefficients differ from the original pixel values.
The steps in the industry standard JPEG system for compressing color and gray tone images illustrate a typical data compression procedure for a color image. In JPEG, a color image is generally partitioned into square tiles of 8×8 pixels. Color of the image is usually defined using YUV color components, so that each pixel has Y, U and V pixel values. For each tile, for each of the YUV color components, the pixel values of the color component are transformed into a set of transform coefficients using a discrete cosine transform. The transform coefficients are then quantized. Usually the transform coefficients associated with higher spatial frequencies are quantized with larger quantizers.
To illustrate the process, let Y(x,y) represent luminance pixel values in a tile, where x and y are integer coordinates that locate a pixel in the tile as being in the x-th row and y-th column of the tile. The transform coefficients are functions of two integer transform space coordinates, “u” and “v” that are conjugate coordinates of x and y respectively.
If C(u,v) represents the transform coefficients, then
$$C(u,v) = \sum_{y=0}^{7} \sum_{x=0}^{7} \mathrm{DCT}(u,v,x,y)\, Y(x,y),$$

where DCT(u,v,x,y) symbolically represents the discrete cosine transform. Writing the discrete cosine transform explicitly gives:

$$C(u,v) = \tfrac{1}{4}\, A(u)\, A(v) \sum_{y=0}^{7} \sum_{x=0}^{7} Y(x,y)\, \cos\!\left(\frac{(2x+1)\,u\,\pi}{16}\right) \cos\!\left(\frac{(2y+1)\,v\,\pi}{16}\right),$$

where A(w)=1/√2 for w=0 and A(w)=1 otherwise. Each C(u,v) is then quantized with a quantizer, Q(u,v), yielding a quantized transform coefficient

$$QC(u,v) = \mathrm{INT}\left\{ \frac{C(u,v) \pm Q(u,v)/2}{Q(u,v)} \right\},$$

where the plus sign is used if C(u,v)>0, the minus sign is used if C(u,v)<0, and INT denotes truncation to the integer part, so that QC(u,v) is the quotient C(u,v)/Q(u,v) rounded to the nearest integer.
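The forward transform and quantization of a single 8×8 luminance tile may be sketched directly from the explicit expressions. The sketch assumes the conventional JPEG normalization factor of 1/4 on the transform and takes INT as truncation toward zero; the flat test tile and the quantizer value of 16 are hypothetical.

```python
import math

N = 8


def A(w):
    """Normalization term: 1/sqrt(2) for w == 0 and 1 otherwise."""
    return 1 / math.sqrt(2) if w == 0 else 1.0


def forward_dct(tile):
    """C(u,v) for an 8x8 tile Y(x,y), with x the row and y the column."""
    coeffs = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (tile[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / 16)
                              * math.cos((2 * y + 1) * v * math.pi / 16))
            coeffs[u][v] = 0.25 * A(u) * A(v) * total
    return coeffs


def quantize(c, q):
    """QC = INT{[C +/- Q/2] / Q}, with INT taken as truncation toward zero."""
    if c > 0:
        return int((c + q / 2) / q)
    if c < 0:
        return int((c - q / 2) / q)
    return 0


# Hypothetical flat tile: every luminance value equals 100.
flat = [[100] * N for _ in range(N)]
C = forward_dct(flat)
print(round(C[0][0], 6))      # 800.0
print(quantize(C[0][0], 16))  # 50
```

For a flat tile only C(0,0) is significantly different from zero, which illustrates why limited variation of intensity over a tile leads to few significant transform coefficients.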
To recover pixel values for luminance from the QC(u,v), recovered transform coefficients, RC(u,v), are calculated where RC(u,v)=QC(u,v)Q(u,v). The recovered transform coefficients are then transformed back with the inverse discrete cosine transform represented by IDCT(x,y,u,v), so that recovered luminance values RY(x,y) may be written:
$$RY(x,y) = \sum_{u=0}^{7} \sum_{v=0}^{7} \mathrm{IDCT}(x,y,u,v)\, RC(u,v).$$

Explicitly,

$$RY(x,y) = \tfrac{1}{4} \sum_{u=0}^{7} \sum_{v=0}^{7} A(u)\, A(v)\, RC(u,v)\, \cos\!\left(\frac{(2x+1)\,u\,\pi}{16}\right) \cos\!\left(\frac{(2y+1)\,v\,\pi}{16}\right).$$
The recovered value for luminance RY(x,y) is generally not equal to Y(x,y) because the quantization process has resulted in a loss of information so that generally RC(u,v)≠C(u,v).
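The complete round trip for one tile (transform, quantize, recover, inverse transform) may be sketched as follows. The sketch assumes the conventional JPEG normalization factor of 1/4 on both the forward and inverse transforms, written here as a product of two one-dimensional basis terms, so that the inverse exactly undoes the forward transform when no quantization is applied. The test tile, the single quantizer value of 16 used for all coefficients, and all function names are hypothetical.

```python
import math

N = 8
# 1-D basis: B[u][x] = (1/2) A(u) cos((2x+1) u pi / 16), so that
# B[u][x] * B[v][y] equals (1/4) A(u) A(v) cos(...) cos(...).
B = [[0.5 * (1 / math.sqrt(2) if u == 0 else 1.0)
      * math.cos((2 * x + 1) * u * math.pi / 16) for x in range(N)]
     for u in range(N)]


def forward_dct(tile):
    """Transform coefficients C[u][v] from pixel values tile[x][y]."""
    return [[sum(B[u][x] * B[v][y] * tile[x][y]
                 for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]


def inverse_dct(coeffs):
    """Pixel values [x][y] from (recovered) transform coefficients [u][v]."""
    return [[sum(B[u][x] * B[v][y] * coeffs[u][v]
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]


def round_nearest(value):
    """Round to the nearest integer, exact halves away from zero."""
    return int(value + 0.5) if value >= 0 else -int(-value + 0.5)


def compress_and_recover(tile, quantizer):
    C = forward_dct(tile)
    QC = [[round_nearest(C[u][v] / quantizer) for v in range(N)] for u in range(N)]
    RC = [[QC[u][v] * quantizer for v in range(N)] for u in range(N)]
    return inverse_dct(RC)


def max_abs_error(a, b):
    return max(abs(a[x][y] - b[x][y]) for x in range(N) for y in range(N))


# Hypothetical tile with a luminance ramp.
tile = [[16 * y + x for y in range(N)] for x in range(N)]
max_err_exact = max_abs_error(inverse_dct(forward_dct(tile)), tile)
max_err_lossy = max_abs_error(compress_and_recover(tile, 16), tile)
print(max_err_exact)  # negligible: floating-point rounding only
print(max_err_lossy)  # clearly nonzero: information lost in quantization
```

Without quantization the tile is recovered to within floating-point precision, whereas with quantization the recovered values RY(x,y) differ from Y(x,y).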
In many instances, information lost in quantizing transform coefficients does not affect the quality of an image recovered from the transform coefficients to an extent that renders the image unusable for the purpose for which it was intended. However, in many instances the loss of information results in objectionable degradation of the recovered image. Often, unwanted artifacts are generated in the recovered image and the image is degraded unacceptably. For example, a set of compressed transform coefficients might be usable to provide a good “thumbnail” image of a scene but, as a result of information loss, the transform coefficients may be totally inadequate to provide an enlarged image of the scene.
Techniques exist for adjusting images recovered from compressed data so as to reduce effects on the quality of the recovered image that result from information loss and error in the compressed data. Some techniques address specific types of defects or artifacts, for example “blocking artifacts”, that are generated in images recovered from compressed data that has lost information. Other techniques apply algorithms to reduce pixel to pixel discontinuities or to improve image sharpness by edge enhancement. Most such techniques have limitations. For example, edge-enhancing algorithms typically increase noise in a recovered image and smoothing algorithms tend to blur an image. There is a need for new techniques for recovering images from compressed data that compensate for information loss or errors in the compressed data.
Whereas the above discussion has focused on data representing two dimensional images comprising pixels and associated pixel values of color components or gray tones, it should be realized that the discussion also applies to three dimensional data sets comprising voxels and corresponding “voxel values”. Quite generally, the present discussion is germane to n-dimensional images. An n-dimensional image is defined as a set of values, hereinafter referred to as “image values”, that are dependent on n independent coordinates, which coordinates define an n-dimensional image space. Furthermore, it should be noted that the definition of an n-dimensional image is quite general and an n-dimensional image is not restricted to a video image; it may also be, among other things, a sound image. For example, a three dimensional sound image might be the notes in a sound file that are a function of time and two stereo channels.
Compression of data representing an n-dimensional image can be similar to compression of a two dimensional video image comprising pixel values, as illustrated with the JPEG example. Image values of the n-dimensional image are transformed, preferably by a separable unitary transform, into a set of transform coefficients dependent upon transform space coordinates of an n-dimensional transform space. Each of the transform space coordinates is conjugate to a different one of the coordinates of the image space. The transform coefficients are then quantized using appropriate quantizers. An image is recovered from transform coefficients by dequantizing the quantized transform coefficients to determine recovered transform coefficients and operating on the recovered transform coefficients with an inverse of the unitary transform. As in the case of two dimensional images, the recovered image is degraded by loss of information or error in the recovered transform coefficients, and techniques are needed to compensate for the information loss.
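Because the transform is separable, an n-dimensional transform may be carried out as a sequence of one-dimensional transforms, one along each image space coordinate. The following sketch illustrates this for a small three dimensional data set using an orthonormal discrete cosine basis. The storage of image values in a dictionary keyed by coordinate tuples, the choice of basis, and all function names are assumptions of the sketch. Quantization, which would proceed as in the two dimensional case, is omitted.

```python
import itertools
import math


def dct_basis(n):
    """Orthonormal 1-D discrete cosine basis for an axis of length n."""
    return [[math.sqrt((1 if u == 0 else 2) / n)
             * math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
            for u in range(n)]


def transform_axis(values, shape, axis, basis):
    """Apply a 1-D transform along one axis of an n-D image.

    `values` maps coordinate tuples to image values.
    """
    out = {}
    for idx in itertools.product(*(range(s) for s in shape)):
        k = idx[axis]
        out[idx] = sum(basis[k][j] * values[idx[:axis] + (j,) + idx[axis + 1:]]
                       for j in range(shape[axis]))
    return out


def transform_nd(values, shape, inverse=False):
    """Separable n-D transform: one 1-D transform per coordinate."""
    for axis, n in enumerate(shape):
        basis = dct_basis(n)
        if inverse:
            # For an orthonormal basis the inverse is the transpose.
            basis = [list(column) for column in zip(*basis)]
        values = transform_axis(values, shape, axis, basis)
    return values


# Hypothetical 2x3x4 data set of image values.
shape = (2, 3, 4)
image = {idx: float(idx[0] * 12 + idx[1] * 4 + idx[2])
         for idx in itertools.product(*(range(s) for s in shape))}

coeffs = transform_nd(image, shape)
restored = transform_nd(coeffs, shape, inverse=True)
max_error = max(abs(restored[idx] - image[idx]) for idx in image)
print(max_error)  # negligible: floating-point rounding only
```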