The present invention relates to the field of digital image processing and, more particularly, to a fast decoding system for transform coded compressed digital images.
For digital image processing, an image is typically represented as an array of data expressing the values of a plurality of picture elements or pixels. Each pixel is a sample of the intensity of the image at a set of coordinates defined by a rectangular grid virtually overlaying the image. The analog signal obtained by sampling the image at the spatial coordinates of a pixel is quantized to a discrete value that is proportional to the amplitude of the intensity or luminosity of the sample. Typically, the data for a pixel comprises a value representing the intensity of the pixel or a plurality of interlaced values representing the intensities of the component colors of the pixel. The image can be reconstructed for viewing by “inverting” the quantized discrete sample values to produce a plurality of colored or white dots mapped to coordinates determined by the pixels of the original image.
While representing an image as an array of discrete pixel values is useful for image processing, the quantity of data required to represent an image is formidable. For example, an image created for display on a typical computer monitor can have a resolution of 640×480 pixels. If the grayscale intensity of a component color of a pixel can be resolved to 256 levels, each component color of each pixel will be represented by 8 bits and the total image will require nearly 1 MB, or approximately the storage space required for 300 pages of single-spaced text. In addition, digital video may require more than sixty images per second. If digital image data were not compressed before storage or transmission, such large quantities of data would make digital imaging impractical for many applications.
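The storage figure above can be checked with a short calculation; the assumption of three 8-bit color components per pixel is ours, since the text counts bits per component without fixing the number of components.

```python
# Storage for an uncompressed 640x480 image, assuming three 8-bit
# color components per pixel (an assumption not fixed by the text).
width, height = 640, 480
components, bytes_per_component = 3, 1              # 8 bits each
total_bytes = width * height * components * bytes_per_component
print(total_bytes)                                  # 921,600 bytes, nearly 1 MB
```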
Referring to FIG. 1, a digital image compression system 20 comprises, generally, an encoder 22 for compressing the image data for storage or transmission and a decoder 24 which reverses the encoding process to reconstruct the image for display. The intensities of the image pixels 26, obtained by sampling the original image, are input to the encoder 22. For many applications, the quantity of image pixel data 26 must be reduced to such an extent that lossy compression is necessary. Typically, the spatial domain pixel data is converted or transformed to data in another domain to make it easier to identify the data that is likely to be irrelevant to human viewers and, therefore, can be discarded during the compression process. Several different transformation methods have been proposed and are used in digital image processing. One common method is block-based transformation in which the pixels of the image are divided into a plurality of non-overlapping pixel blocks (e.g., an 8-pixel×8-pixel block) and a transformation algorithm is applied to the signal representing the changing intensities of the pixels in each block. One commonly used block-based transformation algorithm is the Discrete Cosine Transform (DCT). The DCT is a reversible transform that converts the spatial domain signal produced by the changing intensities of the pixels of a block to a block of transform coefficients representing the contributions of component intensity variation signals of various frequencies to the overall changes in intensity within the block. Lower frequency components represent slow changes of intensity and higher frequency components represent the rapid changes in intensity that characterize object edges and image details. Since a single color characteristically predominates in a small area of an image, the transform coefficients for higher frequency components tend to be small.
On the other hand, many image details are visually irrelevant for many applications and many of the small high frequency components can be discarded without visually objectionable distortion of the image.
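The energy-compaction behavior described above can be sketched with a direct (unoptimized) implementation of the 8×8 two-dimensional DCT; the orthonormal scaling used here is an assumption for clarity, and practical codecs use fast factored implementations rather than this quadruple loop.

```python
import math

N = 8

def dct2(block):
    """2-D DCT-II of an N x N pixel block (orthonormal scaling assumed)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out

# A block of uniform intensity, as in a flat region of a single color:
# all of the signal energy collects in the DC coefficient at (0, 0) and
# every higher-frequency coefficient is essentially zero.
uniform = [[100] * N for _ in range(N)]
coeffs = dct2(uniform)
```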
The blocks of transform coefficients produced by the transformation 28 are scanned 30 in a zigzag pattern capturing an arrangement of the transform coefficients in an order of generally increasing frequency. The scanned transform coefficients are then quantized 32. Quantization is a “rounding off” operation where all transform coefficients having a value within one of a plurality of value sub-ranges are mapped to a single value or quantization index or level. The extents of the value sub-ranges are established by a quantization parameter 34 that is typically adjustable to control the bit rate output by the encoder 22. Since transform coefficients representing high frequency components tend to be small, many are rounded off to zero during quantization, effectively discarding the information representing many details in the image and distorting the image when it is reconstructed for display. In lossy compression schemes, quantization acts as a control for trading off image quality against bit rate or compression ratio.
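A minimal sketch of the zigzag scan and quantization steps follows; the JPEG-style scan order and the single scalar step size are assumptions for illustration (real codecs typically use a matrix of per-coefficient step sizes).

```python
N = 8

def zigzag_order(n=N):
    """(row, col) pairs of an n x n block in JPEG-style zigzag order."""
    coords = [(r, c) for r in range(n) for c in range(n)]
    # traverse anti-diagonals, alternating direction on odd and even diagonals
    return sorted(coords, key=lambda rc: (rc[0] + rc[1],
                                          rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def scan_and_quantize(block, step):
    """Zigzag-scan a coefficient block, then round each value to an index."""
    return [round(block[r][c] / step) for r, c in zigzag_order()]

# e.g. a coefficient block with a large DC term and small AC terms:
# after quantization the small values round off to zero and are discarded.
block = [[0.0] * N for _ in range(N)]
block[0][0], block[0][1], block[1][0] = 803.7, -41.2, 3.9
indices = scan_and_quantize(block, step=10)
```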
Finally, one or more symbol modeling and encoding processes, such as variable length coding 36, are applied to the quantization indices output by the quantizer 32. During symbol modeling, code words or symbols are substituted for the quantization indices. In variable length coding 36, the length of the symbol encoding a quantization index varies according to the probability of occurrence of the quantization index in the data stream output by the quantizer 32. The data stream is further compressed if the average length of the code symbols is less than the average length of the quantization indices that are encoded. The code symbols encoding the compressed image, a code book 38 relating the code symbols and the encoded quantization indices, and the quantization parameter 34 are included in a bitstream 40 that is transmitted from the encoder 22 to a decoder 24 for decoding and display or stored for later decoding.
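Huffman coding is one common form of variable length coding; a toy construction is sketched below to show how frequent indices receive short codes. The code-book construction of any particular standard differs in detail, so this is illustrative only.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix-free code book {symbol: bit string} from sample data."""
    freq = Counter(symbols)
    if len(freq) == 1:                        # degenerate single-symbol stream
        return {next(iter(freq)): "0"}
    # heap entries: (weight, tie-break id, {symbol: code-so-far})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)       # merge the two lightest subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]

# A stream of quantization indices in which zero strongly predominates,
# as it does after quantization of typical transform coefficients.
indices = [0] * 50 + [1] * 20 + [-1] * 20 + [2] * 5 + [-2] * 5
book = huffman_code(indices)
bits = "".join(book[s] for s in indices)      # 190 bits vs 300 at 3 bits each
```

The code book itself must accompany the bitstream, as the text notes, so the decoder can invert the mapping.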
The decoder 24 reverses the processes of the encoder to convert the code symbols of the compressed bitstream 40 to pixels 44 for display to a viewer. The code book 38 generated during the variable length coding 36 is recovered from the bitstream 40 and used by the variable length decoder 46 to decode the quantization indices. The quantization parameter 34 is also recovered from the bitstream and input to the inverse quantizer 48 to establish the transform coefficient to be output for each quantization index obtained from the variable length decoder 46. Since quantization is a rounding off process, the transform coefficients output by the inverse quantizer 48 will be approximations of the coefficients produced by transformation 28 of the original image and the image produced with these reconstructed transform coefficients will be a distorted reconstruction of the original image.
Following inverse quantization 48, the scanning process is reversed 50 to rearrange the order of the transform coefficients so that the reconstructed coefficients will appear in the same order as the transform coefficients appeared following transformation of the block of pixels. Finally, inverse transformation 52 is applied to convert the frequency domain data of the reconstructed transform coefficients to spatial domain intensities for pixels 44 in the reconstructed pixel blocks making up the decompressed image.
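The decoder-side inverse quantization and inverse scan can be sketched as follows; the scalar step size and JPEG-style scan order are again assumptions for illustration, and the final inverse DCT (which mirrors the forward transform) is omitted.

```python
N = 8

def zigzag_order(n=N):            # the same scan order the encoder used
    coords = [(r, c) for r in range(n) for c in range(n)]
    return sorted(coords, key=lambda rc: (rc[0] + rc[1],
                                          rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))

def dequantize_and_unscan(indices, step):
    """Multiply each index by the step size and place it back in the block."""
    block = [[0.0] * N for _ in range(N)]
    for (r, c), q in zip(zigzag_order(), indices):
        block[r][c] = q * step    # approximate: the encoder's rounding is lost
    return block

# e.g. a DC index of 80 and one AC index of -4 with step size 10;
# every coefficient rounded to zero at the encoder stays zero here.
recon = dequantize_and_unscan([80, -4] + [0] * 62, step=10)
```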
Image compression trades off data quantity for image distortion and computational effort. DCT-based transformation is reversible, provides good image decorrelation, and requires a generally acceptable level of computational effort. As a result, DCT transform coding is an underlying process for several digital image processing standards including the JPEG (Joint Photographic Experts Group) still image compression standard (ISO/IEC 10918) and several of the MPEG (Moving Picture Experts Group) video compression standards (e.g., MPEG-2, ISO/IEC 13818).
While the required computation is acceptable for many applications, the computation requirements are not insignificant. Decoding ten 320×224 pixel frames per second requires approximately 16,800 inverse transformations per second. As a result, considerable effort has been devoted to developing efficient implementations of the inverse transformation algorithm. For example, Girod et al., U.S. Pat. No. 6,112,219, disclose a method of performing fast inverse discrete cosine transformation (IDCT) using look-up tables. As a result of symmetries in the DCT and IDCT, the transforms may be performed with a combination of lookup tables and butterfly operations, reducing the number of additions and subtractions required and eliminating the need for multiplication operations to perform the inverse transformation. The method reduces the number of calculations required to perform the inverse transformation at the decoder in exchange for increased storage to retain the table of precalculated results.
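The figure of approximately 16,800 inverse transformations per second can be reproduced under a 4:2:0 chroma-sampling assumption (the text does not state the sampling format, so this is inferred):

```python
# Ten 320x224 frames per second, 8x8 blocks, assumed 4:2:0 chroma sampling.
luma_blocks = (320 // 8) * (224 // 8)     # 40 x 28 = 1,120 luma blocks/frame
blocks_per_frame = luma_blocks * 3 // 2   # plus two quarter-size chroma planes
per_second = blocks_per_frame * 10        # ten frames per second
print(per_second)                         # 16,800 inverse transforms/second
```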
Reducing the computational requirements for inverse transformation speeds up the decoding process and can reduce the demand imposed on computational facilities at the decoder. However, inverse transformation is but one step in the decompression process and the remaining steps require several additional computations for each pixel. What is desired, therefore, is a decoding method and apparatus that reduce the time and computational effort required to decode compressed digital images.