1. Field of the Invention
The present invention generally relates to color image compression for diverse applications and, more particularly, to a structure for storing Discrete Cosine Transform (DCT) blocks after entropy decoding in a JPEG (Joint Photographic Experts Group) decoder or after the Forward Discrete Cosine Transform (FDCT) in the JPEG encoder to use as an intermediate format.
2. Background Description
The purpose of image compression is to represent images with less data in order to save storage costs or transmission time and costs. The most effective compression is achieved by approximating the original image, rather than reproducing it exactly. The JPEG standard, discussed in detail in “JPEG Still Image Data Compression Standard” by Pennebaker and Mitchell, published by Van Nostrand Reinhold, 1993, which is hereby fully incorporated by reference, allows the interchange of images between diverse applications and opens up the capability to provide digital continuous-tone color images in multi-media applications. JPEG is primarily concerned with images that have two spatial dimensions, contain grayscale or color information, and possess no temporal dependence, as distinguished from the MPEG (Moving Picture Experts Group) standard. The amount of data in a digital image can be extremely large, sometimes millions of bytes. JPEG compression can reduce the storage requirements by more than an order of magnitude and improve system response time in the process.
One of the basic building blocks for JPEG is the Discrete Cosine Transform (DCT). An important aspect of this transform is that it produces uncorrelated coefficients. Decorrelation of the coefficients is very important for compression because each coefficient can be treated independently without loss of compression efficiency. Another important aspect of the DCT is the ability to quantize the DCT coefficients using visually-weighted quantization values. Since the human visual system response is very dependent on spatial frequency, by decomposing an image into a set of waveforms, each with a particular spatial frequency, it is possible to separate the image structure the eye can see from the image structure that is imperceptible. The DCT provides a good approximation to this decomposition.
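As a minimal sketch of visually weighted quantization, the Python fragment below divides each coefficient of an 8×8 block by a step size that grows with spatial frequency and rounds to the nearest integer. The table produced by `make_table` is hypothetical and chosen only for illustration; it is not a table from the JPEG standard, and the function names are assumptions.

```python
def make_table(base=16, step=4):
    """Hypothetical 8x8 table: the step size grows with frequency index u + v."""
    return [[base + step * (u + v) for v in range(8)] for u in range(8)]

def quantize(coeffs, table):
    """Divide each DCT coefficient by its step size, rounding half away from zero."""
    def q(c, s):
        return int(c / s + (0.5 if c >= 0 else -0.5))
    return [[q(coeffs[u][v], table[u][v]) for v in range(8)] for u in range(8)]

def dequantize(levels, table):
    """Approximate reconstruction: multiply each level by its step size."""
    return [[levels[u][v] * table[u][v] for v in range(8)] for u in range(8)]

table = make_table()
coeffs = [[0.0] * 8 for _ in range(8)]
coeffs[0][0] = 100.0   # low-frequency coefficient, fine step (16)
coeffs[7][7] = 10.0    # high-frequency coefficient, coarse step (72)
levels = quantize(coeffs, table)
restored = dequantize(levels, table)
```

With this table the low-frequency coefficient survives as an approximation while the small high-frequency coefficient quantizes to zero, which is the effect the visual weighting is meant to achieve.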
The most straightforward way to implement the DCT is to follow the theoretical equations. When this is done, an upper limit of 64 multiplications and 56 additions is required for each one-dimensional (1-D) 8-point DCT. A full 8×8 DCT done in separable 1-D format (eight rows and then eight columns) would require 1,024 multiplications and 896 additions, plus additional operations to quantize the coefficients. In order to improve processing speed, fast DCT algorithms have been developed. The origins of some of these algorithms go back to the algorithm for the Fast Fourier Transform (FFT) implementation of the Discrete Fourier Transform (DFT). The most efficient algorithm for the 8×8 DCT requires only 54 multiplications, 464 additions and 6 arithmetic shifts.
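The operation counts for the direct method can be checked with a short Python sketch. It assumes the usual DCT-II definition with the normalization constants folded into a precomputed cosine table (`BASIS`), so that each output costs exactly 8 multiplications and 7 additions; the names `dct_1d`, `dct_2d` and `counts` are illustrative only. It is not one of the fast algorithms mentioned above.

```python
import math

N = 8

# BASIS[u][n] folds the normalization into the cosine term, so each output
# coefficient needs exactly N multiplications and N - 1 additions.
BASIS = [[0.5 * (math.sqrt(0.5) if u == 0 else 1.0)
          * math.cos((2 * n + 1) * u * math.pi / (2 * N))
          for n in range(N)] for u in range(N)]

def dct_1d(x, counts):
    """Direct 8-point DCT; tallies multiplications and additions in counts."""
    out = []
    for u in range(N):
        acc = BASIS[u][0] * x[0]
        counts["mul"] += 1
        for n in range(1, N):
            acc += BASIS[u][n] * x[n]
            counts["mul"] += 1
            counts["add"] += 1
        out.append(acc)
    return out

def dct_2d(block):
    """Separable 8x8 DCT: eight row transforms, then eight column transforms."""
    counts = {"mul": 0, "add": 0}
    rows = [dct_1d(row, counts) for row in block]
    cols = [dct_1d([rows[r][c] for r in range(N)], counts) for c in range(N)]
    # cols[c][u] holds coefficient (u, c); transpose back to [u][c] order.
    coeffs = [[cols[c][u] for c in range(N)] for u in range(N)]
    return coeffs, counts

coeffs, counts = dct_2d([[1.0] * N for _ in range(N)])
print(counts)
```

Sixteen 1-D transforms at 64 multiplications and 56 additions each give the totals of 1,024 and 896 stated above.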
The two basic components of an image compression system are the encoder and the decoder. The encoder compresses the “source” image (the original digital image) and provides a compressed data (or coded data) output. The compressed data may be either stored or transmitted, but at some point are fed to the decoder. The decoder recreates or “reconstructs” an image from the compressed data. In general, a data compression encoding system can be broken into three basic parts: an encoder model, an encoder statistical model, and an entropy encoder. The encoder model generates a sequence of “descriptors” that is an abstract representation of the image. The statistical model converts these descriptors into symbols and passes them on to the entropy encoder. The entropy encoder, in turn, compresses the symbols to form the compressed data. The encoder may require external tables; that is, tables specified externally when the encoder is invoked. Generally, there are two classes of tables: model tables that are needed in the procedures that generate the descriptors, and entropy-coding tables that are needed by the JPEG entropy-coding procedures. JPEG uses two techniques for entropy encoding: Huffman coding and arithmetic coding. Like the encoder, the decoder can be broken into basic parts that have an inverse function relative to the parts of the encoder.
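To make the notion of descriptors concrete, the following Python sketch shows one way an encoder model could describe a single quantized 8×8 block for baseline Huffman coding: the DC term as a difference from the previous block's DC term, and each nonzero AC term together with the run of zeros preceding it. The tuple layout and the name `block_descriptors` are assumptions made for illustration; the input is assumed to be already in zigzag order.

```python
def block_descriptors(zz, prev_dc):
    """Describe one block of 64 quantized coefficients given in zigzag order.

    Returns a list of tuples (illustrative layout, not a standard format):
      ("DC", diff)        DC difference from the previous block
      ("AC", run, value)  nonzero AC value preceded by `run` zeros (0..15)
      ("ZRL",)            a run of sixteen zeros
      ("EOB",)            the rest of the block is zero
    """
    desc = [("DC", zz[0] - prev_dc)]
    run = 0
    for coef in zz[1:]:
        if coef == 0:
            run += 1
            continue
        while run > 15:          # runs longer than 15 are split off as ZRL
            desc.append(("ZRL",))
            run -= 16
        desc.append(("AC", run, coef))
        run = 0
    if run > 0:                  # trailing zeros collapse into end-of-block
        desc.append(("EOB",))
    return desc

block = [50, 3, 0, 0, -2] + [0] * 59
print(block_descriptors(block, prev_dc=45))
```

A statistical model would then map these descriptors to symbols, and the entropy encoder would assign codes to the symbols.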
JPEG compressed data contains two classes of segments: entropy-coded segments and marker segments. Other parameters that are needed by many applications are not part of the JPEG compressed data format. Such parameters may be needed as application-specific “wrappers” surrounding the JPEG data; e.g., image aspect ratio, pixel shape, orientation of image, etc. Within the JPEG compressed data, the entropy-coded segments contain the entropy-coded data, whereas the marker segments contain header information, tables, and other information required to interpret and decode the compressed image data. Marker segments always begin with a “marker”, a unique 2-byte code that identifies the function of the segment.
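The segment structure can be illustrated with a small Python sketch that walks the marker segments at the start of a JPEG stream. It relies on the standard layout in which a marker is the byte 0xFF followed by a code byte, and most markers are followed by a 2-byte big-endian length that counts itself. The function name and the returned tuple layout are assumptions; the sketch stops at the start-of-scan marker and does not parse the entropy-coded data that follows it.

```python
import struct

# Markers that stand alone, with no length field: TEM, RST0-RST7, SOI, EOI.
STANDALONE = {0x01, 0xD8, 0xD9} | set(range(0xD0, 0xD8))
SOS = 0xDA
EOI = 0xD9

def list_marker_segments(data):
    """Return [(marker_code, payload_length)] up to and including SOS."""
    segments = []
    pos = 0
    while pos + 1 < len(data):
        if data[pos] != 0xFF:
            raise ValueError("expected marker prefix 0xFF at offset %d" % pos)
        code = data[pos + 1]
        if code == 0xFF:         # optional fill byte before a marker
            pos += 1
            continue
        pos += 2
        if code in STANDALONE:
            segments.append((code, 0))
            if code == EOI:
                break
            continue
        (length,) = struct.unpack(">H", data[pos:pos + 2])
        segments.append((code, length - 2))   # length includes its own 2 bytes
        pos += length
        if code == SOS:          # entropy-coded segment follows; stop here
            break
    return segments

# Tiny synthetic stream: SOI, a DQT segment with 2 payload bytes, SOS header.
sample = bytes([0xFF, 0xD8,
                0xFF, 0xDB, 0x00, 0x04, 0x00, 0x01,
                0xFF, 0xDA, 0x00, 0x02,
                0x12, 0x34])
print(list_marker_segments(sample))
```

The synthetic `sample` bytes are not a decodable image; they only exercise the marker and length handling.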
The quest to encode and decode JPEG images as fast as possible continues. For example, high performance color printers, operating with 4 bits per CMYK (Cyan, Magenta, Yellow, blacK) component, are expected to run at 200 pages/minute. Images may arrive as 600 pixels/inch YCrCb (a color coordinate system used in the development of the JPEG standard), RGB (Red, Green, Blue), or CieLab JPEG images that need to be transformed into 300 pixels/inch CMYK independent JPEG images. Some images may need to be rotated 90° and scaled up or down to fit the assigned raster space. In another example, set top boxes for Internet use are expected to use an on-board microprocessor to browse (i.e., decode and display) JPEG images on the Internet in 0.2 to 2 seconds. These images may need to be scaled to fit the output display.
It is therefore an object of the present invention to provide a format for storing DCT data that would require minimal computational effort to generate from Huffman entropy data, yet be sufficiently unpacked so that a number of DCT-domain image algorithms could efficiently be applied to the data.
It is another object of the invention to provide a data format that does not impose additional processing costs if the image must be decompressed fully to raster format.
According to the invention, there is provided a novel structure storing the 8×8 Discrete Cosine Transform (DCT) blocks after entropy decoding in a JPEG decoder or after the Forward Discrete Cosine Transform (FDCT) in the JPEG encoder to use as an intermediate format. The format was chosen to speed up the entropy decode and encode processes and is based on the information needed for the JPEG Huffman entropy coding, but lends itself to fast execution of other DCT based transforms.