1. Field of the Invention
This invention relates to the field of data compression and decompression, in particular to compression and decompression of still and moving digital color images.
2. Background Art
Various compression schemes have been developed to reduce the transmission bandwidth and storage requirements for image and video data without information loss. Prior-art compression schemes include LZW and PNG, JPEG-DPCM, PhotoJazz, JPEG-LS, fractals, and wavelets. Understanding of this invention depends on a background in prior-art theory and practice relating to color compression, spatial image compression, and temporal moving-image compression.
Data compression refers to any process wherein data is converted from a given representation into a smaller representation, that is, a format occupying fewer bits, from which it can subsequently be decompressed back to the original representation. Data compression systems are well known in the art, and are used to reduce the amount of space required to store the data and the time or bandwidth required to transmit the data. Although digital storage and transmission costs have decreased at a nearly constant geometric rate for 30 years, the demand for digital storage and transmission has increased at an even greater nearly constant geometric rate, so the need for compression can be expected to continue to increase. For many applications whose feasibility would otherwise be delayed for years, data compression is an early-enabling technology.
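The lossless round trip just described can be sketched with any general-purpose lossless codec; here Python's standard-library zlib module stands in for such a compressor, and the sample data is an arbitrary compressible byte string chosen for illustration:

```python
import zlib

# Original representation: 300 bytes of deliberately repetitive sample data.
data = b"abc" * 100

compressed = zlib.compress(data)        # smaller representation (fewer bits)
restored = zlib.decompress(compressed)  # decompression recovers the original

assert restored == data                 # lossless: bit-for-bit identical
assert len(compressed) < len(data)      # and genuinely smaller for this input
```

The bit-for-bit equality of the restored data with the original is exactly the reversibility property that distinguishes lossless from lossy methods.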
Because of the huge memory and bandwidth requirements of digital still images, and even more so of digital moving images, data compression has become an essential component of most digital imaging systems. Digital image compression systems are broadly classifiable into lossless (reversible) methods, which compress the image by packing the information more efficiently and yield the identical original image on decompression, and lossy methods, which “compress” the image by discarding information that may be perceptually less important and on decompression yield a stand-in image which is generally still recognizable by the human visual system and retains its perceptual quality to varying degrees. The compressive power of lossless image compressors is limited by the inherent information (entropy) of the images; for R′G′B′ photoquality naturalistic imagery, the mean compression power of existing systems ranges from around 1.4 for DPCM and 1.6 to 1.8 for string-matching algorithms such as LZW and PNG, through 2.2 for wavelet and fractal methods such as STiNG and LuraWave, up to 2.5 for PhotoJazz. The present invention, with a mean compression power of around 2.2, is thus comparable to the stronger systems of prior art. In contrast, there is no theoretical limit to the compressive power of lossy algorithms, although achieving infinite compressive power, by discarding all specific image information, forfeits all ability to distinguish between different images. Lossily compressed images are generally suitable only for direct viewing, and not amenable to further processing. Lossless compression is preferred or required for images that are difficult or impossible to replace or may undergo further processing, such as medical, scientific, satellite, and many other digital images. Lossless compression is also required to avoid cumulative loss in the storage of intermediate images in editing and other multistep processing.
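Compression power, as used in the figures above, is simply the size of the original representation divided by the size of the compressed representation. A minimal illustration, with hypothetical sizes chosen for the example:

```python
# Hypothetical 640x480 image with 3 components of 8 bits each per pixel.
original_bytes = 640 * 480 * 3        # 921,600 bytes uncompressed

# Hypothetical compressed size; chosen only to illustrate the arithmetic.
compressed_bytes = 409_600

# Compression power (compression ratio): original size / compressed size.
power = original_bytes / compressed_bytes
assert round(power, 2) == 2.25        # in the text's terms, "around 2.2"
```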
A digital image specifies a discrete pixel value as a function of two discrete spatial variables, commonly referred to as the vertical and horizontal dimensions. Sometimes, such as in scanning for tomography, a third spatial dimension, often referred to as the longitudinal dimension, is added, the longitudinal sequence of slices constituting a 3-dimensional image. Often a time dimension is added, yielding a moving image. The pixel value itself may be a scalar, yielding a one-component image, or a vector, yielding a multicomponent image; for visual images, these are commonly known as monochrome (or grey-scale) and polychrome (or multispectral) images, respectively, the most common case being a trichrome image with spectral components RGB (Red, Green, Blue), corresponding to the spectral resolution of the human visual system. Often the pixel also includes relatively unrelated channels, such as alpha channels or spot-color channels.
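The dimensional structure described above maps naturally onto multidimensional arrays. As a sketch, assuming NumPy and hypothetical image dimensions chosen only for illustration:

```python
import numpy as np

# Hypothetical dimensions, chosen only for illustration.
height, width = 480, 640

# One-component (monochrome / grey-scale) image: two spatial dimensions.
mono = np.zeros((height, width), dtype=np.uint8)

# Trichrome (RGB) image: a 3-vector pixel value at each spatial position.
rgb = np.zeros((height, width, 3), dtype=np.uint8)

# Trichrome image with an additional, relatively unrelated alpha channel.
rgba = np.zeros((height, width, 4), dtype=np.uint8)

# Moving image: a time dimension added, here 30 trichrome frames.
movie = np.zeros((30, height, width, 3), dtype=np.uint8)
```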
In lossless digital image compression, the pixels are often conceptually compressed and decompressed one at a time, in a particular scan order, such as early to late (for a moving image), front to back (for a 3-d image), top to bottom, and left to right. Within a pixel, however, the components are almost always compressed and decompressed independently. In compression, imposing a scan order permits prior image and residue samples to serve as a causal context for prediction filters and probability models. In some schemes, the image is tiled into independently compressed blocks of data, facilitating random access to portions of the image and enhancing the parallelizability of the compression process, but at the cost of reduced compressibility.
Some lossless image compressors of prior art use a quantitative predictor to improve the performance of a subsequent encoder which encodes the prediction residue instead of the original image. For example, the current lossless image-compression standards TIFF-LZW, JPEG-DPCM, and PNG all optionally use a simple predictor (such as the value of the corresponding component of the preceding pixel) prior to encoding. The use of a slightly larger context, where the predicted value is a numerical combination of contextual values, is also known. For example, PNG optionally uses either 1 or a combination of 3 contextual samples, JPEG-DPCM offers a choice of 7 predictors, using either 1 or a combination of 2 or 3 contextual samples, and JPEG-LS uses a combination of 4 contextual samples. In some predictors, the numerical combination is based on analysis of the entire image (for example, by one-dimensional or separable two-dimensional autoregression) or a substantial portion thereof, such as a scanline (for example, by trying several different ones and choosing the one yielding the highest compression); in others, the numerical combination is fixed, on the basis of similar analysis of representative data.
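The simplest predictor mentioned above, the value of the corresponding component of the preceding pixel, can be sketched as follows. This is a minimal illustration of previous-sample (DPCM-style) prediction and residue coding, not the exact definition from any of the standards named above:

```python
def dpcm_encode(samples):
    """Replace each 8-bit sample by its residue from the previous sample.

    The first sample is predicted as 0; residues wrap modulo 256 so that
    each residue still fits in one byte and the process stays reversible.
    """
    prev = 0
    residues = []
    for s in samples:
        residues.append((s - prev) % 256)
        prev = s
    return residues

def dpcm_decode(residues):
    """Invert dpcm_encode exactly, reconstructing the original samples."""
    prev = 0
    samples = []
    for r in residues:
        prev = (prev + r) % 256
        samples.append(prev)
    return samples

# One component along a scanline: smooth imagery yields residues near zero,
# which a subsequent entropy coder can encode in fewer bits.
row = [100, 101, 103, 103, 102]
res = dpcm_encode(row)
assert res == [100, 1, 2, 0, 255]     # small steps; 255 is -1 mod 256
assert dpcm_decode(res) == row        # lossless round trip
```

The point of the prediction stage is visible in the residues: correlated neighboring samples collapse to values clustered near zero, which the subsequent encoder can represent more compactly than the raw samples.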
JPEG-DPCM was internationally adopted in 1994 as part of ISO/IEC DIS 10918-1. JPEG-LS, formerly known as LOCO, and currently under consideration as a standard by ISO/IEC JTC1/SC29/WG1, is described in U.S. Pat. No. 5,680,129, “System and Method for Lossless Image Compression”, by Weinberger, Seroussi, and Sapiro. PNG was internationally adopted in 1996 by the World Wide Web Consortium. The LZW technique on which TIFF-LZW is based is taught by U.S. Pat. No. 4,558,302, “High Speed Data Compression and Decompression Apparatus and Method”, by Welch.
Lossy image compressors of prior art, in contrast, commonly compress an entire tile, or group of pixels, at a time, by comparing it to other tiles. In vector-quantization compressors, a distance measure is used to compare the tile to a set of known tiles, such as prior tiles or tiles in a codebook, and a single best representative is chosen. For example, in the Pyx moving-image compressor, as disclosed in U.S. Pat. No. 5,734,744, “Method and Apparatus for Compression and Decompression of Color Data”, by Wittenstein, Hourvitz, et al., tiles which have changed sufficiently from the base frame and the previous frame are encoded by the index of the closest-matching tile in a dynamic tile table. In transform compressors, on the other hand, a correlation measure is used to compare the tile to a set of abstract basis tiles, and the tile is encoded as a numerical combination of some of those basis tiles. For example, the current lossy still-image compression standard, JPEG-DCT, and the current lossy moving-image compression standard, MPEG, both use a discrete cosine transform, in which the tile is encoded as a weighted combination of cosine tiles, where the weights are determined from the correlation of the encoded tile with the cosine tiles. JPEG-DCT was internationally adopted in 1994 as part of ISO/IEC DIS 10918-1.
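The transform idea can be sketched in one dimension: each weight is the correlation of the tile with a cosine basis function, and the tile is reconstructed as the weighted combination of those basis functions. This is the orthonormal DCT-II on a hypothetical 8-sample tile, an illustration of the principle rather than the exact normalization or 2-D form used by JPEG or MPEG:

```python
import math

N = 8  # tile length, chosen to match the 8-sample blocks common in practice

def dct(tile):
    """Weights of the tile against cosine basis functions (orthonormal DCT-II)."""
    coeffs = []
    for k in range(N):
        scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
        coeffs.append(scale * sum(tile[n] * math.cos(math.pi * (n + 0.5) * k / N)
                                  for n in range(N)))
    return coeffs

def idct(coeffs):
    """Reconstruct the tile as the weighted combination of the basis functions."""
    tile = []
    for n in range(N):
        s = 0.0
        for k in range(N):
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            s += scale * coeffs[k] * math.cos(math.pi * (n + 0.5) * k / N)
        tile.append(s)
    return tile

tile = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical sample values
coeffs = dct(tile)
restored = idct(coeffs)
assert all(abs(a - b) < 1e-9 for a, b in zip(tile, restored))
```

In the lossy standards the loss comes not from the transform itself, which as shown round-trips exactly (up to arithmetic precision), but from the subsequent quantization of the weights.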
With care, a transform compressor can be made perfectly reversible, in which case the method can be applied for lossless compression. For example, the wavelet compressor disclosed in U.S. Pat. No. 5,748,786, “Apparatus for Compression Using Reversible Embedded Wavelets”, by Zandi, Allen, Schwartz, and Boliek, can losslessly compress images.
Some lossless image compressors of prior art, such as PhotoJazz, JPEG-LS, and Sunrise, reduce the coding context using topologically constrained splitting or clustering methods, in which contiguous neighborhoods of values in native context space are mapped to single values in the reduced space.
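Such topologically constrained context reduction can be sketched as a quantizer that collapses contiguous ranges of a contextual value to single bucket indices, loosely in the spirit of the gradient quantization used by JPEG-LS. The bucket boundaries below are illustrative assumptions, not normative values from any of the systems named above:

```python
def quantize_gradient(d):
    """Map a signed contextual gradient to one of 9 buckets.

    Contiguous neighborhoods of gradient values in the native context
    space collapse to a single value in the reduced space, shrinking the
    number of distinct coding contexts the compressor must model.
    """
    if d <= -21: return -4
    if d <= -7:  return -3
    if d <= -3:  return -2
    if d <= -1:  return -1
    if d == 0:   return 0
    if d < 3:    return 1
    if d < 7:    return 2
    if d < 21:   return 3
    return 4

# All gradients from -2 to -1 share one context; all from 7 to 20 share another.
assert quantize_gradient(-1) == quantize_gradient(-2) == -1
assert quantize_gradient(7) == quantize_gradient(20) == 3
```

Reducing, say, a 256-value gradient to 9 buckets lets the probability model gather meaningful statistics per context from far fewer samples, at the cost of merging contexts whose statistics differ slightly.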
One of the chief disadvantages of existing prior-art lossless and lossy image compression schemes alike is their slowness. Even on top-of-the-line personal computers, existing lossless compressors, including PNG, JPEG-LS, PhotoJazz, STiNG, and JPEG-2000, are currently 10 to 20 times too slow for standard-definition video rates such as NTSC and PAL. Likewise, existing high-quality lossy compressors such as PhotoJPEG, MotionJPEG, DV, and Sorenson are 3 to 100 times too slow at their highest quality settings for real-time editing on commodity hardware. For high-definition video, prior-art compression schemes for photoquality images are tens or hundreds of times too slow.