The present invention relates generally to raster image compression and decompression.
Image compression is not a new technique. Many different schemes have existed for many years, and include such techniques as run length encoding, Huffman style variable bit length encodings, and mathematical transformation methods. Each of these achieves a different compression ratio (i.e., the ratio of the original image size to the compressed image size). Some, especially the mathematical transformation methods, cause loss of data to some degree: the compressed image cannot be decompressed to yield an exact, bit-for-bit copy of the original image.
1. Run Length Encoding. This method makes one pass through a data file, looking for repeated pixel values (i.e., colors). Repeated sequences are represented in the output file using a descriptor which contains the "value" and a "count" of the number of times the value is repeated sequentially at that point within the image. A limitation of this technique is that it does not work well on raster graphics animations unless the images are fairly simplistic.
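The single-pass "value" and "count" scheme described above can be illustrated with a minimal sketch. The function names and the list-of-tuples representation are assumptions made for illustration only; they are not taken from any of the cited schemes.

```python
from typing import List, Tuple

def rle_encode(pixels: List[int]) -> List[Tuple[int, int]]:
    """Collapse each run of identical pixel values into a (value, count) pair."""
    runs: List[Tuple[int, int]] = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            # Same value as the current run: extend its count.
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            # A new value starts a new run.
            runs.append((value, 1))
    return runs

def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, count) pairs back into the original pixel sequence."""
    pixels: List[int] = []
    for value, count in runs:
        pixels.extend([value] * count)
    return pixels

scanline = [7, 7, 7, 7, 3, 3, 9]
encoded = rle_encode(scanline)      # three runs for seven pixels
noisy = rle_encode([1, 2, 3, 4])    # four runs for four pixels: no gain
```

The second call shows the limitation noted above: when neighboring pixels rarely repeat, every pixel becomes its own descriptor and the output is larger than the input.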
2. Huffman Encoding. This method uses the well-known Huffman algorithm, which makes two passes over the data file. The first pass is used to generate a population count of individual bytes within an image. The second pass replaces each byte with a token whose size in bits is based upon the population counts determined in the first pass. Typically, the most populous byte is replaced with a single bit in the output file; the tokens used for successively less populous bytes are chosen by an algorithm which requires that the encoding for a particular value be unique, such that no token can be a prefix of another token. Limitations of this technique include: a) this algorithm is primarily useful for text files; it typically compresses raster image files by a factor of 2:1; b) the algorithm is slow, requiring two passes through the data; and c) the algorithm is inherently limited to a maximum compression of 8:1.
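The two passes described above can be sketched as follows. This is an illustrative construction using a priority queue, with hypothetical function names; the bit stream is kept as a string of "0" and "1" characters for readability rather than packed into bytes.

```python
import heapq
from collections import Counter
from typing import Dict

def huffman_code(data: bytes) -> Dict[int, str]:
    """First pass: count byte populations and build a prefix-free code table."""
    counts = Counter(data)
    if len(counts) == 1:
        # Degenerate input with one distinct byte still needs a 1-bit token.
        return {next(iter(counts)): "0"}
    # Heap entries are (population, tie_breaker, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        n1, _, c1 = heapq.heappop(heap)   # least populous subtree
        n2, _, c2 = heapq.heappop(heap)   # next least populous subtree
        merged = {s: "0" + bits for s, bits in c1.items()}
        merged.update({s: "1" + bits for s, bits in c2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]

def huffman_encode(data: bytes, code: Dict[int, str]) -> str:
    """Second pass: replace each byte with its variable-length token."""
    return "".join(code[b] for b in data)

def is_prefix_free(code: Dict[int, str]) -> bool:
    """Check that no token is a prefix of another (adjacent check on sorted tokens)."""
    tokens = sorted(code.values())
    return all(not b.startswith(a) for a, b in zip(tokens, tokens[1:]))

data = b"aaaaaaaabbbc"
code = huffman_code(data)
bits = huffman_encode(data, code)
```

Because every 8-bit byte is replaced by a token of at least one bit, the output can never be smaller than one eighth of the input, which is the 8:1 ceiling stated in limitation c).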
Typical compression ratios are in the 10:1 range for best case images using any of the common techniques. Huffman encodings have the additional disadvantage of requiring time consuming bit shifts and masks at the 1-bit level on both compression and decompression. This makes them less suitable for speedy processing.
None of these techniques typically exploit the particular characteristics of animated sequences. Nor do they to any great degree exploit modern computer architectures for vector or parallel processing. Thus there is a major need in the field for a modern algorithm which is lossless, tailored for high spatial and color resolution images, specifically designed to exploit the properties of raster animations, and which targets modern generations of supercomputers and workstations.
United States patents of interest include Tanaka U.S. Pat. No. 4,807,029, which describes a 2D mathematical transformation technique, followed by encoding for image compression. The precompression 2D transformation is a Hadamard orthogonal transformation applied to sub-image blocks of M×N pixels, which produces M×N blocks of transformed values. The compression encoding of the images uses an M×N bit allocation table which specifies the number of bits to retain for each of the M×N transformed values; this table allocates zero bits for block cells farther than an arbitrary threshold R from the zero frequency term (1,1) (that is, for cells (m,n) where $\sqrt{(m-1)^2 + (n-1)^2} > R$). The transformation step and bit allocation both introduce loss of information, preventing the exact reconstruction of the original image. This is the case for the general class of mathematical transformations applied to image compression.
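The radial cutoff in the bit allocation table can be illustrated with a short sketch. Only the zero-bit rule (cells farther than R from the (1,1) term) comes from the description above; the function name and the boolean-mask representation are assumptions, and the nonzero bit counts of the actual table are not modeled.

```python
import math
from typing import List

def zero_bit_mask(M: int, N: int, R: float) -> List[List[bool]]:
    """Return an M x N mask; True marks cells (m, n), 1-based, that receive zero bits.

    A cell is discarded when its Euclidean distance from the zero frequency
    term (1, 1) exceeds the threshold R.
    """
    return [
        [math.hypot(m - 1, n - 1) > R for n in range(1, N + 1)]
        for m in range(1, M + 1)
    ]

mask = zero_bit_mask(4, 4, 2.0)
# Count the transformed values that keep at least some bits.
retained = sum(not dropped for row in mask for dropped in row)
```

Every cell marked True is dropped entirely, so the corresponding transformed values cannot be recovered on decompression.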
Patent numbers Music et al U.S. Pat. Nos. 4,847,677 and 4,857,991 describe compression using run length encoding; line-to-line and frame-to-frame coherence are employed to reduce image size. RGB color values of an image are converted to color lookup table (palette) form as part of the compression using a popularity algorithm which selects the most frequently occurring colors (and rejects the least frequently occurring colors). Palette selection can cause loss of color information; since it is an integral part of compression, the compression is lossy.
These patents primarily describe an invention for transmission of video signals rather than computer generated images; speed of encoding, transmission, and decoding are therefore paramount. Loss of data is allowed. The run length encoding is strictly one dimensional, in keeping with the nature of a video signal. Additional signal processing, for example, the calculation and encoding of luminance information, is used to reduce the effects of noise. Calculation of the color lookup table and of luminance includes arbitrary thresholds which can be fixed or allowed to vary adaptively. Values which differ by an amount greater than the threshold are considered distinct; otherwise, the values are considered to be the same. All of these techniques are mechanisms to identify and reject image components subjectively considered insignificant to overall image quality: they do not provide for the exact reproduction of the original images.
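Why popularity-based palette selection loses color information can be seen in a minimal sketch. The function names are hypothetical, and mapping a rejected color to the nearest retained color by squared RGB distance is an assumption made for illustration; the patents' own mapping and thresholds are not reproduced here.

```python
from collections import Counter
from typing import List, Tuple

Color = Tuple[int, int, int]

def popularity_palette(pixels: List[Color], size: int) -> List[Color]:
    """Keep the `size` most frequently occurring colors; all others are rejected."""
    return [color for color, _ in Counter(pixels).most_common(size)]

def nearest(color: Color, palette: List[Color]) -> Color:
    """Assumed mapping: choose the palette entry with the smallest squared RGB distance."""
    return min(palette, key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)))

def to_indexed(pixels: List[Color], palette: List[Color]) -> List[int]:
    """Replace each RGB pixel with the index of its nearest palette entry."""
    index = {color: i for i, color in enumerate(palette)}
    return [index[nearest(color, palette)] for color in pixels]

pixels = [(255, 0, 0)] * 5 + [(0, 0, 255)] * 3 + [(250, 5, 0)]
palette = popularity_palette(pixels, 2)
indexed = to_indexed(pixels, palette)
restored = [palette[i] for i in indexed]
```

The rare color (250, 5, 0) does not survive the round trip: it is restored as (255, 0, 0), so the original image cannot be reproduced exactly.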
Patent number Ericsson U.S. Pat. No. 4,816,914 describes a block transform encoding of images, together with motion detection to permit frame-to-frame coherence to reduce the size of the encoded image sequence. The block transform computes discrete cosine transform coefficients for sub-image blocks of pixels; the coefficients are discretized to reduce the number of bits of precision, and subjected to a threshold value to reject coefficients of sufficiently small value. Each of these steps introduces loss of information. The quantization step is actively adjusted by means of motion prediction prior to encoding the frame. The encoding is performed with a quad-tree technique to select those coefficient values to represent the image.
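The loss introduced by discretizing and thresholding transform coefficients can be demonstrated with a simplified sketch. It uses a one-dimensional discrete cosine transform on a single row of eight pixels rather than two-dimensional blocks, and the step size and threshold are arbitrary illustrative values; none of this reproduces the patent's motion prediction or quad-tree encoding.

```python
import math
from typing import List

def dct(block: List[float]) -> List[float]:
    """Unnormalized DCT-II of a one-dimensional block."""
    n = len(block)
    return [
        sum(block[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        for k in range(n)
    ]

def idct(coeffs: List[float]) -> List[float]:
    """Inverse of dct() above (a scaled DCT-III)."""
    n = len(coeffs)
    return [
        (coeffs[0] / 2
         + sum(coeffs[k] * math.cos(math.pi * (i + 0.5) * k / n) for k in range(1, n)))
        * 2 / n
        for i in range(n)
    ]

def quantize(coeffs: List[float], step: float, threshold: float) -> List[float]:
    """Round each coefficient to a multiple of `step`, then zero the small ones."""
    out = []
    for c in coeffs:
        q = round(c / step) * step
        out.append(q if abs(q) >= threshold else 0.0)
    return out

pixels = [10, 12, 15, 11, 9, 8, 14, 13]
exact = idct(dct(pixels))                                   # transform alone
lossy = idct(quantize(dct(pixels), step=4.0, threshold=8.0))  # with quantization
```

Without the quantization step the transform is invertible up to floating point rounding; once coefficients are rounded or rejected, the reconstructed row differs from the original.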
Patent numbers Gonzales et al U.S. Pat. No. 4,725,885 and Anastassiou et al U.S. Pat. No. 4,369,463 describe a grayscale quantizer and entropy encoder for monochromatic images, using differential pulse code modulation techniques. The quantizer maps the full range of grayscale values to fewer values by dividing the full range into subsets of consecutive values, each subset being represented by a single quantized number. For example, if the grayscale range is 0 to 255, then a subset could be the range 1 to 17, represented by a quantized value of 12. The quantization causes loss of information by causing originally distinct pixel values to be represented by a single quantized value, preventing the reconstruction of the original values. Eliminating the quantization loss requires bypassing the quantization step, resulting in a significant reduction in the degree to which an image is compressed.
The quantized pixels are encoded using a Huffman style encoder, in which a varying number of bits is used to represent the quantized values. Information about the neighboring pixels is used to determine the surrounding context in which the pixel occurs, and to determine the extent to which the pixel value deviates from the surrounding context in both sign and magnitude. The quantization step is adaptive based on the surrounding context. The use of surrounding context is not the same as applying line-to-line coherence; frame-to-frame information is not used. These techniques are designed for the transmission of video signals, where loss of information is acceptable.
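A grayscale quantizer followed by differential coding can be sketched as below. The uniform step of 16 and the function names are assumptions chosen for illustration; the quantizers in the cited patents are adaptive and context dependent, and their entropy encoder is omitted here.

```python
from typing import List

STEP = 16  # assumed uniform bin width; the cited quantizers adapt this to context

def quantize_gray(value: int) -> int:
    """Map a grayscale value in 0..255 to the index of its bin of consecutive values."""
    return value // STEP

def dequantize_gray(level: int) -> int:
    """Represent every value in a bin by the bin's midpoint."""
    return level * STEP + STEP // 2

def dpcm_encode(levels: List[int]) -> List[int]:
    """Differential coding: emit each level as its difference from the previous one."""
    diffs, prev = [], 0
    for level in levels:
        diffs.append(level - prev)
        prev = level
    return diffs

def dpcm_decode(diffs: List[int]) -> List[int]:
    """Undo dpcm_encode() by accumulating the differences."""
    levels, prev = [], 0
    for d in diffs:
        prev += d
        levels.append(prev)
    return levels

row = [100, 102, 101, 130, 131]
levels = [quantize_gray(v) for v in row]
diffs = dpcm_encode(levels)
restored = [dequantize_gray(q) for q in dpcm_decode(diffs)]
```

The differential step is exactly reversible, but the quantizer is not: the distinct values 100, 102 and 101 all come back as 104.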
Patent number Moorhead et al U.S. Pat. No. 4,717,956 describes an image compression technique using motion compensation apparatus. Motion compensation involves predicting the displacement of an object point of a video scene, predicting the intensity value at the current pixel, and calculating the intensity difference to correct the initial displacement estimate. Information from neighboring pixels of one image, and from previous images, is used in the prediction step. More specific details about the encoding format for the output bit stream and the expected compression ratio are not given.
Patent number Campbell et al U.S. Pat. No. 4,580,134 describes an apparatus for generation of images within video games and other such devices. It specifically targets the circuitry in which an image is generated and stored, and the circuitry by which the video signal to drive a monitor is derived. This patent does not directly apply to the compression and encoding of images of arbitrary external sources.
Patent number Widergren et al U.S. Pat. No. 4,302,775 describes a block transform method for compression of digital images with adaptive control over the normalization of the number of retained bits of the transformation coefficients which represent the image. This technique addresses television signals, and a video broadcast system is described. The transform which has been chosen is the discrete cosine transformation in which the constant term coefficient is always transmitted with a fixed number of bits, and for which the low order coefficients are Huffman encoded with a predetermined Huffman table and long strings of zeros of the high order coefficients are run length encoded. The apparatus targets real time NTSC video frame rates for compression and decompression. The techniques of this patent are lossy, as is the case for general mathematical transform techniques.
Patent numbers Sakamoto et al U.S. Pat. No. 4,070,694 and Morrin, II U.S. Pat. No. 3,987,412 describe compression apparatus for compressing and transmitting 1 bit per pixel scanned images such as facsimile transmissions. Each pixel is represented by either 0 (black) or 1 (white). The algorithms and apparatus do not directly apply to grayscale or color images in which complexity increases exponentially over images at 1 bit per pixel.
In summary, the patents reviewed above all use one or more of the techniques of one dimensional run length encoding, Huffman encoding, discrete cosine transformations, and variable precision numeric representation.
The following papers are representative of the state of the art:
Huffman, David A.; "A Method for the Construction of Minimum-Redundancy Codes", Proceedings of the I.R.E., pages 1098-1101, September 1952.
Grosskipf Jr, George; "Generating Huffman Codes", Computer Design, pages 137-140, 1983.
Arps, Ronald A.; "Binary Image Compression", Advances in Electronics and Electron Physics, Supp. 12, pages 219-275 (Academic Press, Inc., 1979).
Haskell, Barry G.; "Frame Replenishment Coding of Television", Advances in Electronics and Electron Physics, Supp. 12, pages 189-217 (Academic Press, Inc., 1979).
Tescher, Andrew G.; "Transform Image Coding", Advances in Electronics and Electron Physics, Supp. 12, pages 113-155 (Academic Press, Inc., 1979).
Welch, Terry A.; "A Technique for High-Performance Data Compression", Computer, pages 8-19 (IEEE, 1984).
Rissanen, Jorma; "A Universal Data Compression System", IEEE Transactions on Information Theory, Vol. IT-29, No. 5, pages 656-664, September 1983.
Einarsson, Goran and Roth, Goran; "Data Compression of Digital Color Pictures", Computers & Graphics, Vol. 11, No. 4, pages 409-426, 1987.
Walker, P. A. and Grant, I. W.; "Quadtree: A Fortran Program to Extract the Quadtree Structure of a Raster Format Multicolored Image", Computers & Geosciences, Vol. 12, No. 4A, pages 401-410, 1986.