1. Field of the Invention
The present invention relates to bandwidth reduction techniques for encoding images which preserve image edges with high contrast.
2. Description of the Prior Art
Many bandwidth compression techniques which have been applied to processing images can be characterized as low pass filters. High bandwidth compression rates yield images with reduced spatial resolution or sharpness. In general, the sharpness in an image is a function of high contrast edges.
Extensive studies on the effects of edges on the sensitivity of human visual perception to luminance differences have been performed. See T. N. Cornsweet, Visual Perception, New York: Academic Press, pp. 270-276, 1970; E. Aulhorn and H. Harms, "Visual Perimetry," in Handbook of Sensory Physiology, Vol. VII/4, D. Jameson and L. M. Hurvich, Ed., New York: Springer-Verlag, 1972; and C. H. Graham, "Visual Form Perception," in Vision and Visual Perception, C. H. Graham, Ed., New York: John Wiley, 1965. The implication is that the presence of these edges is important to the overall subjective quality of the image but their fidelity is not. This characteristic has been used in the development of adaptive quantizers for image compression. See A. N. Netravali and B. Prasada, "Adaptive Quantization of Picture Signals Using Spatial Masking", Proc. IEEE, Vol. 65, pp. 536-548, April 1977.
Research results from psychophysical and physiological investigations of the human visual system have been used to develop a model of the human visual system and to apply it to the encoding problem. See C. F. Hall and E. L. Hall, "A Nonlinear Model for the Spatial Characteristics of the Human Visual System," IEEE Trans. Systems, Man and Cybernetics, Vol. SMC-7, No. 3, pp. 161-170, March 1977; C. F. Hall, Digital Color Image Compression in a Perceptual Space, Ph.D. Dissertation, University of Southern California, USCIPI Report 790, February 1978; and C. F. Hall, "The Application of Human Visual System Models to Digital Color Image Compression," Proc. IEEE International Conf. on Communications, Boston, 1983. The basic objective of the aforementioned three studies was to put any encoding noise where it could not be seen. When difference images of the original and encoded/decoded images were computed, the noise was concentrated around the high contrast edges.
Several bandwidth compression techniques have been applied to the processing of images. See C. F. Hall, Digital Color Image Compression in a Perceptual Space, Ph.D. Dissertation, University of Southern California, USCIPI Report 790, February 1978; C. F. Hall, "The Application of Human Visual System Models to Digital Color Image Compression," Proc. IEEE International Conf. on Communications, Boston, 1983; E. L. Hall, Computer Image Processing and Recognition, New York: Academic Press, 1979; A. Habibi and P. A. Wintz, "Image Coding by Linear Transformation and Block Quantization," IEEE Trans. Commun. Tech., Vol. COM-29, No. 1, pp. 50-62, January 1971; A. N. Netravali and J. O. Limb, "Picture Coding: A Review," Proc. IEEE, Vol. 68, No. 3, pp. 366-406, March 1980; and A. K. Jain, "Image Data Compression: A Review," Proc. IEEE, Vol. 69, No. 3, pp. 349-389, March 1981.
The encoding process can be thought of as a three-step process which involves a mapper, a quantizer, and an encoder. The purpose of the mapper is to transform the pixel data into another domain where the efficiency of the quantizer is enhanced, so that fewer bits are required to encode the data. The quantizer performs the bit reduction task by assigning the mapped data to a smaller number of possible values than are contained in the input. Finally, the encoder assigns a code word to each of the quantizer output values.
The time discrete, amplitude discrete representation of data is referred to as pulse code modulation (PCM). This technique, in its simplest form, involves the sampling of an analog signal at a uniform rate (the mapper), mapping these samples to one of N equally spaced values (the quantizer), and assigning a unique binary representation to each possible quantizer value (the encoder). This technique requires 6-7 bits per pixel for most images. PCM usually precedes the more sophisticated forms of encoding and may, for example, require 8 bits to encode individual pixels. The PCM technique is simply an analog to digital conversion and is relatively inefficient since no attempt is made to use any redundancy in the data. Imagery is highly correlated, and several types of mappers designed to take advantage of this redundancy have been used, such as differencing, orthogonal transforming and run length encoding.
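The quantizer and encoder steps of simple PCM described above can be sketched as follows. This is only an illustrative sketch: the function names pcm_encode and pcm_decode, the assumed input range [lo, hi], and the choice of reconstructing at the midpoint of each interval are assumptions made for the example and are not taken from the cited references.

```python
def pcm_encode(samples, bits, lo=0.0, hi=1.0):
    """Map each sample in [lo, hi] to one of 2**bits equally spaced levels
    and return a fixed-length binary code word for each level index."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    codes = []
    for s in samples:
        # Quantizer: index of the uniform interval containing the sample,
        # clipped so that out-of-range samples use the end intervals.
        idx = int((s - lo) / step)
        idx = max(0, min(levels - 1, idx))
        # Encoder: a unique fixed-length binary word per quantizer level.
        codes.append(format(idx, "0{}b".format(bits)))
    return codes


def pcm_decode(codes, bits, lo=0.0, hi=1.0):
    """Reconstruct each sample at the midpoint of its quantizer interval
    (midpoint reconstruction is an assumption of this sketch)."""
    step = (hi - lo) / (2 ** bits)
    return [lo + (int(c, 2) + 0.5) * step for c in codes]
```

With 3 bits, for example, the samples 0.0, 0.5 and 1.0 map to the code words 000, 100 and 111, and each in-range sample is reconstructed to within half a quantizer step of its original value.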
The potential value of encoding pixel differences is apparent when a histogram of adjacent pixel differences is computed. Input amplitudes for a typical image may range up to 256 gray levels, whereas the difference range is about 16 levels. Thus, the possible reduction in word size alone (from 8 bits to 4 bits) can yield a 2:1 compression. A practical implementation of this approach is the differential pulse code modulator (DPCM). In DPCM the difference between the current pixel value and a predicted value is quantized and encoded. The mapper in this case consists of a predictor and a differencing operation. Several variations are possible. The predictor may be one-, two- or three-dimensional, linear or non-linear, adaptive or nonadaptive, and may use one or more pixels in each direction to form the estimate. For single frame images, a simple, two-dimensional, linear, nonadaptive predictor based on the previous pixel and the pixel above the current value to be predicted works quite well. DPCM is limited to a minimum average bit rate of one bit/pixel (sometimes referred to as Delta Modulation).
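A minimal sketch of a DPCM loop with a two-dimensional, linear, nonadaptive predictor of the kind described above is given below. The equal weighting of the previous pixel and the pixel above, the uniform step size of 8, the 15 output levels, the starting value of 128, and all function names are assumptions chosen for illustration only. The predictor operates on reconstructed values so that the decoder can form the same predictions from the transmitted indices.

```python
def predict(left, up, start):
    """Average of the previous pixel and the pixel above, falling back to
    whichever neighbor exists (or a fixed start value at the first pixel)."""
    if left is not None and up is not None:
        return (left + up) // 2
    if left is not None:
        return left
    if up is not None:
        return up
    return start


def quantize(diff, step, levels):
    """Uniform quantizer for the prediction error, clipped to `levels` outputs."""
    half = levels // 2
    return max(-half, min(half, int(round(diff / step))))


def dpcm_encode(image, step=8, levels=15, start=128):
    rows, cols = len(image), len(image[0])
    indices = [[0] * cols for _ in range(rows)]
    recon = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            left = recon[r][c - 1] if c > 0 else None
            up = recon[r - 1][c] if r > 0 else None
            pred = predict(left, up, start)
            q = quantize(image[r][c] - pred, step, levels)
            indices[r][c] = q  # value handed to the encoder
            recon[r][c] = max(0, min(255, pred + q * step))
    return indices, recon


def dpcm_decode(indices, step=8, start=128):
    rows, cols = len(indices), len(indices[0])
    recon = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            left = recon[r][c - 1] if c > 0 else None
            up = recon[r - 1][c] if r > 0 else None
            pred = predict(left, up, start)
            recon[r][c] = max(0, min(255, pred + indices[r][c] * step))
    return recon
```

For a scan line containing a step from gray level 100 to gray level 200, the clipped quantizer in this sketch needs two samples to reach the new level, because the largest representable difference is 7 x 8 = 56 gray levels.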
The linear transformation mappers have a minimum rate restriction based on block size. The image is partitioned into sub-images and each sub-image is transformed into a block of coefficients which are uncorrelated. This permits quantization of each coefficient on an independent basis. Transformations which pack information into a small number of coefficients make large rate reductions possible. The discrete Fourier and cosine transforms have been used and rates as low as 0.1 bit/pixel have been reported. See C. F. Hall, Digital Color Image Compression in a Perceptual Space, Ph.D. Dissertation, University of Southern California, USCIPI Report 790, February 1978 and C. F. Hall, "The Application of Human Visual System Models to Digital Color Image Compression," Proc. IEEE International Conf. on Communications, Boston, 1983. Most transform encoders delete high frequency coefficients which have low information content (usually established by some type of variance criterion). As a result, they can be modeled as low pass filters.
With a run length encoder as a mapper, the sequence of pixel values along a scan is mapped into a sequence of pairs. Each pair, in sequence, denotes the current gray level value and the number of contiguous pixels (run length) with that value. Highly correlated data produce long run lengths and concomitant rate reductions. This procedure works well on bi-level imagery which contains large runs of black and/or white (for example, printed text or fingerprints).
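The run length mapping described above can be sketched in a few lines. The function names and the representation of each pair as a (gray level, run length) tuple are illustrative assumptions.

```python
def run_length_encode(scan):
    """Map a scan line of pixel values to (gray level, run length) pairs."""
    pairs = []
    for value in scan:
        if pairs and pairs[-1][0] == value:
            pairs[-1][1] += 1          # extend the current run
        else:
            pairs.append([value, 1])   # start a new run
    return [tuple(p) for p in pairs]


def run_length_decode(pairs):
    """Expand (gray level, run length) pairs back into a scan line."""
    scan = []
    for value, length in pairs:
        scan.extend([value] * length)
    return scan
```

A bi-level scan such as 0, 0, 0, 1, 1, 0 is mapped to the pairs (0, 3), (1, 2), (0, 1). The mapping is only beneficial when runs are long; a scan in which every pixel differs from its neighbor produces one pair per pixel.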
A quantizer is a device which forces each input value to one of a limited number of output values. The optimal design, based on the statistics of the data and mean square error (mse), is the MAX quantizer. See J. Max, "Quantizing for Minimum Distortion," IRE Trans. Info. Th., Vol. IT-6, No. 1, pp. 7-12, March 1960. If the data are uniformly distributed, the reconstruction levels are equally spaced. Other distributions will yield output levels that have the smallest step sizes in the most probable value regions.
Within the DPCM encoder, the major portion of the compression is obtained in the quantizer stage since the difference signal is encoded with fewer bits than the input signal contained. Three types of degradations can be generated in the imagery as a result of this approach: granular noise, edge busyness, and slope overload. If the quantizer's steps are too large, the coarse quantization will add random (or granular) noise to regions of constant gray level. If the step size is made small to minimize this problem, the high contrast edges will require several samples for the output to follow the input. This is referred to as slope overload and it results in smoothed edges, that is, a low pass filtering effect. Edge busyness occurs when the contrast of an edge changes slowly and the quantizer output dithers about the input value. The selection of a fixed step size usually requires a compromise in one or all of these areas. Ideally, one would like small steps in areas of relatively constant gray level and larger steps in high contrast areas. The probability density for the error signal in DPCM can be approximated by a Laplacian density. The optimal quantizer in terms of mse has steps that increase in size as the input increases in magnitude. Thus, granular noise and slope overload effects are minimized. It should be noted that the number of output levels remains fixed for all mapper output values in the DPCM case.
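The behavior of a minimum mse quantizer on uniform and Laplacian data can be illustrated with the sketch below, which designs the quantizer from sample data by the iterative procedure commonly known as the Lloyd-Max algorithm (decision thresholds at the midpoints between output levels, output levels at the centroids of their cells). The use of sample data in place of an analytic density, the quantile-based starting levels, the iteration count, and the function names are assumptions of this sketch.

```python
import math


def lloyd_max(samples, n_levels, iterations=100):
    """Design n_levels reconstruction levels that locally minimize the
    mean square error over the given samples."""
    data = sorted(samples)
    n = len(data)
    # Starting levels: evenly spaced sample quantiles (an assumption).
    levels = [data[int((k + 0.5) * n / n_levels)] for k in range(n_levels)]
    for _ in range(iterations):
        # Decision thresholds lie midway between adjacent output levels.
        thresholds = [(a + b) / 2.0 for a, b in zip(levels, levels[1:])]
        sums = [0.0] * n_levels
        counts = [0] * n_levels
        cell = 0
        for x in data:  # data are sorted, so the cell index only advances
            while cell < n_levels - 1 and x >= thresholds[cell]:
                cell += 1
            sums[cell] += x
            counts[cell] += 1
        # Each output level moves to the centroid of the samples in its cell.
        levels = [sums[k] / counts[k] if counts[k] else levels[k]
                  for k in range(n_levels)]
    return levels


def laplacian_samples(n):
    """Deterministic samples of a unit-scale Laplacian density, obtained by
    evaluating the inverse distribution function on a regular grid."""
    out = []
    for i in range(n):
        u = (i + 0.5) / n - 0.5  # lies strictly inside (-0.5, 0.5)
        out.append(math.copysign(-math.log(1.0 - 2.0 * abs(u)), u))
    return out
```

For uniformly spaced data on the unit interval, a four-level design returns output levels near 0.125, 0.375, 0.625 and 0.875, which are equally spaced. For the Laplacian samples, the spacing between the two inner levels is smaller than the spacing between each inner level and its outer neighbor, consistent with steps that increase in size as the input increases in magnitude.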
Unlike DPCM, transform encoders use multiple level quantizers. Indeed, the bulk of the compression is realized by not transmitting any value (or at least by transmitting zeros) for a large number of transform coefficients. Given the total number of bits to be used, the idea is to allocate them in a way that minimizes total distortion. The problem becomes one of selecting the best bit map. Once the bit allocation is determined, a MAX quantizer can be designed for each of the transform coefficients. The Gaussian density is a good model for most coefficients, except the DC term, which is more appropriately modeled by a Rayleigh density.
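One common way to select a bit map, shown here only as a hedged sketch, is a greedy allocation which repeatedly gives one more bit to the coefficient that currently has the largest distortion. The sketch assumes that the distortion of a coefficient equals its variance at zero bits and falls by a factor of four for each added bit. This is a standard approximation and is not a model taken from the cited references. The function name allocate_bits is likewise hypothetical.

```python
def allocate_bits(variances, total_bits):
    """Greedily assign total_bits among transform coefficients.

    Assumed distortion model: distortion = variance * 2**(-2 * bits).
    """
    bits = [0] * len(variances)
    distortion = list(variances)
    for _ in range(total_bits):
        # Give the next bit to the coefficient with the largest distortion.
        k = max(range(len(distortion)), key=distortion.__getitem__)
        bits[k] += 1
        distortion[k] /= 4.0  # one extra bit quarters the distortion
    return bits
```

With coefficient variances of 64, 16, 4 and 1 and a budget of 6 bits, this sketch returns the bit map 3, 2, 1, 0. The lowest variance coefficient receives no bits and so would not be transmitted.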
A good synopsis of the work reported in the literature is set forth in A. N. Netravali and J. O. Limb, "Picture Coding: A Review," Proc. IEEE, Vol. 68, No. 3, pp. 366-406, March 1980 and A. K. Jain, "Image Data Compression: A Review," Proc. IEEE, Vol. 69, No. 3, pp. 349-389, March 1981.
U.S. Pat. Nos. 4,096,526, 4,096,527, 4,344,086, 4,420,771, 4,476,495 and United Kingdom patent 2,035,747 disclose various bandwidth reduction techniques. U.S. Pat. Nos. 4,096,526 and 4,096,527 address the problem of handling flag bits in a run length encoding system. U.S. Pat. No. 4,344,086 discloses a predictive encoder for generating an error output signal which is subjected to run length encoding. U.S. Pat. No. 4,420,771, which is similar to U.S. Pat. No. 4,344,086, discloses a DPCM encoder which produces an error signal which is subjected to run length encoding. United Kingdom patent 2,035,747 discloses separate run length encoders for each bit plane.