The present invention relates to improvements for generating a minimum length code word stream for efficient transmission or storage of two dimensional gray scale image data utilizing the concepts of adaptive differential pulse code modulation.
Gray scale images are stored as rectangular arrays of numbers corresponding to the intensity values at each picture element (pel). Usually, each intensity value is given by a byte (8 bits). This allows 256 possible values, which is enough for excellent image representation if the sampling is sufficiently dense.
Coding for data compression is needed to reduce the length of the bit-stream representing the image in cases of limited storage or transmission capacity. The compression ratio can be greatly increased if one allows "lossy compression", i.e., if the image that the decoder reconstructs from the received bit-stream is not identical to the original one. For most practical purposes, such a distortion is allowed if it can barely be detected by an observer. A complicated trade-off exists between the achievable compression, the complexity of the coding algorithm, and the distortion between the original and reconstructed images.
Most algorithms known in the art for gray scale compression are forms of either transform coding or Differential Pulse Code Modulation (DPCM). For large compression ratios, DPCM is considered to give a poorer quality of reproduced image but is less complex than transform coding. The reason for the inferior quality of DPCM is mainly that each pel is encoded using the same number of bits (which cannot be less than 1), while images contain some segments in smooth areas that need less than 1 bit/pel for adequate coding.
In conventional DPCM schemes the image is encoded recursively, in raster scan order, usually from left to right. Thus, when the "current pel" is encoded, all pels of the previous scan lines, as well as all pels to the left of the current pel in the same scan line, have been encoded; this means that the decoder, when reaching the location of the current pel, will have access to all intensity values of the reconstructed image at these history pels. In FIG. 1, four of these locations are shown. The values A, B, C and D are used to denote the reconstructed intensity values (different from the actual ones, which are unknown to the decoder). The actual intensity value of the current pel is denoted by X, while the reconstructed one is denoted by Y. The upper or lower-case subscripts a, b, c and d will be used to imply the locations of history pels with reconstructed intensity values A, B, C and D, respectively, while the upper-case subscript X will imply the location of the current pel.
A fixed number, K, of bits is used to encode the value of the current pel by quantizing it into one of $2^K$ possible quantization levels centered around a predicted value. A typical simple choice for the predictor (PRED) is:

$$\mathrm{PRED} = A + \frac{C - B}{2} \qquad (1)$$
A DPCM algorithm of, say, 2 bits/pel will encode the value X into the quantized one Y, which will be the one among $\mathrm{PRED} - Q_2$, $\mathrm{PRED} - Q_1$, $\mathrm{PRED} + Q_1$, $\mathrm{PRED} + Q_2$ which is closest to X. The quantities $Q_1$ and $Q_2$ are known and specify the quantization characteristics. (See FIG. 2).
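The recursion described above can be illustrated with a minimal Python sketch. It is not taken from the specification and rests on several assumptions: the history pels are laid out with A to the left, B to the upper left and C directly above the current pel (FIG. 1 is not reproduced here); pels outside the image are replaced by a hypothetical constant `border`; the values chosen for Q1 and Q2 are arbitrary; and ties between two equally close levels go to the lower level. All function names are illustrative.

```python
def predict(a, b, c):
    """Predictor of Eq. (1): PRED = A + (C - B) / 2."""
    return a + (c - b) / 2


def candidate_levels(pred, q1, q2):
    """The four reconstruction levels of a 2 bits/pel quantizer."""
    return [pred - q2, pred - q1, pred + q1, pred + q2]


def quantize_2bit(x, pred, q1, q2):
    """Return (code, y): the 2-bit index and the level closest to x."""
    levels = candidate_levels(pred, q1, q2)
    code = min(range(4), key=lambda i: abs(levels[i] - x))
    return code, levels[code]


def history(recon, r, c, border):
    """Reconstructed values A, B, C around pel (r, c).

    Assumed layout: A = left, B = upper left, C = above.
    Missing neighbors fall back to `border` (an assumption).
    """
    a = recon[r][c - 1] if c > 0 else border
    b = recon[r - 1][c - 1] if r > 0 and c > 0 else border
    c_above = recon[r - 1][c] if r > 0 else border
    return a, b, c_above


def clamp(y):
    """Keep reconstructed values inside the 8-bit range."""
    return min(255, max(0, y))


def encode_image(image, q1=2, q2=8, border=128):
    """Raster-scan DPCM encoder; q1 and q2 are illustrative values."""
    rows, cols = len(image), len(image[0])
    codes = [[0] * cols for _ in range(rows)]
    recon = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            pred = predict(*history(recon, r, c, border))
            codes[r][c], y = quantize_2bit(image[r][c], pred, q1, q2)
            recon[r][c] = clamp(y)  # the encoder tracks what the decoder will see
    return codes, recon


def decode_image(codes, q1=2, q2=8, border=128):
    """Decoder: rebuilds the image from codes using only history pels."""
    rows, cols = len(codes), len(codes[0])
    recon = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            pred = predict(*history(recon, r, c, border))
            recon[r][c] = clamp(candidate_levels(pred, q1, q2)[codes[r][c]])
    return recon
```

The encoder predicts from reconstructed values rather than from the original ones, so that the decoder, which never sees X, forms exactly the same prediction at every pel.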
A set of quantization characteristics around the predicted value which gives excellent quality encoding using a 5 bit DPCM system for an original 8 bit/pixel image is 0, 2, 5, 8, 13, 20, 27, 36, 47, 59, 72, 86, 100, 116 and 128 (with one sign bit and 4 magnitude bits) as reported by Sharma, D. K. and Netravali, A. N. (1977) IEEE Trans. Commun. Com-25, 1267-1274. These values should be considered as illustrative only. If the whole range of 256 gray levels is used and the levels are normalized according to the discriminatory capability of the human visual system, they could be arranged on a coarser scale. These particular values are based on psychophysical experiments.
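As a rough illustration of how such a sign-plus-magnitude table might be applied, the following Python sketch maps a prediction error (X minus the predicted value) to the nearest listed magnitude. It assumes that the listed numbers are magnitudes of the reconstructed prediction error and that the sign bit is 1 for negative errors; neither convention is stated in the text. It uses only the 15 magnitudes given above, although 4 magnitude bits could index 16. The function names are hypothetical.

```python
# Magnitudes quoted in the text (illustrative values only).
MAGNITUDES = [0, 2, 5, 8, 13, 20, 27, 36, 47, 59, 72, 86, 100, 116, 128]


def quantize_error(error):
    """Return (sign_bit, magnitude_index) for a prediction error.

    Assumption: sign_bit is 1 for negative errors, 0 otherwise.
    Ties between two magnitudes go to the smaller one.
    """
    sign_bit = 1 if error < 0 else 0
    size = abs(error)
    index = min(range(len(MAGNITUDES)), key=lambda i: abs(MAGNITUDES[i] - size))
    return sign_bit, index


def dequantize_error(sign_bit, index):
    """Return the reconstructed prediction error for a code."""
    value = MAGNITUDES[index]
    return -value if sign_bit else value
```

For instance, an error of -11 lies between the magnitudes 8 and 13 and is closer to 13, so it would be reconstructed as -13. The spacing between levels grows with magnitude, which gives fine steps for small errors and coarse steps for large ones.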
Many extensions of DPCM schemes have been proposed in which the predictor and/or quantizer are adaptive to the local characteristics of the image (based on information from the image history, which must, of course, be known to the decoder). The justification for using an adaptive approach is that in areas that are not smooth, nonlinear predictors work better than simple ones like Eq. (1); also, the discriminatory capability of the visual system is reduced in non-smooth areas, allowing a coarser quantization than one which is suitable for smooth areas. These adaptive schemes require at least 4 bits/pel (the 8 quantization levels per pel that 3 bits would provide are insufficient) for excellent quality coding.