Data compression generally refers to encoding information so as to reduce the number of symbols required to represent it, thereby economizing on the storage and the time needed to transmit the information.
In lossless data compression, the original information is fully recoverable from the compressed data. Full recovery is important for information such as pictures from medical radiology or satellites where it is difficult, if not impossible, to retake the pictures, but where the original resolution of the pictures is needed for future analysis and processing.
Rissanen and Langdon, "Compression of Black-White Images with Arithmetic Coding", IEEE Transactions on Communications, Vol. COM-29, No. 6, June 1981, disclose a method of losslessly compressing black-white pictures (i.e. information consisting of bilevel signals). The method separates a compression system into a modeling unit and a coding unit. In the modeling unit, a set of statistical characteristics of the information is selected as events for which relative probabilities are gathered. The information is encoded based upon the relative probabilities.
To improve compression, Rissanen and Langdon further disclose conditioning the probabilities using contexts of prior events.
The idea of conditioning can be illustrated by considering a card game in which cards are drawn from the deck without replacement. As the cards are drawn, the odds change according to the probabilities of the events conditioned on a state defined by the previously drawn cards.
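The card-game illustration can be made concrete with a minimal sketch. The function name `prob_next_is_ace` and the choice of "next card is an ace" as the event are assumptions made only for this example; the state is simply the list of ranks already drawn from a standard 52-card deck.

```python
from fractions import Fraction

def prob_next_is_ace(drawn):
    """Probability that the next card is an ace, given the ranks already drawn.

    `drawn` is the conditioning state: the ranks of the cards removed so far
    from a standard 52-card deck (drawing without replacement).
    """
    aces_left = 4 - sum(1 for rank in drawn if rank == "A")
    cards_left = 52 - len(drawn)
    return Fraction(aces_left, cards_left)

print(prob_next_is_ace([]))               # 1/13
print(prob_next_is_ace(["A", "K"]))       # 3/50
print(prob_next_is_ace(["2", "7", "9"]))  # 4/49
```

The same event receives a different probability in each state, which is what conditioning on prior events exploits.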
Conditioning can also be illustrated by considering the modeling unit as a finite state machine which is described by the following equations:

$$x(t+1) = F(x(t), s(t+1))$$
$$z(t) = G(x(t))$$

where

s(t) denotes the t-th symbol in the string s = s(1) s(2) . . . to be modeled and encoded;
x(t) denotes an intermediate value, or the internal state of the finite state machine; and
z(t) is one of the K contexts, or conditioning classes, each of which is defined by the states of a set of prior events.
The probability distribution P(y/Z) over a symbol y at each class Z gives the conditional probability that symbol s(t+1) equals y, given that the associated class z(t) is Z.
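The finite state machine view can be sketched as follows. This is an illustration under stated assumptions, not the modeling unit of the cited paper: the state x(t) is assumed to be the tuple of the last `N_PRIOR` symbols, G is assumed to be the identity, and P(y/Z) is estimated from simple occurrence counts. The names `N_PRIOR`, `gather_counts`, and `P` are hypothetical.

```python
from collections import Counter, defaultdict

N_PRIOR = 2  # assumed number of prior symbols kept in the state

def F(x, s_next):
    """State update x(t+1) = F(x(t), s(t+1)): shift the new symbol into the window."""
    return (x + (s_next,))[-N_PRIOR:]

def G(x):
    """Context z(t) = G(x(t)); in this sketch the context is the state itself."""
    return x

def gather_counts(symbols, initial_state=(0, 0)):
    """Count how often each symbol y follows each context Z."""
    counts = defaultdict(Counter)
    x = initial_state
    for s_next in symbols:
        counts[G(x)][s_next] += 1  # s(t+1) observed under context z(t)
        x = F(x, s_next)
    return counts

def P(y, Z, counts):
    """Relative-frequency estimate of P(y/Z); 0.0 if context Z was never seen."""
    total = sum(counts[Z].values())
    return counts[Z][y] / total if total else 0.0

counts = gather_counts([0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1])
print(P(1, (0, 1), counts))  # 1.0: in this string a 1 always follows the context (0, 1)
```

In a real coder the estimate would be handed to the coding unit; here it is only printed.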
In the aforementioned method of compressing black and white pictures, the number K of contexts z(t) is equal to $2^n$, where n is the number of prior events selected to define a context. Because each signal has 2 values, each context has 2 coding probabilities. In general, if each symbol s(t) has L values and if n prior symbols are selected to define each context z(t), the result is $L^n$ contexts, each having a distribution of L probabilities.
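The growth of the model with L and n can be checked with a short calculation. The function name `model_size` and the sample values of L and n are chosen only for illustration.

```python
def model_size(num_levels, num_prior):
    """Return (number of contexts, total number of probabilities).

    With L = num_levels values per symbol and n = num_prior prior symbols
    per context, there are L**n contexts, each with L probabilities.
    """
    contexts = num_levels ** num_prior
    return contexts, contexts * num_levels

print(model_size(2, 10))   # (1024, 2048)
print(model_size(256, 2))  # (65536, 16777216)
```

Even two conditioning symbols at 256 levels already require millions of probabilities, whereas ten bilevel symbols require only a few thousand.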
Todd et al., "Parameter Reduction and Context Selection for Compression of Gray-scale Images", IBM J. Res. Develop., Vol. 29, No. 2, March 1985, disclose a method for losslessly compressing multilevel signals. The event selected therein is prediction error, defined as the difference between a predicted value of a current pel (picture element) and the actual value. Compression is improved by using independent probability distributions, each of which is conditioned on a context formed from a subset of neighboring pels. The improvement is attained because most images are formed by contrasts between "smooth regions" and "edges". Since the probability of correctly predicting a pel value in a "smooth region" differs from that at an "edge", compression improves when the probabilities are conditioned on neighboring pels.
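A minimal sketch of prediction errors follows. The predictor used here (each pel is predicted to equal its left neighbor) and the sign convention (actual minus predicted) are assumptions for illustration; the text above does not specify either. The name `prediction_errors` and the sample row are hypothetical.

```python
def prediction_errors(row, first_prediction=0):
    """Return actual - predicted for each pel in a row.

    Assumed predictor: the predicted value of a pel is the value of the pel
    to its left; `first_prediction` is used for the first pel in the row.
    """
    errors = []
    predicted = first_prediction
    for pel in row:
        errors.append(pel - predicted)
        predicted = pel
    return errors

row = [100, 101, 101, 102, 200, 201, 200]  # smooth region, an edge, smooth region
print(prediction_errors(row, first_prediction=100))
# [0, 1, 0, 1, 98, 1, -1]
```

The errors are small inside each smooth run and large only at the edge, which is why a context that distinguishes the two situations can sharpen the probabilities.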
For a gray scale image having L-level signals, the values of prediction errors lie between -L and +L. If the aforementioned method for compressing black and white pictures is applied directly, the number of probabilities that must be stored and/or transmitted to the decoding unit becomes impractically large. Therefore, Todd et al. further disclose combining prediction errors into ranges, or error buckets, and conditioning such probability distributions by contexts of selected sets of neighboring pel error buckets.
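The effect of bucketing on the number of contexts can be sketched as below. The sketch assumes pel values 0 to L-1 (so errors from -(L-1) to L-1), equal-width buckets, and two neighboring pels per context; the bucket width of 73 and the names `bucket_of` and `num_buckets` are hypothetical choices for this example.

```python
def bucket_of(error, bucket_width, max_abs_error):
    """Map a signed prediction error to the index of an equal-width range."""
    return (error + max_abs_error) // bucket_width

def num_buckets(bucket_width, max_abs_error):
    """Number of buckets needed to cover -max_abs_error .. +max_abs_error."""
    return bucket_of(max_abs_error, bucket_width, max_abs_error) + 1

MAX_ABS_ERROR = 255   # assumed: 256-level pels with values 0..255
NEIGHBORS = 2         # assumed number of neighboring pels per context

raw_values = 2 * MAX_ABS_ERROR + 1        # 511 distinct error values
buckets = num_buckets(73, MAX_ABS_ERROR)  # 7 buckets of width 73

print(raw_values ** NEIGHBORS)  # 261121 contexts without bucketing
print(buckets ** NEIGHBORS)     # 49 contexts with bucketing
```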
It should be noted from experience that the frequencies of prediction errors roughly approximate a distribution as shown in FIG. 5. Because a greater portion of most pictures usually falls under the "smooth" region, it is expected that the difference in intensity between a pel and its neighbors, hence the prediction error, is usually small.
In general, the efficiency of a compression system depends on the accuracy of the relative probabilities which describe the events. Partitioning prediction errors into buckets would necessarily compromise the accuracy of the probability distribution and would, in turn, cause a degradation of the compression efficiency. To minimize the degradation, only a small number of prediction errors should be put into a bucket when the frequencies of such prediction errors are high. On the other hand, for less frequent prediction errors, the effect on compression efficiency would remain small even if more such errors are put into a bucket. By partitioning all prediction errors into predetermined buckets of equal size, Todd et al. therefore fail to minimize the degradation due to partitioning. In the case where the bucket sizes are optimized, the method according to Todd et al. suffers a loss in speed because it needs a first pass to determine the bucket ranges before a second pass can be performed for encoding the data.
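The argument that frequent errors should share a bucket with few others can be illustrated numerically. The sketch assumes, purely for illustration, that a bucketed model treats all errors within a bucket as equally likely, and it measures the resulting loss as the expected number of extra bits per symbol (the Kullback-Leibler divergence between the true and the bucketed distribution). The probabilities and the name `extra_bits_per_symbol` are hypothetical.

```python
import math

def extra_bits_per_symbol(probs, buckets):
    """Expected extra code length when each bucket's total probability
    is spread evenly over the error values it contains."""
    loss = 0.0
    for members in buckets:
        mass = sum(probs[i] for i in members)
        approx = mass / len(members)  # uniform share within the bucket
        for i in members:
            if probs[i] > 0:
                loss += probs[i] * math.log2(probs[i] / approx)
    return loss

# Hypothetical error frequencies, most frequent first.
probs = [0.60, 0.30, 0.06, 0.03, 0.01]

merge_frequent = [[0, 1], [2], [3], [4]]  # the two most frequent errors share a bucket
merge_rare = [[0], [1], [2, 3], [4]]      # two infrequent errors share a bucket

print(extra_bits_per_symbol(probs, merge_frequent))
print(extra_bits_per_symbol(probs, merge_rare))
```

Both merged pairs have the same 2:1 ratio of probabilities, so under these assumptions the loss scales with the total probability of the bucket: merging the frequent pair costs about ten times as much as merging the rare pair.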
According to the above-mentioned compression method by Todd et al., encoding a signal requires a determination of the bucket to which a prediction error belongs. Such determination may be performed, for example, by locating, through a table look-up process on the error value, a bucket number. The bucket number is then used in conjunction with the appropriate context to locate an appropriate codeword or coding parameter based on the probability distribution. For images having, say, 12-bit pels, the table look-up process involves a table of 8K words (twelve bits for the magnitude of a prediction error together with one sign bit). In some data compression applications (e.g. satellite pictures), there is a transmission time and/or response time requirement the fulfillment of which would be impeded by such a table look-up process. Thus, there is a need for a method of compressing multilevel data wherein the steps required to code a signal can be minimized without significant compression loss.
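The size of the look-up table follows from the index width: 12 magnitude bits plus one sign bit give 2^13 = 8192 entries. The sketch below assumes a sign-magnitude index layout and equal-width buckets; the layout, the bucket width of 1171, and the names `build_bucket_table` and `lookup` are assumptions for illustration, not details taken from Todd et al.

```python
MAGNITUDE_BITS = 12
MAGNITUDE_MASK = (1 << MAGNITUDE_BITS) - 1  # 4095
SIGN_BIT = 1 << MAGNITUDE_BITS              # assumed position of the sign bit
TABLE_SIZE = 1 << (MAGNITUDE_BITS + 1)      # 8192 entries ("8K words")

def build_bucket_table(bucket_width):
    """Precompute a bucket number for every sign-magnitude error index."""
    table = [0] * TABLE_SIZE
    for index in range(TABLE_SIZE):
        magnitude = index & MAGNITUDE_MASK
        error = -magnitude if index & SIGN_BIT else magnitude
        table[index] = (error + MAGNITUDE_MASK) // bucket_width
    return table

def lookup(table, error):
    """Find the bucket number of a signed error with one table access."""
    index = (abs(error) & MAGNITUDE_MASK) | (SIGN_BIT if error < 0 else 0)
    return table[index]

table = build_bucket_table(bucket_width=1171)
print(len(table))  # 8192
print(lookup(table, -4095), lookup(table, 0), lookup(table, 4095))  # 0 3 6
```

Every coded pel incurs one such memory access before the context and codeword can be selected, which is the per-symbol cost the passage identifies.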