One technique for compressing and de-compressing data is known as arithmetic coding. Arithmetic coding provides that, during encoding, the results of a set of events are made to correspond to a respective point on a number line and, during decoding, the results from the set of events can be re-determined from knowledge of the respective point. Specifically, during encoding the occurrence of a first answer for an event (or decision) is related to a corresponding first interval along the number line. The occurrence of a second (subsequent) event is associated with a subinterval within the first interval. With additional events, successive subintervals are determined. The final subinterval is represented by a selected point, which is defined as a compressed data stream. For a given set of answers for a respective set of decisions, only one subinterval with one corresponding compressed data stream is defined. Moreover, only one set of answers can give rise to a given compressed data stream.
Hence, given the compressed data stream, the original set of answers can be determined during decoding.
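By way of illustration only, the interval subdivision described above may be sketched in Python as follows. The sketch is not taken from any cited reference. It assumes a binary decision with a fixed probability p0 for the answer 0, uses exact rational arithmetic so that precision questions do not arise, and selects the midpoint of the final subinterval as the code point. The function names encode and decode and the parameter p0 are hypothetical.

```python
from fractions import Fraction

def encode(bits, p0):
    """Narrow the interval [low, low + width) once per binary decision."""
    low, width = Fraction(0), Fraction(1)
    for b in bits:
        split = width * p0            # portion of the interval assigned to answer 0
        if b == 0:
            width = split             # keep the lower subinterval
        else:
            low += split              # keep the upper subinterval
            width -= split
    return low + width / 2            # assumed choice: midpoint of the final subinterval

def decode(point, n, p0):
    """Re-determine n answers by retracing the same subdivisions."""
    low, width = Fraction(0), Fraction(1)
    answers = []
    for _ in range(n):
        split = width * p0
        if point < low + split:       # point lies in the subinterval for answer 0
            answers.append(0)
            width = split
        else:                         # point lies in the subinterval for answer 1
            answers.append(1)
            low += split
            width -= split
    return answers

bits = [0, 1, 1, 0, 0, 0, 1]
p0 = Fraction(3, 4)
point = encode(bits, p0)
recovered = decode(point, len(bits), p0)
```

In this sketch, each distinct sequence of answers of a given length maps to a distinct subinterval, so the decoder recovers exactly the sequence that was encoded, provided it is told how many decisions to decode.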
When the number of possible answers which can occur at any given event is two, the arithmetic coding is binary. One example of a binary application is embodied in the processing of white/black data of a picture element (pel) of a facsimile processing system. In this facsimile application, there are two complementary probabilities referred to herein as "P" and "Q". Given that a pel can be either black or white, one probability corresponds to a pel being black and the other corresponds to the pel being white. Such an environment is discussed in an article by Langdon, "An Introduction to Arithmetic Coding", IBM J. Res. Develop. Vol 28, No 2, pages 135-149 (March, 1984).
As noted in the article, one of the two possible answers in binary coding may be more likely than the other at a given time. Moreover, from time to time, which of the two answers is more likely may switch. In a black background, for example, the probability of the "black" answer should be significantly greater than that of the "white" answer, whereas in a white background the "white" answer should be more likely. In the Langdon article, an approach is proposed which may be applied to a binary arithmetic code or, more generally, to a multisymbol alphabet environment. In each case, the approach involves processing numerical data in the real number system. Specifically, the encoding process is described as defining a variable C (representing a code point along the number line) and an expression C+A (representing an available space starting at the current code point C and extending a length A along the number line). The interval A is sometimes referred to as the range. In the binary context, Langdon represents the variable A in floating point notation.
Other references of a related nature include two patents, U.S. Pat. Nos. 4,467,317 and 4,286,256. An article by Rissanen and Langdon, "Arithmetic Coding", IBM J. Res. Develop. Vol 23, No 2, pages 149-162 (March 1979), also discusses arithmetic coding. In the Langdon and Rissanen-Langdon articles and in the above-referenced patents, which are incorporated herein by reference, additional patents and publications are discussed which may serve as further background to the present invention.
In reviewing the various cited references, it is observed that logarithms are used to represent the measure of entropy H(S) of a symbol and to represent a measure of the width of the code space. The use of logarithms in converting computationally expensive multiplies and divides into additions and subtractions is well known. A review of prior technology also indicates that no references disclose an encoder/decoder in an arithmetic coding system wherein range is re-computed for successive subintervals with finite precision in the logarithmic domain rather than in the real number domain.
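As an illustrative sketch only, the following Python fragment contrasts re-computing the range by multiplication in the real number domain with re-computing it by addition in the logarithmic domain. It assumes a fixed probability p0 for the answer 0 and uses ordinary floating point logarithms, so it does not model the finite precision discussed herein. The names range_real and range_log are hypothetical.

```python
import math

def range_real(bits, p0):
    """Range A after each decision, re-computed by multiplication."""
    a = 1.0
    for b in bits:
        a *= p0 if b == 0 else 1.0 - p0
    return a

def range_log(bits, p0):
    """log2 of the range, re-computed by addition only."""
    log_p0 = math.log2(p0)            # computed once per probability
    log_p1 = math.log2(1.0 - p0)
    log_a = 0.0                       # log2(1.0): the initial range is the unit interval
    for b in bits:
        log_a += log_p0 if b == 0 else log_p1
    return log_a

bits = [0, 0, 1, 0, 1, 0, 0, 0]
a_real = range_real(bits, 0.75)
a_from_log = 2.0 ** range_log(bits, 0.75)   # antilog recovers the same range
```

Under these assumptions the two results agree to within floating point rounding; the multiplications inside the loop are replaced by additions at the cost of one antilog when the range itself is needed.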
The use of logarithms for re-computing range is not straightforward because of the precision requirements of arithmetic coding. Specifically, when transferring from the real number domain to a logarithm domain in which there is finite precision, the resulting logarithm includes some truncation due to the finite precision requirement. A problem attending the truncation involves the antilogs that are to be subsequently taken. Due to the truncation, it is possible for the sum of the antilogs of log P and log Q to exceed one, resulting in an ambiguity which cannot be tolerated in arithmetic coding. Stated otherwise, the truncation operation must be performed in a manner that assures decodability. Until the present invention, this feature has not been achieved with a log coder.
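The truncation problem may be illustrated, purely as a sketch, by quantizing the magnitudes of log P and log Q to a fixed number of fractional bits and summing the resulting antilogs. The base-2 logarithm, the four fractional bits, the value P = 0.7, and the function name quantized_antilog_sum are assumptions chosen for the illustration. Rounding the magnitude upward is shown only as one possible way to keep the sum from exceeding one; it is not asserted to be the method of the present invention.

```python
import math

def quantized_antilog_sum(p, frac_bits, round_fn):
    """Sum of antilogs of log2(P) and log2(Q) after quantizing each magnitude."""
    scale = 1 << frac_bits                     # 2**frac_bits steps per unit of log2
    total = 0.0
    for prob in (p, 1.0 - p):
        # -log2(prob) is non-negative; round_fn picks the quantization direction
        magnitude = round_fn(-math.log2(prob) * scale) / scale
        total += 2.0 ** -magnitude             # antilog of the quantized logarithm
    return total

# Truncating the magnitude toward zero enlarges each antilog.
sum_truncated = quantized_antilog_sum(0.7, 4, math.floor)
# Rounding the magnitude upward shrinks each antilog instead.
sum_rounded_up = quantized_antilog_sum(0.7, 4, math.ceil)
```

With these assumed values, sum_truncated comes out slightly above one, meaning the two subintervals would overlap, whereas sum_rounded_up stays below one at the cost of leaving a small part of the code space unused.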