Entropy coding is a lossless data compression technique that achieves a compact data representation by taking advantage of the statistical characteristics of the source. Today, the most widely practiced entropy coding technique is probably variable-length coding (VLC). VLC translates input data, often in a fixed-length format, into codes of variable lengths. The basic concept behind variable-length coding is to represent more frequent data by shorter codes and less frequent data by longer codes. Therefore, the average code length is expected to be shorter than that of the fixed-length representation.
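As an illustration of the variable-length coding principle described above, the following is a minimal sketch that builds a Huffman code, one common way of constructing a VLC, and compares its average code length with a fixed-length representation. The function names `huffman_code_lengths` and `average_length` and the sample data are assumptions chosen for illustration only; they are not taken from any standard.

```python
import heapq
from collections import Counter


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a Huffman code built from counts."""
    if len(freqs) == 1:
        # A single-symbol source still needs one bit per symbol.
        return {next(iter(freqs)): 1}
    # Heap entries are (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


def average_length(freqs, lengths):
    """Average code length in bits per symbol, weighted by symbol frequency."""
    total = sum(freqs.values())
    return sum(freqs[s] * lengths[s] for s in freqs) / total


# Hypothetical sample: four symbols, so a fixed-length code needs 2 bits each.
data = "aaaaaaaabbbbccdd"
freqs = Counter(data)
lengths = huffman_code_lengths(freqs)
avg = average_length(freqs, lengths)
print(lengths, avg)
```

For this sample the most frequent symbol receives a 1-bit code and the two rarest symbols receive 3-bit codes, giving an average of 1.75 bits per symbol against 2 bits for the fixed-length format.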
VLC is widely used in various data compression applications and has been adopted in many international standards, such as those for facsimile, still-picture, and video coding. The VLCs used by these standards mostly utilize only zero-order or low-order statistical characteristics of the source.
A well-known theorem in information theory implies that the coding efficiency can be improved by utilizing high-order statistics. However, high-order conditional entropy coding has not been practical in the past, owing to its high complexity and the lack of high-speed hardware to extract the conditioning state efficiently. High-order conditional entropy coding can also be viewed as an entropy coding system with multiple code tables, adaptively selected according to the state.
The high-order entropy of a source can be effectively exploited by using either the joint probability of L symbols or the conditional probability. However, both techniques lead to very high complexity. Consequently, a simple but effective technique called run-length coding has been widely used to exploit limited high-order redundancy. Run-length coding converts a string of identical symbols that occur consecutively into a new symbol that indicates the string length, which can then be coded for further data compression. The combination of run-length coding and variable-length coding has been shown to achieve high performance for image applications. The entropy coding techniques adopted in most standards are variations of combined run-length coding and variable-length coding.
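The run-length conversion described above can be sketched as follows. This is a minimal illustration, assuming each run is represented as a (symbol, length) pair; the pair format and the function names `run_length_encode` and `run_length_decode` are hypothetical and do not reflect the exact symbol format of any particular standard.

```python
from itertools import groupby


def run_length_encode(symbols):
    """Collapse each run of identical consecutive symbols into (symbol, run length)."""
    return [(s, len(list(group))) for s, group in groupby(symbols)]


def run_length_decode(runs):
    """Expand (symbol, run length) pairs back into the original symbol sequence."""
    return [s for s, n in runs for _ in range(n)]


# Hypothetical scan line of a black/white image.
line = list("0000011000")
runs = run_length_encode(line)
print(runs)
```

In a combined scheme, the resulting run symbols would then be passed to a variable-length coder, so that frequent run lengths receive short codes.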
The benefit of using high-order statistics has been clearly demonstrated in Langdon and Rissanen's work on black/white document compression. In this work, the conditioning state was defined as the combination of 10 pixels in the causal region and a total of 1024 states were generated from all possible combinations of the 10 pixels. The associated coding technique was arithmetic coding which allows the use of fractional bits to code a sample. The drawback of this high-order conditional probability approach is the requirement of a large memory to store all the measured conditional probabilities. When this approach is extended to gray-level images or cases that utilize even higher-order statistics, the memory requirement becomes formidable.
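To make the state count and the memory growth concrete, the following sketch packs ten binary pixels into a single conditioning-state index and counts the states for binary and gray-level cases. The ordering of the ten pixels and the function names `context_state` and `num_states` are assumptions for illustration; the actual pixel template used in the cited work is not reproduced here.

```python
def context_state(pixels):
    """Pack ten binary pixels (0 or 1) into one state index in the range 0..1023."""
    if len(pixels) != 10:
        raise ValueError("expected exactly 10 conditioning pixels")
    state = 0
    for p in pixels:
        # Shift in one pixel per bit; the pixel order is an arbitrary assumption.
        state = (state << 1) | (p & 1)
    return state


def num_states(levels, context_size):
    """Number of conditioning states for `context_size` pixels of `levels` values each."""
    return levels ** context_size


binary_states = num_states(2, 10)    # black/white pixels
gray_states = num_states(256, 10)    # 8-bit gray-level pixels, same context size
print(binary_states, gray_states)
```

With binary pixels the table of conditional probabilities has 1024 entries, whereas the same ten-pixel context on 8-bit gray-level data would need 2^80 states, which illustrates why the memory requirement becomes formidable.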
The statistical nature of sources with independent samples can be fully exploited on a sample-by-sample basis. However, for correlated sources, high-order statistics have to be utilized to exploit the redundancy among the data. The minimum bit rate for representing a source is governed by the entropy of the source. For stationary sources, the first order entropy, H(X), is defined as: $H(X) = -\sum_{x} p(x) \log_2 p(x)$, where p(x) is the probability of the source symbol x. The entropy H(X), in bits per symbol, establishes a lower bound on the average bit rate for the source X. This lower limit is achieved if each symbol can be coded by the ideal word length, $-\log_2 p(x)$. This is usually not the case for VLC since $-\log_2 p(x)$ may not be an integer. On the other hand, arithmetic coding allows one, in effect, to represent a symbol in fractional bits and results in a bit rate very close to the entropy. In practice, either technique can be used, whichever is more appropriate.
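The gap between sample-by-sample statistics and high-order statistics for a correlated source can be illustrated numerically. The sketch below estimates the first order entropy from symbol frequencies and, as an assumed stand-in for high-order statistics, the entropy of each symbol conditioned on its immediate predecessor. The function names `entropy` and `conditional_entropy` and the test sequence are hypothetical.

```python
import math
from collections import Counter


def entropy(symbols):
    """Estimate the first order entropy in bits per symbol from relative frequencies."""
    counts = Counter(symbols)
    n = len(symbols)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def conditional_entropy(symbols):
    """Estimate the entropy of a symbol given the previous symbol, in bits per symbol."""
    pairs = Counter(zip(symbols, symbols[1:]))   # counts of (previous, current)
    prev = Counter(symbols[:-1])                 # counts of the conditioning symbol
    n = len(symbols) - 1
    # Sum over pairs of p(prev, cur) * log2 p(cur | prev), negated.
    return -sum(c / n * math.log2(c / prev[a]) for (a, _), c in pairs.items())


# Hypothetical strongly correlated source: symbols strictly alternate.
seq = "ab" * 8
h1 = entropy(seq)
h_cond = conditional_entropy(seq)
print(h1, h_cond)
```

For the alternating sequence, the first order estimate is 1 bit per symbol because both symbols are equally frequent, while the conditional estimate is 0 because each symbol is fully determined by its predecessor. A coder that uses only zero-order statistics cannot capture this redundancy.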
U.S. Pat. No. 5,173,695 to Sun et al. discloses a high-speed, parallel variable-length decoder wherein VLC decoding is achieved in one clock cycle regardless of the code length.
Recently, a technique called incremental tree extension was developed which can substantially reduce the complexity of high-order conditional entropy coding. This technique has been modified for 2-D signals. Nevertheless, the storage requirement is still too high for high-speed implementation.