1. Field of the Invention
The present invention relates to an arithmetic coding technique, and more particularly, to an arithmetic coding technique with low computational complexity and a low cost hardware implementation.
2. Description of the Related Art
The compression of data into fewer bits is often utilized in order to achieve a desired rate of data transfer or to store data in a limited memory space. At some later time, the original data is retrieved by decompressing the compressed data.
Arithmetic coding is one technique for achieving data compression and decompression. In arithmetic coding, one decision after another is encoded to define successively smaller, lesser-included intervals along a number line. Arithmetic coding provides that each decision has a plurality of possible exclusive outcomes or events. Each outcome or event is represented by a symbol.
In accordance with prior art arithmetic coding teachings, a probability line has a current interval defined therealong. The first current interval is 0 to 1. The current interval is divided into segments in which each segment corresponds to one possible outcome for the next decision. Where there are only two possible outcomes for each decision, the current interval is divided into two segments. The length of each segment is based on its respective associated probability. The respective probabilities may remain fixed or may adapt as decision data is entered.
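The subdivision described above can be sketched in a few lines of Python. The function name `subdivide` and the example probabilities are illustrative assumptions rather than part of any cited technique; exact fractions are used so that the segment boundaries carry no rounding error.

```python
from fractions import Fraction

def subdivide(low, high, probs):
    """Split [low, high) into one segment per outcome, sized by probability."""
    width = high - low
    segments = []
    start = low
    for p in probs:
        end = start + width * p
        segments.append((start, end))
        start = end
    return segments

# Two possible outcomes with assumed fixed probabilities 3/4 and 1/4.
probs = [Fraction(3, 4), Fraction(1, 4)]

# The first current interval is 0 to 1.
first = subdivide(Fraction(0), Fraction(1), probs)

# Suppose the second outcome occurs: its segment becomes the new current
# interval and is subdivided again for the next decision.
second = subdivide(first[1][0], first[1][1], probs)
```

Each decision narrows the current interval to the segment of the outcome that occurred, so the interval after several decisions identifies the whole sequence.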
The power of arithmetic codes resides in the compression they achieve and the flexibility of arithmetic codes resides in their ability to encode strings modeled by stationary or non-stationary sources with equal ease. Arithmetic codes permit encoding of binary strings without the need for alphabetic extension, and also encode strings with symbols drawn from alphabets containing more than two characters.
An arithmetic code updates the probability P(s) of a so-far processed source string s, by the multiplication P(s)*P(i/s) where P(i/s) is a conditional probability of the symbol i given s. The multiplication required for augmenting the probability is relatively expensive and slow, even in the case where the probabilities are represented by binary numbers having at most a fixed number of significant digits.
U.S. Pat. No. 4,467,317 implements a recursive technique which simplifies the encoding operation of a binary symbol stream by approximating the probability of a less probable symbol (LPS) with an integral power of one-half. However, this technique is not easily generalized to non-binary alphabets because n-ary alphabet symbol probabilities cannot be accurately approximated as powers of one-half. This limitation is significant in view of the growing parallelism in data structures and operations in present-day processors. Further, the recursive technique depends upon the existence of a sophisticated modeling unit capable of calculating and providing source statistical characteristics. Implicit in the calculation of source statistical characteristics is a default of a skew number calculation operation to the modeling unit, an operation which adds to the complexity of the overall data compression problem.
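To illustrate why a power-of-one-half approximation removes the multiplication, the sketch below compares an exact product with a right shift applied to an integer range value. The function names, the 16-bit range value, and the example probability are assumptions chosen for illustration; this is not the procedure claimed in U.S. Pat. No. 4,467,317.

```python
import math

def half_power_exponent(p_lps):
    """Return the integer k >= 1 for which 2**-k is nearest to p_lps on a log scale."""
    return max(1, round(-math.log2(p_lps)))

def lps_range_exact(r, p_lps):
    """Sub-range for the LPS computed with a multiplication (truncated to an integer)."""
    return int(r * p_lps)

def lps_range_shift(r, k):
    """Sub-range for the LPS computed with a shift, i.e. r * 2**-k."""
    return r >> k

r = 0x8000      # assumed 16-bit range value
p_lps = 0.2     # assumed probability of the less probable symbol

k = half_power_exponent(p_lps)
exact = lps_range_exact(r, p_lps)
approx = lps_range_shift(r, k)
```

The gap between `exact` and `approx` is the price paid for replacing the multiplication with a shift.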
U.S. Pat. No. 4,652,856 discloses an arithmetic coding technique which is capable of coding a multi-character alphabet, without multiplication or division, in order to generate a binary code stream in response to simplified data source statistical characteristics that preserve the very desirable property of concurrent updating of both the code stream and an internal coding variable. The 4,652,856 Patent is based upon an encoding algorithm that accepts symbol occurrence counts in binary form, subjects those counts to simple magnitude shift operations to approximate the contextual probability of each symbol of the symbol stream, augments the coded stream, and then simultaneously locates the w least significant bits of the code stream and adjusts the value of an internal coding variable determinant of the next coding sub-interval in response to one of the shifted occurrence counts.
Arithmetic coding is a statistical compression method and thus must work with a model that estimates the probability of each possible symbol in the modeled source. The model can be fixed, semi-fixed or adaptive. A fixed model has a probability distribution $p_1, p_2, \ldots, p_n$, where 1, 2, ..., n are the symbols in the modeled source.
Arithmetic coding is based upon an interval transformation, which transforms one interval $[l_k, h_k)$ into another interval $[l_{k+1}, h_{k+1})$ when a symbol $i$ is encoded, where ##EQU1## $r_{k+1} = h_{k+1} - l_{k+1}$ (3)
$r_k$ denotes the range of the interval $[l_k, h_k)$. Initially, the encoder starts with the unit interval $[0,1)$. This implies $l_0 = 0$, $h_0 = 1$, $r_0 = 1$.
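The interval transformation can be sketched with exact rational arithmetic. Because equations (1) and (2) survive here only as a placeholder, the sketch assumes the conventional cumulative form, in which the new lower bound is $l_k + r_k \sum_{j<i} p_j$ and the new upper bound is $l_k + r_k \sum_{j \le i} p_j$; this assumption agrees with the binary special case in equations (4) and (5). The function names and the three-symbol model are hypothetical.

```python
from fractions import Fraction

def transform(low, high, probs, i):
    """Map [low, high) to the sub-interval of symbol i (1-based), assumed cumulative form."""
    r = high - low
    cum_before = sum(probs[: i - 1], Fraction(0))   # p_1 + ... + p_(i-1)
    new_low = low + r * cum_before
    new_high = new_low + r * probs[i - 1]
    return new_low, new_high

def encode(symbols, probs):
    """Apply the transformation once per symbol, starting from the unit interval."""
    low, high = Fraction(0), Fraction(1)            # l_0 = 0, h_0 = 1, r_0 = 1
    for i in symbols:
        low, high = transform(low, high, probs, i)
    return low, high

# Assumed fixed model over the symbols 1, 2, 3.
probs = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
low, high = encode([1, 3, 2], probs)
```

Under this assumed form the final range equals the product of the probabilities of the encoded symbols, here 1/2 × 1/4 × 1/4 = 1/32.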
Equations (1)-(3) cannot be implemented directly on a finite-precision computer, since they operate on real numbers whose required precision grows with every encoded symbol. In practice, however, arithmetic coding is best accomplished using standard 16-bit or 32-bit integer operations; floating-point math is neither required nor helpful. What is required is an incremental transmission scheme.
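The precision limit can be observed directly. The sketch below repeatedly narrows an interval for a binary source with both probabilities assumed equal to one-half, always encoding the upper symbol, and reports the first step at which adding the range to the lower bound no longer changes it. The function name and the stopping test are illustrative assumptions.

```python
from fractions import Fraction

def steps_until_collapse(number, limit=200):
    """Return the first step where low + r == low, or None if that never happens.

    `number` is the numeric type to compute with (for example float or Fraction).
    """
    half = number(1) / number(2)
    low, r = number(0), number(1)
    for step in range(1, limit + 1):
        low = low + r * half    # move to the upper half of the interval
        r = r * half            # the range halves with every symbol
        if low + r == low:      # the interval can no longer be distinguished
            return step
    return None

float_steps = steps_until_collapse(float)      # finite-precision arithmetic
exact_steps = steps_until_collapse(Fraction)   # unbounded-precision arithmetic
```

With floating-point numbers the interval degenerates after a bounded number of symbols, while exact fractions never collapse but grow without limit in size; an incremental scheme avoids both problems by emitting leading bits as soon as they are settled.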
As a simple example, consider the case n=2, i.e., a binary source employing a zero-order Markov model to estimate the probabilities $p_0$ and $p_1$. In this example, equations (1), (2) and (3) can be simplified to
$r_{k+1} = r_k p_0$, for $i=0$ (4)
$l_{k+1} = l_k + r_k p_0$, $r_{k+1} = r_k p_1$, for $i=1$ (5)
where $i$ is the symbol being encoded.
For a practical implementation of equations (4) and (5), $p_0$ and $p_1$ are replaced with frequency counts $c_0$ and $c_1$, which may be adaptively adjusted during the encoding process:
$p_0 = c_0/(c_0+c_1)$, $p_1 = c_1/(c_0+c_1)$ (6)
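A minimal sketch of such a count-based model follows. The class name, the initial counts of one, and the increment-by-one update rule are assumptions made for illustration; the text states only that the counts may be adjusted adaptively.

```python
from fractions import Fraction

class BinaryCountModel:
    """Hypothetical adaptive model holding the frequency counts c0 and c1."""

    def __init__(self, c0=1, c1=1):
        # Starting both counts at 1 (an assumption) avoids a zero probability.
        self.counts = [c0, c1]

    def prob(self, i):
        """Equation (6): p_i = c_i / (c_0 + c_1)."""
        return Fraction(self.counts[i], self.counts[0] + self.counts[1])

    def update(self, i):
        """Adapt the model after symbol i has been encoded."""
        self.counts[i] += 1

model = BinaryCountModel()
for bit in [0, 0, 1, 0]:
    model.update(bit)

p0 = model.prob(0)
p1 = model.prob(1)
```

A decoder that applies the same update after each decoded symbol keeps its counts identical to the encoder's, so no probabilities need to be transmitted.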
In conventional designs for arithmetic coding, the encoder uses two 16-bit registers, L and R, to store the fractional values of $l_k$ and $r_k$, respectively. Combining (4), (5) and (6) yields
$L := L$, $R := R c_0/(c_0+c_1)$, for $i=0$ (7)
$L := L + R c_0/(c_0+c_1)$, $R := R c_1/(c_0+c_1)$, for $i=1$ (8)
where m-bit integer operations are involved and m-bit integer rounding is used (m being a typical hardware register size, such as 8, 16, or 32). Most algorithms now available are based upon equations (7) and (8) or their variations. Note that equations (7) and (8) include integer multiplications and divisions and are therefore not suitable for hardware designs, such as digital signal processor (DSP) and smart card implementations, where no division instruction is available.
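One update step of equations (7) and (8) can be sketched with integer arithmetic as follows. The function name, the use of floor division as the rounding rule, and the initial register value 0xFFFF standing in for a range of one are assumptions; renormalization and carry handling, which a complete coder needs, are deliberately omitted.

```python
def encode_step(L, R, c0, c1, i):
    """Apply equation (7) for i == 0 or equation (8) for i == 1 using floor rounding."""
    total = c0 + c1
    if i == 0:
        # Equation (7): L is unchanged, R shrinks to the share of symbol 0.
        return L, (R * c0) // total
    # Equation (8): L advances past the share of symbol 0, R shrinks to the share of symbol 1.
    return L + (R * c0) // total, (R * c1) // total

L, R = 0, 0xFFFF      # assumed initial 16-bit register contents
c0, c1 = 3, 1         # assumed frequency counts

after_zero = encode_step(L, R, c0, c1, 0)
after_one = encode_step(L, R, c0, c1, 1)
```

Every step performs at least one multiplication and one division by `c0 + c1`, which is the cost that makes this formulation awkward on hardware without a division instruction.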