Arithmetic coding is a well-known technique for lossless coding, and an introduction can be found in any current source coding book. For a thorough understanding of the implementations of arithmetic coding that are most relevant for the current work, the reader is referred to [Lang84]. The history of arithmetic coding is nicely described in the appendix of that document. Further, [Howard94] gives an extensive explanation of arithmetic coding.
The implementation of arithmetic coding that is the subject of the present invention uses two finite-size registers, usually called C and A. The flow diagram of the encoder operation is shown in FIG. 1. The C register points to the bottom of an interval on the number line, the size of which is stored in A; see, e.g., [Lang81] and [Penn88]. The interval is split into sub-intervals, each sub-interval corresponding to a symbol to be encoded and the size of each sub-interval corresponding to the probability of the associated symbol. To encode a symbol, the C register is adjusted to point to the bottom of the sub-interval corresponding to the symbol and the A register is set to the size of the selected sub-interval. The A register (as well as C) is then normalized (left-shifted) before the next symbol is encoded. In general, after re-normalization, the value of A lies between the values k and 2k: k ≤ A < 2k. In the present example, k=1/2 will be used.
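The re-normalization step described above can be illustrated with a minimal sketch. It assumes exact rational arithmetic in place of finite-size registers, so C simply grows instead of having its most significant bit shifted out; the function name `renormalize` and the constant `K` are hypothetical and are not taken from the figures.

```python
from fractions import Fraction

K = Fraction(1, 2)  # lower bound k from the text's example (assumed name)


def renormalize(c, a):
    """Double A and C (a left shift) until K <= A < 2K.

    Returns the new C, the new A and the number of shifts performed.
    Simplification: C is an unbounded rational here, not a finite register.
    """
    if a <= 0 or a >= 2 * K:
        raise ValueError("A must satisfy 0 < A < 2k before re-normalization")
    shifts = 0
    while a < K:
        a *= 2
        c *= 2
        shifts += 1
    return c, a, shifts
```

For instance, an interval of size 1/8 needs two left shifts before A again lies between 1/2 and 1.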
For example, in the binary case, there are two sub-intervals and thus two possible updates of the C and A registers, depending on whether the bit to be encoded is the most probable symbol (MPS) or the least probable symbol (LPS). It is assumed that the MPS is assigned to the lower interval. The "Update A and C" block of FIG. 1 is shown for the binary case in FIG. 2. The probability of the input bit being the LPS is denoted by p (notice that p ≤ 1/2, because the probability of the MPS is ≥ 1/2). The input bit to be encoded is denoted by b. The values of b and p are provided by the "Read . . . " block. Now, if an MPS is to be encoded, C does not change, since the lower interval is selected and C already points to this interval. However, A does change and its update is A=A-A·p (using the fact that the probability of the MPS equals 1-p). If an LPS is to be encoded, both C and A are changed: C is updated as C=C+A-A·p and the new interval size is A=A·p. It should further be noted that, by pre- and post-processing, it can be assured that the MPS is always, e.g., the "0" bit and the LPS is always the "1" bit. Finally, FIG. 2 shows an "approximate multiplication" block, because it turns out that the multiplication A·p can be performed with low accuracy, at only a small loss of performance, thus reducing the hardware complexity. Techniques to do the approximate multiplication are discussed below.
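The two binary updates described above can be written out as a short sketch. It assumes an exact multiplication A·p on rational numbers rather than the approximate multiplication of FIG. 2, and it omits re-normalization; the function name `update_binary` and its parameters are hypothetical.

```python
from fractions import Fraction


def update_binary(c, a, is_lps, p):
    """Apply the binary update of A and C for one input bit.

    c, a   -- current bottom and size of the interval
    is_lps -- True if the bit to encode is the LPS
    p      -- probability of the LPS, with 0 < p <= 1/2
    Assumption: A*p is computed exactly, not approximately.
    """
    ap = a * p
    if is_lps:
        c = c + a - ap  # LPS occupies the upper sub-interval
        a = ap
    else:
        a = a - ap      # MPS occupies the lower sub-interval; C is unchanged
    return c, a
```

With C=0, A=1 and p=1/4, an MPS leaves C at 0 and reduces A to 3/4, whereas an LPS moves C to 3/4 and sets A to 1/4.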
For the non-binary case, the "Update A and C" block of FIG. 1 is shown in FIG. 3. The "Read . . . " block now provides the symbol to be encoded, s, as well as two probability values: the probability p_s of symbol s and the cumulative probability p_t of all symbols ranked below symbol s. As can be observed from FIG. 3, symbol M is treated differently from the others, in order to exactly "fill" A. It is shown in [Riss89] that it is advantageous to assign the MPS to symbol M.
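One possible reading of the non-binary update is sketched below. Since FIG. 3 is not reproduced in this text, the exact form of the update is an assumption: the offset D is taken to be A times the cumulative probability of the lower-ranked symbols, and symbol M is taken to receive the remainder A-D. The names `update_multi` and `last_symbol` are hypothetical, and the multiplication is exact rather than approximate.

```python
from fractions import Fraction


def update_multi(c, a, s, last_symbol, p_s, p_t):
    """Assumed non-binary update of A and C for symbol s.

    p_s         -- probability of symbol s
    p_t         -- cumulative probability of all symbols ranked below s
    last_symbol -- the symbol M that is treated differently
    """
    d = a * p_t        # offset D of the selected sub-interval (assumed form)
    c = c + d
    if s == last_symbol:
        a = a - d      # M takes the remainder, so A is filled exactly
    else:
        a = a * p_s
    return c, a
```

With exact multiplication the two branches give the same interval size; the separate branch for symbol M only matters once A·p is computed approximately, because the remainder then absorbs the rounding error.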
In order to be able to decode, the decoder must know the value of C, since this determines the symbol that was encoded. So, what is sent to the decoder is the value of the C register. Actually, each time the A register is left-shifted in the renormalization process, the MSB of C (also referred to as "carry bit") is processed for transmission to the decoder. The problem with using a finite-size register for C is that a bit that was already shifted out of C could later have to be adjusted by a carry caused by incrementing C. To take care of this, carry-over control is needed. The state-of-the-art techniques completely solve the problem at the encoder, so the decoder is not affected by this. These solutions, which minimize decoder complexity, will also be discussed later on.
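The carry problem itself can be made concrete with a small sketch. It only illustrates why already-transmitted bits may need adjusting; it is not one of the state-of-the-art carry-over control techniques referred to above, and the function name `propagate_carry` is hypothetical.

```python
def propagate_carry(emitted_bits):
    """Add a carry of 1 to the least significant already-emitted bit.

    emitted_bits -- list of 0/1 values, most significant bit first.
    Returns a new list; trailing 1s turn into 0s and the first 0 to
    their left turns into 1.
    """
    out = list(emitted_bits)
    i = len(out) - 1
    while i >= 0 and out[i] == 1:
        out[i] = 0
        i -= 1
    if i < 0:
        raise ValueError("carry would propagate past the first emitted bit")
    out[i] = 1
    return out
```

A carry into the emitted bits 0, 1, 1, 1 changes all four of them to 1, 0, 0, 0, which is why bits cannot simply be transmitted as soon as they leave C.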
The decoder flow diagram is shown in FIG. 4. For the binary case, the "Output symbol . . . " block is shown in FIG. 5. In the non-binary case, the decoder is more complex, since it has to find the inverse of "C=C+D", without knowing the value of s.
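For the binary case, the decoding decision can be sketched as the inverse of the encoder update. Since FIG. 5 is not reproduced in this text, the comparison used below (MPS if C lies below A-A·p, LPS otherwise) is an assumption, as are the function names `encode_bits` and `decode_bits`. The sketch uses exact rational arithmetic, starts from A=1, omits re-normalization and carry-over control, and takes the bit "1" to be the LPS.

```python
from fractions import Fraction


def encode_bits(bits, p):
    """Return the bottom C of the final interval for a bit sequence."""
    c, a = Fraction(0), Fraction(1)
    for b in bits:
        ap = a * p
        if b == 0:          # MPS: lower sub-interval, C unchanged
            a = a - ap
        else:               # LPS: upper sub-interval
            c = c + a - ap
            a = ap
    return c


def decode_bits(code, n, p):
    """Recover n bits from the code value (assumed decoder rule)."""
    c, a = code, Fraction(1)
    out = []
    for _ in range(n):
        ap = a * p
        if c < a - ap:      # code value falls in the MPS sub-interval
            out.append(0)
            a = a - ap
        else:               # otherwise undo the encoder's C update
            out.append(1)
            c = c - (a - ap)
            a = ap
    return out
```

Decoding the value produced by `encode_bits` with the same p and the same number of bits returns the original sequence. In the non-binary case the single comparison would be replaced by a search for the symbol whose sub-interval contains C.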
The above citations and those listed on page 8 are hereby incorporated herein in whole by reference.