1. Field of the Invention
The present invention relates to a method and apparatus for arithmetic coding, a method and apparatus for arithmetic decoding, and a storage medium intended for reducing the quantity of data transferred over a transmission line and the quantity of data stored in the storage medium.
2. Description of the Related Art
Arithmetic coding, one of the available coding methods, codes a string of input symbols given their probabilities of occurrence. A base interval, which is an interval of real numbers, is segmented into segment intervals, each proportional to the probability of occurrence of an input symbol, and the segment interval associated with the input symbol is selected. This process is executed recursively for each input symbol. An arithmetic code is the code of the coordinate of a point included in the segment interval that is finally obtained.
The principle of arithmetic coding is now discussed referring to FIG. 6.
The base interval typically used in arithmetic coding is an interval [0, 1). The symbol "[" here means that the value following it is included in the range of the interval and the symbol ")" means that the value preceding it is not included in the range of the interval. The interval [0, 1) represents the interval of real numbers x satisfying the condition 0≤x<1. The base interval is shown in the left portion of FIG. 6. The base interval is segmented into segment intervals proportionally to the probabilities of occurrence of the input symbols, and one of the segment intervals is selected in response to the input symbol.
Let Pa and Pb represent the probabilities of input symbols a and b when, for example, there are two input symbols, a and b (Pa+Pb=1). The base interval [0, 1) is segmented into a segment interval [0, Pa) for the input symbol a and a segment interval [Pa, 1) for the input symbol b. One of the two intervals is selected depending on the input symbol. Assume here that the input symbol is a, so that the interval [0, Pa) is selected.
Thereafter, the selected segment interval is segmented proportionally to the probability of occurrence of the input symbol, and one of the segment intervals is selected. This process is repeated. For example, FIG. 6 shows the case that the input symbol a is followed by input symbols b and a. The segment interval in which the subsequent symbol b is processed is shown in the central portion of FIG. 6 and the segment interval in which the final symbol a is processed is shown in the right portion of FIG. 6. More particularly, the subsequent input symbol b selects the interval [Pa×Pa, Pa) and the final input symbol a selects the interval [Pa×Pa, Pa×Pa+(Pa-Pa×Pa)×Pa).
When the probabilities of occurrence are Pa=1/3 and Pb=2/3, the finally selected segment interval is [1/9, 5/27). In arithmetic coding, the coordinate of a point included in this segment interval, expressed as a binary fraction, is transmitted. Since the binary expression of the segment interval is [0.000111 . . . , 0.001011 . . .) in this case, the output code is 00100. Among real numbers equal to or greater than 0.000111 but equal to or smaller than 0.001011, one fraction having the minimum number of fractional digits is output as a code. In the above example, 0.000111≤0.00100≤0.001011.
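The interval segmentation described above may be sketched as follows, assuming exact rational arithmetic and the two-symbol alphabet of FIG. 6. The function names narrow and binary_digits are hypothetical and are introduced for illustration only.

```python
from fractions import Fraction


def narrow(symbols, p_a):
    """Return the final interval [low, high) for a string of 'a'/'b' symbols."""
    low, high = Fraction(0), Fraction(1)
    for s in symbols:
        # Split the current interval in proportion to Pa : Pb.
        split = low + (high - low) * p_a
        if s == "a":
            high = split   # symbol a takes the lower part [low, split)
        else:
            low = split    # symbol b takes the upper part [split, high)
    return low, high


def binary_digits(x, n):
    """Return the first n fractional binary digits of x, where 0 <= x < 1."""
    digits = []
    for _ in range(n):
        x *= 2
        bit = int(x >= 1)
        digits.append(str(bit))
        x -= bit
    return "".join(digits)


low, high = narrow("aba", Fraction(1, 3))
print(low, high)               # 1/9 5/27
print(binary_digits(low, 6))   # 000111
print(binary_digits(high, 6))  # 001011

# The code 00100 read as the binary fraction 0.00100 equals 1/8.
code_value = Fraction(int("00100", 2), 2 ** 5)
print(low <= code_value < high)  # True
```

With Pa=1/3 and the symbol string a, b, a, the sketch reproduces the interval [1/9, 5/27) and confirms that the point 0.00100 lies within it.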
The principle of arithmetic coding has been discussed. As a string of input symbols gets longer, a large number of significant figures are required and handling them is impossible in practice. The following technique has been conventionally used.
As shown in FIG. 7, when a selected interval [x, y) is included in an interval [0, 1/2), namely when 0≤x and y≤1/2, the initially output code is 0 no matter what input symbols come in next. The output of 0 is followed by the updating of the interval from [x, y) to [2×x, 2×y).
When the selected interval [x, y) is included in [1/2, 1) as shown in FIG. 8, namely, when 1/2≤x and y≤1, the initially output code is 1 no matter what input symbol comes in next. The output of 1 is followed by the updating of the interval from [x, y) to [2×(x-1/2), 2×(y-1/2)).
When the selected interval [x, y) is included in [1/4, 3/4) as shown in FIG. 9, namely, when 1/4≤x and y≤3/4, the output code is determined as described below. Although the output code is yet to be determined, as is apparent from FIG. 9, if 0 is output, it is necessarily followed by the output code 1, and if 1 is output, it is necessarily followed by the output code 0. The interval is thus updated from [x, y) to [2×(x-1/4), 2×(y-1/4)) on condition that the next output code is followed by an output code opposite to it.
In the actual arithmetic coding, each time the coding interval is segmented by an input symbol, a determination is made of whether the resulting coding interval is included in one of the three intervals, and as long as the resulting coding interval is included in one of the three intervals, the update process is repeatedly applied. This process is called rescaling.
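The three update rules of FIGS. 7 through 9 may be sketched as a single loop. The sketch below assumes exact rational arithmetic, and it assumes that the undetermined code of the [1/4, 3/4) case is tracked with a counter of deferred bits (named pending here), which is one common way of realizing the "opposite code follows" condition and is not taken from the text. The function name rescale is hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
QUARTER = Fraction(1, 4)


def rescale(low, high, pending=0):
    """Apply the update rules to [low, high) until none of them applies.

    Returns (bits, low, high, pending), where bits is the string of codes
    output so far and pending counts codes that are still undetermined.
    """
    bits = []
    while True:
        if high <= HALF:
            # Included in [0, 1/2): output 0, then any deferred opposite codes.
            bits.append("0" + "1" * pending)
            pending = 0
            low, high = 2 * low, 2 * high
        elif low >= HALF:
            # Included in [1/2, 1): output 1, then any deferred opposite codes.
            bits.append("1" + "0" * pending)
            pending = 0
            low, high = 2 * (low - HALF), 2 * (high - HALF)
        elif low >= QUARTER and high <= 3 * QUARTER:
            # Included in [1/4, 3/4): the code is not yet determined.
            pending += 1
            low, high = 2 * (low - QUARTER), 2 * (high - QUARTER)
        else:
            break
    return "".join(bits), low, high, pending


print(rescale(Fraction(1, 9), Fraction(5, 27)))
# ('00', Fraction(7, 18), Fraction(53, 54), 1)
```

For the interval [1/9, 5/27), the sketch outputs 00, defers one code, and leaves the interval [7/18, 53/54), which is no longer included in any of the three intervals.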
As for the final input symbol, the coding interval subsequent to segmentation of the coding interval and rescaling necessarily includes one of the intervals [0, 1/4), [1/4, 1/2), [1/2, 3/4), and [3/4, 1). If none of them is included, rescaling has to be applied. Depending on which of these intervals is included, the output code 00, 01, 10, or 11, respectively, is output. This process is called flushing.
Referring to a flow diagram shown in FIG. 10, the process of flushing is now discussed. It is determined in step S1 whether the coding interval [x, y) subsequent to coding interval segmentation and rescaling satisfies the condition of x=0 and 1/4≤y. When it is determined that the coding interval [x, y) satisfies the condition x=0 and 1/4≤y, the process goes to step S2 to output the code 00 and ends.
When it is determined in step S1 that the coding interval [x, y) fails to satisfy the condition x=0 and 1/4≤y, the process goes to step S3 to determine whether the coding interval [x, y) subsequent to coding interval segmentation and rescaling satisfies the condition of x≤1/4 and 1/2≤y. When it is determined that the coding interval [x, y) satisfies the condition x≤1/4 and 1/2≤y, the process goes to step S4 to output the code 01 and ends. On the other hand, when it is determined that the coding interval [x, y) fails to satisfy the condition x≤1/4 and 1/2≤y, the process goes to step S5.
It is determined in step S5 whether the coding interval [x, y) satisfies the condition of x≤1/2 and 3/4≤y. When it is determined that the coding interval [x, y) satisfies the condition x≤1/2 and 3/4≤y, the process goes to step S6 to output the code 10 and ends. On the other hand, when it is determined that the coding interval [x, y) fails to satisfy the condition x≤1/2 and 3/4≤y, the process goes to step S7 to output the code 11 and ends.
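The flow of steps S1 through S7 may be sketched as a chain of comparisons. The sketch assumes exact rational arithmetic and a coding interval [x, y) to which rescaling has already been applied. The function name flush_code is hypothetical, and the handling of any code left undetermined by the [1/4, 3/4) update is outside the scope of the sketch.

```python
from fractions import Fraction


def flush_code(x, y):
    """Return the two-bit flush code for the coding interval [x, y)."""
    if x == 0 and y >= Fraction(1, 4):                 # step S1
        return "00"                                    # step S2
    if x <= Fraction(1, 4) and y >= Fraction(1, 2):    # step S3
        return "01"                                    # step S4
    if x <= Fraction(1, 2) and y >= Fraction(3, 4):    # step S5
        return "10"                                    # step S6
    return "11"                                        # step S7


print(flush_code(Fraction(0), Fraction(1, 3)))        # 00
print(flush_code(Fraction(1, 5), Fraction(3, 5)))     # 01
print(flush_code(Fraction(7, 18), Fraction(53, 54)))  # 10
print(flush_code(Fraction(3, 5), Fraction(1)))        # 11
```

Each test in the chain checks that the coding interval includes the quarter interval corresponding to the code, so that the two output bits identify a point inside the coding interval.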
In principle, the flushing process is performed after the final input symbol is coded. In a system that switches back and forth between arithmetic coding and FLC (Fixed Length Code) or VLC (Variable Length Code), the flushing process is also performed, when switching from the arithmetic code to one of these two codes, after the input symbol immediately prior to the switching is coded.
In the arithmetic decoding, a determination is made of which one of the segment intervals of the base interval an input code corresponds to, and the symbol corresponding to that interval is output. To determine which segment interval the input code corresponds to, prereading of codes is typically performed. When all symbols are decoded, a process for recovering the preread codes is required. This process is called a reset.
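The determination of the segment interval on the decoding side may be sketched as follows. The sketch assumes exact rational arithmetic, the two-symbol alphabet used in the coding example, a code value that is available in full, and a known number of symbols; prereading and the reset are therefore not modeled. The function name decode is hypothetical.

```python
from fractions import Fraction


def decode(value, p_a, count):
    """Decode count symbols from a code value lying in [0, 1)."""
    low, high = Fraction(0), Fraction(1)
    symbols = []
    for _ in range(count):
        # Segment the current interval exactly as on the coding side.
        split = low + (high - low) * p_a
        if value < split:
            symbols.append("a")   # value lies in the segment for a
            high = split
        else:
            symbols.append("b")   # value lies in the segment for b
            low = split
    return "".join(symbols)


# The code 00100 read as the binary fraction 0.00100 equals 1/8.
print(decode(Fraction(1, 8), Fraction(1, 3), 3))  # aba
```

With Pa=1/3, the code 00100 yields the symbol string a, b, a, which is the string coded in the example of FIG. 6.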
FIG. 11 is a flow chart showing the reset process. In step S11, the preread codes are recovered by a fixed number of bits. The fixed number is the number of bits that are preread for the selection of one of the segment intervals.
The arithmetic coding advantageously requires only the code amount corresponding to the entropy of the input symbols. If the flushing process is involved, however, the arithmetic codes end with extra bits.
To recover from errors in codes during coding, a synchronization code in the form of an FLC is inserted. Inserting an FLC requires that the arithmetic coding apparatus be flushed, lowering the efficiency of coding. To increase resistance to errors, the number of flushes has to be increased, and the coding efficiency is lowered even further.
VLC is sometimes combined with the arithmetic coding. Such a usage also requires that the arithmetic coding apparatus be flushed, lowering the coding efficiency.