1. Field of the Invention
The present invention relates to an encoding method, a decoding method, an encoding/decoding method, and an apparatus implementing these methods for efficient encoding/decoding of information source data and generated code data.
2. Description of the Related Art
Related Art 1.
As a known technique for efficiently encoding (compressing) information source data, arithmetic coding has recently been adopted by the International Standard Encoding Systems JBIG (Joint Bi-level Image Experts Group) and JPEG (Joint Photographic Experts Group). An example of an International Standard arithmetic coding system is the QM-Coder, which is specified in ITU-T Recommendations T.82 (JBIG) and T.81 (JPEG).
The principle of arithmetic coding is that the range equal to or greater than 0.0 . . . 0 and less than 1.0 . . . 0 is recursively divided into sub-intervals, following a fixed rule based on the occurrence probability of the individual data, and the value of a fraction within the sub-interval corresponding to the sequence of occurring data to be encoded is output as the code. In arithmetic decoding, the interval is recursively divided by the same rule as in encoding, and the sequence corresponding to the sub-interval that includes the code value is output as the decoded data. Hereinafter, the data to be encoded is assumed to be binary (0 and 1).
In an encoding method adopting prediction, the data to be encoded is not the data value itself, but a binary symbol that indicates a match or mismatch with the prediction value. Hereinafter, a symbol indicating a match is referred to as an MPS (More Probable Symbol) (value 0), and a symbol indicating a mismatch is referred to as an LPS (Less Probable Symbol) (value 1). The data value which is more probable to occur is learned and held as the prediction value. Accordingly, the MPS occurs with a probability equal to or greater than 0.5, and the LPS occurs with a probability equal to or less than 0.5.
A concept of interval division of the arithmetic coding is shown in FIG. 23.
The procedure for the arithmetic coding of a symbol to be encoded is shown in FIG. 24, where the size of the current interval is A, the lower limit value of the current interval is C, the occurrence probability of the symbol value 1 is P, the sub-interval sizes for the symbol values 0 and 1 are A0 and A1, and the symbol value which actually occurs is X. The initial values of A and C are 1.0 and 0.0, respectively.
For the sub-intervals A0 and A1, the encoding operation differs according to which sub-interval is placed at the upper or lower part of the interval. At S1003 through S1005 in the figure, the assignments of the upper sub-interval Ah, the lower sub-interval Al, and the symbol Sl corresponding to the lower sub-interval are shown separately for two cases: the left side shows the case of placing the symbol value 0 at the lower part, and the right side shows the case of placing the symbol value 1 at the lower part. At S1006, if the symbol X is the symbol Sl corresponding to the lower sub-interval, the interval A is updated to the lower sub-interval Al at S1007. Otherwise, the interval A is updated to the upper sub-interval Ah at S1008, and the lower sub-interval size Al is added to the code C, which indicates the lower limit of the interval, at S1009.
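The encoding step described above can be sketched as follows. This is a minimal illustration, not the procedure of FIG. 24 itself: the function names `encode_symbol` and `encode`, the `zero_at_lower` flag, and the use of exact rational arithmetic (Python's `fractions.Fraction`) in place of an unbounded binary fraction are all assumptions made for the example.

```python
from fractions import Fraction

def encode_symbol(A, C, P, X, zero_at_lower=True):
    """One multiplication-based coding step (hypothetical helper).

    A: current interval size, C: lower limit of the interval,
    P: occurrence probability of symbol value 1, X: symbol that occurred.
    """
    A1 = A * P      # sub-interval size for symbol value 1
    A0 = A - A1     # sub-interval size for symbol value 0
    if zero_at_lower:
        Al, Ah, Sl = A0, A1, 0   # symbol value 0 placed at the lower part
    else:
        Al, Ah, Sl = A1, A0, 1   # symbol value 1 placed at the lower part
    if X == Sl:
        A = Al                   # keep the lower sub-interval, C unchanged
    else:
        A = Ah                   # keep the upper sub-interval
        C = C + Al               # move the lower limit up by Al
    return A, C

def encode(symbols, P, zero_at_lower=True):
    """Encode a symbol sequence; returns the final interval size and lower limit."""
    A, C = Fraction(1), Fraction(0)   # initial values 1.0 and 0.0
    for X in symbols:
        A, C = encode_symbol(A, C, P, X, zero_at_lower)
    return A, C
```

Any value in the range from C up to (but excluding) C + A of the returned pair identifies the encoded sequence; in this sketch C itself would be used as the code.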
Similarly, in the arithmetic decoding procedure, the code C is updated to the displacement from the lower limit value of the current interval, and with X as the decoded symbol value, the operation is shown in FIG. 25. The initial value of A is set to 1.0, and the initial value of C is set to the code value obtained by the encoding process.
For the sub-intervals A0 and A1, the decoding operation differs according to which sub-interval is placed at the upper or lower part of the interval. At S1103 through S1106 in the figure, the assignments of the upper sub-interval Ah, the lower sub-interval Al, the symbol Sh corresponding to the upper sub-interval, and the symbol Sl corresponding to the lower sub-interval are shown separately for two cases: the left side shows the case of placing the symbol value 0 at the lower part, and the right side shows the case of placing the symbol value 1 at the lower part. At S1107, if the code value C is less than the lower sub-interval Al, the interval A is updated to the lower sub-interval Al at S1108, and the decoded symbol X is set to the symbol Sl corresponding to the lower sub-interval at S1109. Otherwise, the interval A is updated to the upper sub-interval Ah at S1110, the lower sub-interval size Al is subtracted from the code C at S1111, and the decoded symbol X is set to the symbol Sh corresponding to the upper sub-interval at S1112.
The decoded data value becomes the same value as the prediction if the decoded symbol value is 0, and becomes a different value (1-prediction value) from the prediction if the decoded symbol value is 1.
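A matching sketch of the decoding step is given below, under the same assumptions as the description of FIG. 25 permits: exact rational arithmetic stands in for an unbounded binary fraction, and the names `decode` and `to_data`, the `count` parameter (the number of symbols, which must be known to the decoder), and the `zero_at_lower` flag are hypothetical.

```python
from fractions import Fraction

def decode(code, P, count, zero_at_lower=True):
    """Decode `count` symbols from `code` (hypothetical helper).

    C is kept as the displacement of the code from the lower limit
    of the current interval.
    """
    A, C = Fraction(1), Fraction(code)
    symbols = []
    for _ in range(count):
        A1 = A * P      # sub-interval size for symbol value 1
        A0 = A - A1     # sub-interval size for symbol value 0
        if zero_at_lower:
            Al, Ah, Sl, Sh = A0, A1, 0, 1
        else:
            Al, Ah, Sl, Sh = A1, A0, 1, 0
        if C < Al:
            A = Al      # code lies in the lower sub-interval
            X = Sl
        else:
            A = Ah      # code lies in the upper sub-interval
            C = C - Al
            X = Sh
        symbols.append(X)
    return symbols

def to_data(X, prediction):
    """Map a decoded symbol back to a data value: 0 means the prediction matched."""
    return prediction if X == 0 else 1 - prediction
```

For instance, with P = 1/4 and the symbol value 0 at the lower part, the code value 9/16 decodes to the three symbols 0, 1, 0.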
For the above interval division rule, information theory proves that dividing the interval in proportion to the occurrence probability gives the most efficient encoding. The procedure described above is called multiplication-based arithmetic coding, and it assumes that the precision of the fractional representation is unlimited.
Arithmetic coding is performed with binary fraction operations, and the precision of the fractional representation, namely the number of effective digits of the interval limit value, increases as processing proceeds, which makes arithmetic coding difficult to implement. Subtraction-based arithmetic coding makes arithmetic coding practical: the high-order digits of the fraction close to the binary point, whose values no longer change during the operation, are dropped from the operation, and a fixed number of effective digits is guaranteed. In subtraction-based arithmetic coding, the product A1, which multiplication-based arithmetic coding obtains by multiplying the interval size A by the occurrence probability P of the LPS, is replaced by an approximate value LSZ of the probability P. The updated interval size A is renormalized, that is, extended by multiplying by a power of 2 so that it is always equal to or greater than 0.5, and the coding is implemented by shift operations. A suitable approximate value LSZ is selected from several candidates according to the probability value P, and it is assigned without reference to the current interval size A (equal to or greater than 0.5 and equal to or less than 1.0).
FIG. 26 shows a concept of the interval division according to the subtraction-based arithmetic coding.
FIGS. 27 and 28 show the procedure of the subtraction-based arithmetic coding and the subtraction-based arithmetic decoding. At S1201 of the encoding process and S1301 of the decoding process, the approximate value LSZ is assigned to the sub-interval A1 of the symbol value 1.
After the interval size is updated, the interval size A is renormalized, that is, extended by multiplying by a power of 2, at S1208 and S1211 of the encoding process and at S1310 and S1314 of the decoding process, so that the interval size A is always kept equal to or greater than 0.5. If the interval size A updated by processing the MPS is already equal to or greater than 0.5, the extension for renormalization need not be performed.
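The subtraction-based encoding with renormalization can be sketched as follows. Several points are assumptions made only for illustration: the symbol value 0 (MPS) is placed at the lower part, a single fixed LSZ below 0.5 is used for every symbol (a practical coder would select it per probability estimate), exact fractions are used instead of fixed-width registers, and the name `encode_subtractive` is hypothetical. Carry handling and code output are omitted.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def encode_subtractive(symbols, LSZ):
    """Subtraction-based encoding sketch (assumes 0 < LSZ < 1/2).

    Returns (A, C, shifts); the code value as a fraction of the
    original unit interval is C / 2**shifts.
    """
    A, C, shifts = Fraction(1), Fraction(0), 0
    for X in symbols:
        A1 = LSZ          # approximate value replaces the product A * P
        A0 = A - A1       # obtained by subtraction, no multiplication
        if X == 0:        # MPS: keep the lower sub-interval
            A = A0
        else:             # LPS: keep the upper sub-interval
            A = A1
            C = C + A0
        while A < HALF:   # renormalize: double until A >= 0.5
            A = A * 2
            C = C * 2
            shifts += 1
    return A, C, shifts
```

With the symbols 0, 1, 0 and LSZ = 1/4, the first MPS leaves A = 3/4 and needs no renormalization, while the LPS and the second MPS each trigger one doubling.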
While the interval size A is kept equal to or greater than 0.5 and less than 1.0 by the renormalization, the use of the approximate value LSZ means that the division ratio of the sub-intervals A0 and A1 deviates from the actual probability ratio. For example, when the approximate value LSZ=0.3, the share of the interval given to the sub-interval A1 is 0.3 at the maximum interval size A=1.0 (the initial value only) and 0.6 at the minimum interval size A=0.5, as shown in FIG. 29, so that an interval larger than that of the MPS may be assigned to the LPS even though its occurrence probability is equal to or less than 0.5. “Conditional MPS/LPS exchange,” in which the MPS is assigned to the LSZ whenever the LSZ becomes larger than (A-LSZ) as in the above case, is disclosed in Japanese Patent No. 2128115 (corresponding to Japanese Examined Patent Publication JP1996-34434) and is adopted in ITU-T Recommendations T.82 (JBIG) and T.81 (JPEG), among others.
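The arithmetic of this example can be checked with a short sketch. The helper names `lps_share` and `needs_exchange` are hypothetical; the exchange condition is the comparison of LSZ with (A-LSZ) stated above.

```python
from fractions import Fraction

def lps_share(A, LSZ):
    """Fraction of the current interval A occupied by the LSZ sub-interval."""
    return LSZ / A

def needs_exchange(A, LSZ):
    """True when LSZ is larger than the remaining sub-interval A - LSZ."""
    return LSZ > A - LSZ

LSZ = Fraction(3, 10)
for A in (Fraction(1), Fraction(1, 2)):
    print(float(A), float(lps_share(A, LSZ)), needs_exchange(A, LSZ))
```

At A=1.0 the share is 0.3 and no exchange is needed; at A=0.5 the share is 0.6, the remaining sub-interval is only 0.2, and the exchange condition holds.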
When “conditional MPS/LPS exchange” is adopted, the procedures of the subtraction-based arithmetic coding and the subtraction-based arithmetic decoding are as shown in FIGS. 30 and 31. In this case, the sub-interval size A1 of the symbol value 1 is initially set to the approximate value LSZ; however, depending on how its size compares with the sub-interval size A0, the sub-interval A0 may finally correspond to the symbol 1 and the sub-interval A1 to the symbol 0.
In the coding procedure of FIG. 30, S1406 through S1411 are the processes for X=0 (MPS), which update the interval A to the larger of the sub-intervals Al and Ah. S1412 through S1417 are the processes for X=1 (LPS), which update the interval A to the smaller of the sub-intervals Al and Ah.
In the decoding procedure of FIG. 31, S1506 through S1512 are the processes for updating the interval A to the lower sub-interval Al, and S1513 through S1521 are the processes for updating the interval A to the upper sub-interval Ah. If the updated sub-interval is the larger sub-interval, the decoded symbol becomes X=0 (MPS), and if the updated sub-interval is the smaller sub-interval, the decoded symbol becomes X=1 (LPS).
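The rule that the MPS always takes the larger sub-interval and the LPS the smaller one can be sketched as an encoder/decoder pair. This does not reproduce the individual steps of FIGS. 30 and 31. It assumes that the lower sub-interval is A - LSZ and the upper one is LSZ, that a tie counts the lower sub-interval as the larger, that one fixed LSZ below 0.5 is used throughout, and that exact fractions replace fixed-width registers; the function names are hypothetical.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def encode_exchange(symbols, LSZ):
    """Return the code value after encoding with conditional exchange."""
    A, C, shifts = Fraction(1), Fraction(0), 0
    for X in symbols:
        Al, Ah = A - LSZ, LSZ             # assumed placement
        lower_is_larger = Al >= Ah
        # MPS (0) takes the larger sub-interval, LPS (1) the smaller.
        if (X == 0) == lower_is_larger:
            A = Al
        else:
            A = Ah
            C += Al
        while A < HALF:                   # renormalization
            A, C, shifts = A * 2, C * 2, shifts + 1
    return C / 2**shifts

def decode_exchange(code, LSZ, count):
    """Decode `count` symbols; C is the offset inside the current interval."""
    A, C = Fraction(1), Fraction(code)
    out = []
    for _ in range(count):
        Al, Ah = A - LSZ, LSZ
        lower_is_larger = Al >= Ah
        if C < Al:
            A = Al
            out.append(0 if lower_is_larger else 1)
        else:
            A = Ah
            C -= Al
            out.append(1 if lower_is_larger else 0)
        while A < HALF:
            A, C = A * 2, C * 2
    return out
```

With LSZ = 2/5, the second symbol of the sequence 0, 0, 1 is coded at A = 3/5, where A - LSZ = 1/5 is smaller than LSZ, so the MPS is assigned to the LSZ sub-interval.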
Further, for the subtraction-based arithmetic coding and the subtraction-based arithmetic decoding, as an example of a method other than the above “conditional MPS/LPS exchange,” Japanese Patent No. JP-2128110 (corresponding to Japanese Examined Patent Publication No. JP1996-34432) describes a correction method for the sub-interval, in which, if the sub-interval size A0 becomes less than 0.5, it is corrected to the average of A0 and 0.5, and the sub-interval size A1 is corrected accordingly.
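A rough numerical sketch of this correction follows. The text does not state how A1 is corrected, so the sketch assumes that A1 is simply recomputed as A minus the corrected A0; that assumption, the function name `corrected_split`, and the use of exact fractions are not taken from the cited patent.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def corrected_split(A, LSZ):
    """Split A into (A0, A1), pulling A0 toward 0.5 when it falls below 0.5."""
    A0 = A - LSZ
    if A0 < HALF:
        A0 = (A0 + HALF) / 2   # average of A0 and 0.5
    A1 = A - A0                # assumption: A1 is whatever remains of A
    return A0, A1
```

For A=0.5 and LSZ=0.3, the uncorrected A0 of 0.2 becomes 0.35 and A1 becomes 0.15; for A=1.0 the split 0.7 and 0.3 is left unchanged.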
These correction methods are applied when the interval A is divided into two sub-intervals A0 and A1.
Related Art 2.
In arithmetic coding, the final code value can be any coordinate within the final interval. This relies on the rule that, when the decoder runs short of code, decoding continues by supplying the specific trailing bit pattern that was deleted from the end of the code to shorten the code length. The deleted trailing bit pattern is handled in units of, for example, bytes, and is usually either byte 0x00 or byte 0xFF. In the above-mentioned International Standard Recommendation T.82, a repetition of byte 0x00 is used.
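The deletion and later re-supply of the trailing bytes can be sketched as follows, assuming byte units and the 0x00 pattern mentioned for T.82. The helper names `strip_trailing` and `read_byte` are hypothetical and are not part of any standard.

```python
def strip_trailing(code: bytes, pad: int = 0x00) -> bytes:
    """Encoder side: drop the run of pad bytes at the end of the code."""
    return code.rstrip(bytes([pad]))

def read_byte(code: bytes, index: int, pad: int = 0x00) -> int:
    """Decoder side: return the code byte, or the pad byte once the code runs out."""
    return code[index] if index < len(code) else pad

shortened = strip_trailing(b"\x5a\x10\x00\x00")
print(shortened.hex(), [read_byte(shortened, i) for i in range(4)])
```

Because the decoder keeps receiving pad bytes indefinitely, reaching the end of the stored code carries no information about where decoding should stop.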
Consequently, the end of decoding cannot be detected from the code length, and unless the data to be decoded has a fixed length, the number of symbols or lines to be decoded must be notified in advance. If advance notification is not possible, the notification must be made by inserting a marker segment or the like into the code before the end of decoding.
The marker segment consists of an escape byte (0xFF), an identification byte, and additional information if required. To ensure that the same value as a marker segment does not occur within the code during encoding, whenever the escape byte occurs within the code, a control bit or byte is inserted directly after it so that an identification byte cannot be generated. The control bit or byte is deleted on decoding to recover the original code value.
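The byte-oriented variant of this control insertion can be sketched as below. The choice of 0x00 as the control byte and the names `ESC`, `STUFF`, `stuff`, and `unstuff` are assumptions made for the example, and `unstuff` expects input that contains only stuffed code with no marker segments.

```python
ESC = 0xFF    # escape byte that starts a marker segment
STUFF = 0x00  # assumed control byte inserted after an escape byte in the code

def stuff(code: bytes) -> bytes:
    """Insert the control byte directly after every escape byte in the code."""
    out = bytearray()
    for b in code:
        out.append(b)
        if b == ESC:
            out.append(STUFF)
    return bytes(out)

def unstuff(data: bytes) -> bytes:
    """Delete the control byte that follows each escape byte."""
    out = bytearray()
    i = 0
    while i < len(data):
        out.append(data[i])
        if data[i] == ESC:
            i += 1            # skip the control byte
        i += 1
    return bytes(out)

print(stuff(b"\x12\xff\xd8").hex())
```

After stuffing, an escape byte inside the code is always followed by the control byte, so an escape byte followed by any other value can only belong to a marker segment.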
According to “Text Compression” (Timothy C. Bell, John G. Cleary, Ian H. Witten, 1990), a dedicated symbol indicating the end is provided, a sub-interval of the minimum interval size is always reserved for it, the symbol is encoded on completion of encoding, and a coordinate within that sub-interval is selected as the code to notify the decoder of the end.
Encoding the end symbol means encoding it as the final symbol; for each symbol encoded before the final symbol, the sub-interval reserved for the end symbol is always discarded.
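The end-symbol approach can be illustrated with a small sketch. It is not taken from the cited book: it assumes a multiplication-based coder with exact fractions, reserves a fixed share E of the current interval at the upper end for the end symbol, and uses the hypothetical names `END`, `encode_with_end`, and `decode_until_end`.

```python
from fractions import Fraction

END = "end"  # hypothetical marker for the dedicated end symbol

def split(A, P, E):
    """Return (A0, A1); the remainder A - A0 - A1 is reserved for END."""
    rest = A - A * E
    A1 = rest * P
    return rest - A1, A1

def encode_with_end(symbols, P, E):
    """Encode the binary symbols followed by the end symbol; returns the code."""
    A, C = Fraction(1), Fraction(0)
    for X in list(symbols) + [END]:
        A0, A1 = split(A, P, E)
        if X == 0:
            A = A0
        elif X == 1:
            C, A = C + A0, A1
        else:                      # END: select the reserved sub-interval
            C, A = C + A0 + A1, A - A0 - A1
    return C

def decode_until_end(code, P, E):
    """Decode symbols until the code falls in the reserved sub-interval."""
    A, C = Fraction(1), Fraction(code)
    out = []
    while True:
        A0, A1 = split(A, P, E)
        if C < A0:
            A = A0
            out.append(0)
        elif C < A0 + A1:
            C, A = C - A0, A1
            out.append(1)
        else:
            return out             # end symbol reached
```

The cost of this scheme is visible in `split`: the reserved share E is removed from the interval at every step, even though it is used only once, when the end symbol is finally encoded.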