In recent years, formats such as JPEG for still images and MPEG for moving images have been standardized as techniques for compressing and decompressing pictures due to efforts toward creating international standards for image encoding schemes.
The MPEG (Moving Picture Experts Group) encoding scheme is primarily composed of a motion compensation inter-frame prediction unit, a DCT (discrete cosine transform) unit, a quantization unit, and a variable-length encoding unit. The motion compensation inter-frame prediction unit detects motion vectors from inputted picture data and earlier picture data, and creates residual error data from the motion vectors and the earlier picture data. The DCT unit applies the DCT to the residual error data. The quantization unit quantizes the resulting DCT coefficients, and the variable-length encoding unit assigns code words to the quantized DCT coefficients and the motion vectors.
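The processing chain described above can be illustrated with a minimal one-dimensional sketch. It is not the MPEG algorithm itself: real encoders operate on two-dimensional blocks, and the function names, the exhaustive search, and the single quantization step `qstep` are assumptions introduced here for illustration only.

```python
import math

def find_motion(current, reference):
    """Return the shift into `reference` that best predicts `current`
    (smallest sum of absolute differences). Hypothetical exhaustive search."""
    n = len(current)
    shifts = range(len(reference) - n + 1)
    return min(shifts, key=lambda s: sum(abs(c - r)
               for c, r in zip(current, reference[s:s + n])))

def dct_1d(x):
    """Orthonormal DCT-II of a list of samples."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out

def encode_block(current, reference, qstep):
    """Motion-compensated prediction, DCT, then uniform quantization."""
    motion = find_motion(current, reference)
    predicted = reference[motion:motion + len(current)]
    residual = [c - p for c, p in zip(current, predicted)]   # residual error data
    levels = [round(c / qstep) for c in dct_1d(residual)]    # quantized coefficients
    return motion, levels
```

The returned motion value and quantized levels are the quantities to which the variable-length encoding unit would then assign code words.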
The encoded image data in the MPEG encoding scheme has a hierarchical structure of six layers: sequence, GOP (Group of Pictures), picture, slice, macroblock, and block. A picture is the basic encoding unit and corresponds to a single image, and is composed of a plurality of slices. A slice is a synchronization recovery unit, namely a band-shaped area composed of one or a plurality of macroblocks.
Variable-length encoding is one kind of entropy encoding. As the values to be encoded, such as the coefficients obtained by the DCT (DCT coefficients) and motion vector values, do not occur with equal probability, variable-length encoding reduces the average amount of data by assigning short code words to values that have a high probability and long code words to values that have a low probability.
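The reduction in the average amount of data can be checked with a small worked example. The four-symbol source, its probabilities, and both code tables below are hypothetical values chosen for illustration.

```python
# Hypothetical source: symbol probabilities are illustrative only.
probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}

# Fixed-length code: every symbol costs 2 bits.
fixed_code = {"a": "00", "b": "01", "c": "10", "d": "11"}

# Variable-length code: frequent symbols get the short code words.
variable_code = {"a": "0", "b": "10", "c": "110", "d": "111"}

def average_length(code, probs):
    """Expected number of bits per symbol under the given probabilities."""
    return sum(probs[s] * len(code[s]) for s in probs)

fixed_bits = average_length(fixed_code, probs)        # 2.0 bits per symbol
variable_bits = average_length(variable_code, probs)  # 1.75 bits per symbol
```

With these assumed probabilities the variable-length table averages 1.75 bits per symbol against 2 bits for the fixed-length table.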
The main types of variable-length encoding include Huffman encoding and arithmetic encoding.
Huffman encoding is a method in which code words are determined by a Huffman code tree in which each symbol is a leaf. Huffman encoding uses a correspondence table (code table) that holds the code word (bit string) for each symbol.
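As an illustration of how a code table can be derived from a Huffman code tree, the following is a minimal sketch of the textbook construction, which repeatedly merges the two least frequent subtrees. The function name and the input frequencies in the usage note are hypothetical.

```python
import heapq
from itertools import count

def build_huffman_table(freqs):
    """Return a code table {symbol: bit string} from {symbol: frequency}."""
    if len(freqs) == 1:
        return {s: "0" for s in freqs}  # degenerate single-symbol source
    tie = count()  # tie-breaker so the heap never has to compare dicts
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f0, _, t0 = heapq.heappop(heap)  # least frequent subtree
        f1, _, t1 = heapq.heappop(heap)  # next least frequent subtree
        # Merging adds one bit at the front of every code word below the node.
        merged = {s: "0" + c for s, c in t0.items()}
        merged.update({s: "1" + c for s, c in t1.items()})
        heapq.heappush(heap, (f0 + f1, next(tie), merged))
    return heap[0][2]
```

For example, `build_huffman_table({"a": 8, "b": 4, "c": 2, "d": 2})` assigns a 1-bit code word to `a`, a 2-bit code word to `b`, and 3-bit code words to `c` and `d`, and no code word is a prefix of another.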
To improve the compression ratio, Huffman encoding may use a method in which a code table is created to match the statistical properties of the changing moving image, or a method in which a plurality of code tables are prepared in advance and switched in response to the statistical properties of the pictures. Information theory establishes that the average amount of data is smallest when a code word of log2 (1/p) bits is assigned to each symbol having a probability p. That is why, in the method of switching among a plurality of code tables, the probability of each symbol is calculated from the data already encoded, and a code table is selected so that a number of bits close to log2 (1/p) is assigned to each symbol of probability p.
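The table-switching method can be sketched as follows. This is only an illustration under stated assumptions: the two prepared tables, their names, the add-one smoothing of the counts, and the selection rule (smallest expected number of bits under the estimated probabilities) are all hypothetical choices, not taken from any standard.

```python
from collections import Counter

# Hypothetical prepared tables, given as code word lengths in bits.
CODE_TABLES = {
    "flat":   {"a": 2, "b": 2, "c": 2, "d": 2},
    "skewed": {"a": 1, "b": 2, "c": 3, "d": 3},
}

def estimate_probabilities(history, alphabet):
    """Estimate symbol probabilities from already-encoded data.
    Add-one smoothing (an assumption) keeps every probability non-zero."""
    counts = Counter(history)
    total = len(history) + len(alphabet)
    return {s: (counts[s] + 1) / total for s in alphabet}

def expected_bits(lengths, probs):
    """Average code word length under the estimated probabilities."""
    return sum(probs[s] * lengths[s] for s in probs)

def select_table(history, alphabet="abcd"):
    """Pick the prepared table whose lengths best fit log2(1/p)."""
    probs = estimate_probabilities(history, alphabet)
    return min(CODE_TABLES, key=lambda name: expected_bits(CODE_TABLES[name], probs))
```

With these assumed tables, a history dominated by `a`, such as `"aaaaaaab"`, selects `"skewed"`, while a uniform history such as `"abcdabcd"` selects `"flat"`. Because the decoder sees the same already-decoded data, it can make the same selection without side information.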
Arithmetic encoding is a technique in which the interval [0, 1) on a number line is successively subdivided in proportion to the probability of each symbol, so that a sequence of symbols is mapped to a subinterval, and the sequence is then expressed as a suitable binary number within that subinterval. In arithmetic encoding, encoding is performed while constantly monitoring statistical properties. Specifically, probability tables are rewritten in response to the contents of the pictures, and code words are determined while referencing the probability tables. More specifically, in arithmetic encoding, the probability used in the arithmetic operations is successively updated from the data already encoded, so that approximately log2 (1/p) bits are assigned to a symbol of probability p.
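The interval subdivision and the successive probability update can be illustrated with a minimal sketch. It uses exact fractions rather than the fixed-precision integer arithmetic of practical coders, a two-symbol alphabet, and counts initialized to one; all of these, and the function names, are assumptions made for illustration.

```python
from fractions import Fraction

ALPHABET = "ab"  # hypothetical two-symbol alphabet

def _subintervals(counts):
    """Split [0, 1) among the symbols in proportion to their current counts."""
    total = sum(counts.values())
    low, out = Fraction(0), {}
    for s in ALPHABET:
        width = Fraction(counts[s], total)
        out[s] = (low, low + width)
        low += width
    return out

def encode(message):
    """Return the final interval [low, high) that identifies `message`."""
    counts = {s: 1 for s in ALPHABET}        # fixed initial probability table
    low, high = Fraction(0), Fraction(1)
    for sym in message:
        lo_s, hi_s = _subintervals(counts)[sym]
        span = high - low
        low, high = low + span * lo_s, low + span * hi_s
        counts[sym] += 1                      # learn from the encoded symbol
    return low, high

def decode(value, length):
    """Recover `length` symbols from any value inside the final interval."""
    counts = {s: 1 for s in ALPHABET}
    low, high = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        span = high - low
        scaled = (value - low) / span
        for s, (lo_s, hi_s) in _subintervals(counts).items():
            if lo_s <= scaled < hi_s:
                out.append(s)
                low, high = low + span * lo_s, low + span * hi_s
                counts[s] += 1                # same update as the encoder
                break
    return "".join(out)
```

The width of the final interval equals the product of the probabilities used at each step, so a binary number of roughly log2 (1/width) bits suffices to identify the message. The decoder recovers the message only because it repeats exactly the same probability updates as the encoder.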
Unlike Huffman encoding, in arithmetic encoding the bit strings corresponding to code words can be obtained with arithmetic operations alone (addition, subtraction, multiplication, and division), and therefore the amount of memory required to store a code table can be reduced compared to Huffman encoding. Furthermore, it is possible to respond to changes in statistical properties during encoding by rewriting the probability table. However, arithmetic operations, in particular multiplication and division, require considerable arithmetic capacity; thus one drawback is that arithmetic encoding is difficult to perform in devices with low arithmetic capacity.
In the above-described adaptive encoding methods, compression efficiency can be improved as compared to fixed encoding methods, because the encoding method continues to be dynamically optimized with encoded data.
However, the following problems occur when dynamically optimizing the encoding method with encoded data.
Learning-based dynamic encoding methods are applied, for example, to the picture data that follows the header, that is, to each slice, macroblock, or block. In this case, arithmetic encoding uses a fixed probability table as the initial values for each encoding sub-unit in each picture, and Huffman encoding uses a fixed variable-length code table as the initial code table in each picture. As fixed initial values are used in this way, compression efficiency remains poor after initialization until optimal probability tables and code tables have been obtained through learning. In particular, when the total amount of data is small, the proportion of the data that is consumed by learning increases, and the compression ratio falls accordingly.
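The cost of learning from fixed initial values can be estimated with a minimal sketch. It compares the ideal code length, log2 (1/p) bits per symbol, of an adaptive model that starts from a uniform table against a model that already knows the symbol frequencies of the message. The two-symbol alphabet, the initial counts of one, and the test messages in the usage note are assumptions for illustration.

```python
import math

def adaptive_cost_bits(message, alphabet="ab"):
    """Ideal bits when probabilities are learned from a fixed uniform start."""
    counts = {s: 1 for s in alphabet}       # fixed initial table
    bits = 0.0
    for sym in message:
        total = sum(counts.values())
        bits += math.log2(total / counts[sym])  # log2(1/p) for this symbol
        counts[sym] += 1                         # learning step
    return bits

def static_cost_bits(message):
    """Ideal bits when the true symbol frequencies are known in advance."""
    n = len(message)
    return sum(math.log2(n / message.count(sym)) for sym in message)

def learning_overhead_per_symbol(message, alphabet="ab"):
    """Extra bits per symbol attributable to learning."""
    extra = adaptive_cost_bits(message, alphabet) - static_cost_bits(message)
    return extra / len(message)
```

Under these assumptions, `learning_overhead_per_symbol("aaab" * 2)` is larger than `learning_overhead_per_symbol("aaab" * 50)`: the shorter the data, the larger the share of bits spent before the probability table has adapted.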
On the other hand, when a portion of the encoded data used in learning is lost on the transmission line, the decoding device cannot reproduce the learning performed by the encoding device, and decoding becomes impossible. Further, in the case of image data, transmission errors cause deterioration in picture quality. Regularly resetting the results of the learning limits the damage caused by transmission errors, but a long reset interval leaves the data vulnerable to such errors, and thus the reset interval must unavoidably be kept short to a certain extent.
Unless the above-described problem of transmission error is solved, the compression efficiency of current adaptive encoding methods will not improve sufficiently.