Recently, content flowing through networks has increased in volume and diversity, from character information to still image information and further to moving image information. Accordingly, coding techniques for compressing the amount of information have been developed. These coding techniques have been standardized internationally and have become widespread.
On the other hand, networks themselves have increased in capacity and diversity, and hence a single content item from the transmitting side passes through various environments until it reaches the receiving side. In addition, transmitting/receiving-side devices have diversified in terms of processing performance. General-purpose information processing apparatuses such as personal computers (to be referred to as PCs hereinafter), which are mainly used as transmission/reception devices, have exhibited great improvements in performance such as CPU performance and graphics performance. At the same time, devices with different levels of processing performance, e.g., a PDA, cell phone, TV set, and hard disk recorder, have been equipped with network connection functions. Under the circumstances, a great deal of attention has been paid to a function called scalability, which can cope with changing communication line capacities and the processing performance of receiving-side devices by using a single set of data.
The JPEG 2000 coding scheme is widely known as a still image coding scheme having this scalability function. For example, this scheme is disclosed in ISO/IEC 15444-1 (Information technology—JPEG 2000 image coding system—Part 1: Core coding system).
A characteristic feature of this scheme is that DWT (Discrete Wavelet Transformation) is performed for input image data to discretize the data into a plurality of frequency bands. The coefficients of the respective frequency bands are quantized, and the quantized values are arithmetically coded for each bitplane. This scheme allows fine control of layers by coding or decoding only a necessary number of bitplanes.
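The layer control obtained by decoding only a necessary number of bitplanes can be illustrated with a minimal sketch. It assumes integer quantized coefficients in sign-magnitude form and omits the arithmetic coding entirely; the function names `to_bitplanes` and `from_bitplanes` and the sample values are hypothetical and are not taken from the standard.

```python
def to_bitplanes(coeffs, num_planes):
    """Split coefficient magnitudes into bitplanes, most significant plane first."""
    signs = [-1 if c < 0 else 1 for c in coeffs]
    planes = [[(abs(c) >> p) & 1 for c in coeffs]
              for p in range(num_planes - 1, -1, -1)]
    return signs, planes


def from_bitplanes(signs, planes, num_planes, keep):
    """Rebuild coefficients from only the `keep` most significant bitplanes."""
    mags = [0] * len(signs)
    for idx, plane in enumerate(planes[:keep]):
        weight = 1 << (num_planes - 1 - idx)  # value of one bit in this plane
        mags = [m + bit * weight for m, bit in zip(mags, plane)]
    return [s * m for s, m in zip(signs, mags)]


# Hypothetical quantized coefficients that fit in 6 magnitude bits.
coeffs = [45, -19, 7, 0, -3, 12]
signs, planes = to_bitplanes(coeffs, 6)

full = from_bitplanes(signs, planes, 6, keep=6)    # all bitplanes decoded
coarse = from_bitplanes(signs, planes, 6, keep=3)  # lower 3 bitplanes discarded
print(full)
print(coarse)
```

Decoding all six planes returns the original coefficients, while decoding only the upper three yields a coarser approximation of the same data.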
In addition, the JPEG 2000 coding scheme has realized a so-called ROI (Region Of Interest) technique of relatively improving the image quality of a region of interest in an image, which does not exist in the conventional coding techniques.
FIG. 20 shows a coding sequence in the JPEG 2000 coding scheme. A tile segmentation unit 9001 segments an input image into a plurality of regions (tiles) (this function is optional, and input image=1 tile may be set). A DWT unit 9002 performs discrete wavelet transformation to discretize data into frequency bands. A quantization unit 9003 quantizes each coefficient (this function is optional). An ROI unit 9007 (option) sets a region of interest. The quantization unit 9003 performs shift up. An entropy coding unit 9004 performs entropy coding by the EBCOT (Embedded Block Coding with Optimized Truncation) scheme. A bitplane round-down unit 9005 performs rate control by rounding down the code data of lower bitplanes of the coded data, as needed. A code forming unit 9006 adds header information to the data and selects various types of scalability functions, thereby outputting code data.
FIG. 21 shows a decoding sequence in the JPEG 2000 coding scheme. A code analyzing unit 9020 analyzes a header to obtain information for forming layers. A bitplane round-down unit 9021 rounds down lower bitplanes of the input code data in accordance with the capacity of an internal buffer and the decoding capability. An entropy decoding unit 9022 decodes the code data based on the EBCOT coding scheme to obtain quantized wavelet transformation coefficients. A dequantization unit 9023 dequantizes the coefficients. An inverse DWT unit 9024 performs inverse discrete wavelet transformation for the dequantized coefficients to reproduce image data. A tile combining unit 9025 combines a plurality of tiles to reproduce image data (when 1 frame=1 tile, no combining operation is required).
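The relationship between the DWT unit 9002 and the inverse DWT unit 9024 can be sketched with one decomposition level applied to a single row of samples. Note the assumption: this sketch uses an integer Haar transform (the S-transform) as a simple stand-in, whereas JPEG 2000 Part 1 specifies the 5/3 and 9/7 wavelet filters. The function names and the sample row are hypothetical.

```python
def haar_forward(samples):
    """One-level integer Haar transform of an even-length row.

    Returns (low, high): the low band holds rounded-down pair averages,
    the high band holds pair differences.
    """
    low, high = [], []
    for i in range(0, len(samples), 2):
        a, b = samples[i], samples[i + 1]
        low.append((a + b) >> 1)
        high.append(a - b)
    return low, high


def haar_inverse(low, high):
    """Exactly invert haar_forward using integer arithmetic only."""
    samples = []
    for s, d in zip(low, high):
        a = s + ((d + 1) >> 1)
        b = a - d
        samples.extend([a, b])
    return samples


# Hypothetical row of 8 pixel values.
row = [10, 12, 14, 200, 202, 198, 50, 52]
low, high = haar_forward(row)
restored = haar_inverse(low, high)
print(low, high, restored)
```

The low band is a half-resolution version of the row and the high band carries the detail, which is what splits the data into frequency bands. When no coefficient is quantized or discarded, the inverse transformation reproduces the input exactly.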
The above JPEG 2000 coding technique is mainly suitable for still image coding. ISO/IEC 15444-3 (Information technology—JPEG 2000 image coding system Part 3: Motion JPEG 2000) has also recommended Motion JPEG 2000, which is a technique of coding a moving image by applying the above technique to each frame of the moving image. Because this scheme codes each frame independently, it cannot reduce the code amount by using the correlation between frames.
The MPEG coding method is known as a technique directed from the beginning to moving image compression. For example, this method is disclosed in “Latest MPEG Textbook” (ASCII Publishing, p. 76, 1994). In this coding technique, motion compensation is performed between frames to improve the coding efficiency (non-patent reference 3). FIG. 22 shows an arrangement for this coding operation. A block segmentation unit 9031 segments an input image into 8×8 pixel blocks. A differential unit 9032 subtracts, from the resultant data, predicted data generated by motion compensation. A DCT unit 9033 performs discrete cosine transformation. A quantization unit 9034 performs quantization. The resultant data is coded by an entropy coding unit 9035. A code forming unit 9036 adds header information to the resultant data to output code data.
At the same time, a dequantization unit 9037 dequantizes the data. An inverse DCT unit 9038 performs inverse discrete cosine transformation. An addition unit 9039 adds predicted data to the resultant data and stores the sum in a frame memory 9040. A motion compensation unit 9041 obtains a motion vector by referring to the input image and a reference frame stored in the frame memory 9040, thereby generating predicted data.
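The operation of the motion compensation unit 9041 and the differential unit 9032 can be sketched as a full-search block match followed by a subtraction. This is a minimal illustration under stated assumptions: the sum of absolute differences (SAD) as the matching cost, the search range, the 4×4 block size, and all function names are hypothetical choices, not details taken from FIG. 22.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block in the current frame
    and the block displaced by (dx, dy) in the reference frame."""
    return sum(abs(cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx])
               for y in range(size) for x in range(size))


def find_motion_vector(cur, ref, bx, by, size, search):
    """Full search over displacements within +/- search pixels."""
    h, w = len(ref), len(ref[0])
    best, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            inside = (0 <= by + dy and by + dy + size <= h and
                      0 <= bx + dx and bx + dx + size <= w)
            if not inside:
                continue  # skip candidates that leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best, best_cost = (dx, dy), cost
    return best, best_cost


def residual_block(cur, ref, bx, by, mv, size):
    """Difference between the current block and its motion-compensated prediction."""
    dx, dy = mv
    return [[cur[by + y][bx + x] - ref[by + y + dy][bx + x + dx]
             for x in range(size)] for y in range(size)]


# Hypothetical 8x8 reference frame, and a current frame whose content
# has moved one pixel to the right.
ref = [[8 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

mv, cost = find_motion_vector(cur, ref, bx=2, by=2, size=4, search=2)
res = residual_block(cur, ref, 2, 2, mv, 4)
print(mv, cost)
```

In this example the block is found one pixel to the left in the reference frame, so the residual passed on to the transformation stage is zero.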
Consider a case wherein the scheme of coding bitplanes so as to realize scalability as in JPEG 2000 is applied to the above MPEG coding technique. In this case, when the information of a portion which has been referred to for motion compensation is lost by truncation of bitplane coding, errors due to motion compensation are accumulated, resulting in a deterioration in image quality. This point will be described in detail below.
When the JPEG 2000 technique is to be applied to the MPEG technique, the DCT unit 9033 and inverse DCT unit 9038 in FIG. 22 are replaced with a discrete wavelet transformation unit and inverse discrete wavelet transformation unit, and the entropy coding unit 9035 performs coding for each bitplane.
In an apparatus on the coding side, a target frame which is an input frame to be coded is defined as Fi, and a frame input immediately before the target frame is defined as Fi-1. In this case, if a quantization error in the quantization unit 9034 is ignored, the frame image immediately before being stored in the frame memory 9040 is identical to Fi-1. Therefore, no problem arises in terms of coding errors due to motion compensation on the coding side.
Consider the decoding side. A decoding apparatus performs the inverse of the functions of a coding apparatus, and hence obviously includes a frame memory which stores the immediately preceding frame Fi-1, which is referred to for motion compensation when the target frame Fi is decoded. Consider a case wherein decoding is performed by using the scalability of JPEG 2000, without using the code data of a given bitplane.
In this case, letting F′i-1 be the image of the frame immediately before being stored in the frame memory of the decoding apparatus, the image F′i-1 is not identical to the frame image Fi-1. This is because the code data of some bitplane is not used in the decoding processing.
The code data of each block in the target frame is code data based on the difference from a block at the position indicated by a motion vector in the immediately preceding frame Fi-1. Since the immediately preceding frame image is not Fi-1 but is F′i-1 in spite of this fact, the target frame Fi cannot be properly decoded either. Obviously, as a consequence, a similar problem arises in each frame following the target frame, and hence errors are gradually accumulated, resulting in an image which is far from the intended image.
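The accumulation of errors described above can be reproduced with a small simulation. The sketch makes several simplifying assumptions that are not part of the described apparatus: every motion vector is zero, the residual is kept in the pixel domain rather than the wavelet domain, the first frame is predicted from an all-zero frame, and the coding side is lossless. All function names and sample values are hypothetical.

```python
def drop_low_bitplanes(value, dropped):
    """Discard the `dropped` least significant magnitude bitplanes of a value."""
    sign = -1 if value < 0 else 1
    return sign * ((abs(value) >> dropped) << dropped)


def encode(frames):
    """Coding side: residual of each frame Fi against the exact frame Fi-1."""
    prev = [0] * len(frames[0])
    residuals = []
    for frame in frames:
        residuals.append([c - p for c, p in zip(frame, prev)])
        prev = frame  # the reference on the coding side is the exact frame
    return residuals


def decode(residuals, dropped):
    """Decoding side: the reference is its own previous output F'i-1."""
    prev = [0] * len(residuals[0])
    decoded = []
    for res in residuals:
        cur = [p + drop_low_bitplanes(r, dropped) for p, r in zip(prev, res)]
        decoded.append(cur)
        prev = cur  # any error in this frame propagates to the next one
    return decoded


# Hypothetical two-pixel frames whose values rise steadily over time.
frames = [[100 + 3 * t, 50 + 5 * t] for t in range(6)]
residuals = encode(frames)

exact = decode(residuals, dropped=0)    # every bitplane used
drifted = decode(residuals, dropped=2)  # two lowest bitplanes not used
errors = [max(abs(a - b) for a, b in zip(f, d))
          for f, d in zip(frames, drifted)]
print(errors)
```

With every bitplane decoded, the output matches the input frames. With the two lowest bitplanes unused, the maximum pixel error grows from frame to frame, because each decoded frame is built on a reference F′i-1 that already differs from Fi-1.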