Compression encoding technology has conventionally been used to transfer and store image data efficiently. In particular, the MPEG-1 through MPEG-4 systems and the H.261 through H.264 systems are widely used for moving-image data. When a moving image is encoded, the amount of data is sometimes reduced by producing a prediction signal of a target image to be encoded from other images that are adjacent to it on the time axis, and encoding the difference between the target image and the prediction signal (see, for example, Patent Literature 1 below). This method is called “inter-frame encoding.”
In H.264, for example, the encoding device divides one frame of image into block areas, each of which is composed of 16×16 pixels, and performs encoding processing of the image in units of blocks. In inter-frame encoding, motion prediction is performed on a target block of an image to be encoded, on the basis of a reference image of another frame that has been encoded and then decoded, whereby a prediction signal is produced. Next, the difference value between the target block and the prediction signal is obtained, and discrete cosine transformation and quantization processing are performed to produce encoded data.
Meanwhile, the quantized conversion coefficient is inversely quantized and then inversely transformed, as a result of which a reproduced conversion coefficient is generated. The prediction signal is then added to the reproduced conversion coefficient, and a reproduced image is decoded. The decoded reproduced image is stored temporarily as a reference image to be used for encoding and decoding the next image.
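The difference, transform, quantization and reconstruction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the H.264 procedure itself: it uses a one-dimensional floating-point DCT on an 8-sample block instead of a two-dimensional block transform, and the sample values, the quantization step `QSTEP` and all function names are hypothetical.

```python
import math

def dct_1d(x):
    """Orthonormal DCT-II of a sequence (1-D stand-in for a block transform)."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct_1d(c):
    """Inverse of dct_1d (DCT-III with matching scaling)."""
    n = len(c)
    out = []
    for i in range(n):
        s = 0.0
        for k in range(n):
            scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
            s += scale * c[k] * math.cos(math.pi * (i + 0.5) * k / n)
        out.append(s)
    return out

def encode_block(target, prediction, qstep):
    """Difference -> transform -> quantization; returns integer levels."""
    residual = [t - p for t, p in zip(target, prediction)]
    return [round(c / qstep) for c in dct_1d(residual)]

def reconstruct_block(levels, prediction, qstep):
    """Inverse quantization -> inverse transform -> add the prediction signal."""
    coeffs = [level * qstep for level in levels]
    residual = idct_1d(coeffs)
    return [p + r for p, r in zip(prediction, residual)]

# Hypothetical sample values for one block and its prediction signal.
target = [52, 55, 61, 66, 70, 61, 64, 73]
prediction = [50, 54, 60, 65, 68, 62, 63, 70]
QSTEP = 2.0  # assumed quantization step

levels = encode_block(target, prediction, QSTEP)
reconstructed = reconstruct_block(levels, prediction, QSTEP)
```

Because the encoder runs the same reconstruction as the decoder, both sides hold the same reference image for the next prediction, which is why the reference is described as having been "encoded and then decoded."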
The following can also be cited as background art. A moving image is composed of a continuous sequence of “frames,” each of which is a single static image. The magnitude of the amplitude of the spatial frequency components (hereinafter called “spatial frequency amplitude”) indicates the contrast of the image, and is therefore relevant to evaluating the quality of the moving image.
If the contrast of the image changes greatly within a short period of time in the moving image, blinking might occur. In addition, since the human eye is sensitive to contrast, contrast is considered important in evaluating the quality of the moving image.
In some moving images, the image contrast changes as time advances. A typical example is a scene change that takes a long time, during which the image contrast gradually becomes clearer or blurrier.
Even when the contrast of each frame is low, a person viewing the frames displayed continuously as a moving image might experience an optical illusion in which the moving image appears clear, with a contrast higher than that of the static images, as described in Non-Patent Literature 1. This optical illusion is called the “motion sharpening phenomenon.”
Non-Patent Literature 1 reports an experimental result attributed to this motion sharpening phenomenon: even when filters were applied periodically to the original image to insert frames whose spatial frequency bands or contrasts had been changed, the resulting moving image, when evaluated against the original image viewed as a moving image, was perceived to have high quality.
On the other hand, compression encoding technology is used to transfer and store moving-image data efficiently. The MPEG-1 through MPEG-4 systems and the ITU (International Telecommunication Union) H.261 through H.264 systems are widely used for moving images. When a moving image is encoded, a prediction signal of a target image to be encoded is produced from other images that are adjacent to it on the time axis, and the difference between the target image and the prediction signal is encoded, thereby reducing the amount of data. This method is called “inter-frame encoding.”
An encoding device that implements the processing defined by ITU H.264 divides one frame of an image into block areas, each composed of 16×16 pixels, and encodes the image in units of blocks. In inter-frame prediction encoding, this encoding device performs motion prediction on a target block of an image to be encoded, on the basis of a reference image of another frame that has been encoded and then decoded, and thereby produces a prediction signal. The encoding device then obtains the difference value between the target block and the prediction signal, performs discrete cosine transformation and quantization processing on this difference value, and produces encoded data in the form of a quantized conversion coefficient.
Thereafter, the encoding device performs inverse quantization and then inverse transformation on the quantized conversion coefficient to generate a reproduced conversion coefficient (difference value). The encoding device then adds the prediction signal to the reproduced conversion coefficient to decode a reproduced image. The decoded reproduced image is stored temporarily as a reference image to be used for encoding and decoding the next image.
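The motion prediction step described above can be illustrated with a block-matching search. The following is a sketch under assumptions, not the search mandated by any standard: it uses an exhaustive search with the sum of absolute differences (SAD) as the matching cost, a 2×2 block on an 8×8 frame instead of 16×16 blocks, and synthetic frame contents. All names are hypothetical.

```python
def get_block(frame, top, left, size):
    """Extract a size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]

def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))

def motion_search(target_frame, reference_frame, top, left, size, search_range):
    """Exhaustive search; returns (cost, (dy, dx), prediction_block)."""
    target_block = get_block(target_frame, top, left, size)
    height, width = len(reference_frame), len(reference_frame[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + size > height or x + size > width:
                continue
            candidate = get_block(reference_frame, y, x, size)
            cost = sad(target_block, candidate)
            if best is None or cost < best[0]:
                best = (cost, (dy, dx), candidate)
    return best

# Synthetic reference frame (stands in for a previously decoded frame).
reference = [[10 * y + x for x in range(8)] for y in range(8)]
# Target frame: the reference content moved 1 pixel down and 2 pixels right.
current = [[reference[y - 1][x - 2] if y >= 1 and x >= 2 else 0
            for x in range(8)] for y in range(8)]

cost, vector, prediction_block = motion_search(current, reference, 3, 3, 2, 2)
# The best match lies 1 row up and 2 columns left in the reference frame.
```

The block returned by the search serves as the prediction signal, and only its difference from the target block goes on to transformation and quantization.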
In such moving image compression encoding, for a moving image in which the spatial frequency amplitude of each image is generally low and the contrast is blurry, the conversion coefficient (difference value) is small, so the amount of data to be encoded can be reduced. For this reason, high encoding efficiency is expected when encoding a moving image that contains images having blurry contrast, or low-contrast images for which the motion sharpening effect is expected.
[Patent Literature 1] Japanese Patent Laid-open No. H10-136371
[Non-Patent Literature 1] Takeuchi, T. & De Valois, K. K. (2005) Sharpening image motion based on spatio-temporal characteristics of human vision. (San Jose, USA), URL: http://www.brl.ntt.co.jp/people/takeuchi/takeuchi-EI2005.pdf, <searched on Jun. 2, 2005>
However, in the conventional image encoding/decoding technology described above, a moving image cannot be compressed efficiently if images having different signal bands exist in the moving image. For example, a moving image in which images having different signal bands exist is sometimes produced when images are captured with a consumer video camera. The focus is adjusted automatically by the auto-focusing function of the camera during image capturing, so the bands of adjacent images fluctuate, and an image having a wide signal bandwidth and an image having a narrow signal bandwidth are recorded adjacent to each other.
In this case, when the encoding device predicts a first image having a narrow signal bandwidth with reference to a second image having a wide signal bandwidth, the high-frequency components contained in the second image end up in the differential signal of the first image. The problem is therefore that the prediction signal has a band wider than that of the first image, which increases the amount of information and reduces the compression rate.
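The bandwidth mismatch described above can be reproduced with a toy one-dimensional example. This is a sketch under assumptions: the three-tap blur stands in for defocus, the sum of squared differences between neighboring samples stands in for high-frequency content, and the sample values and function names are hypothetical.

```python
def blur(signal):
    """Three-tap low-pass filter [1/4, 1/2, 1/4] with edge replication."""
    n = len(signal)
    out = []
    for i in range(n):
        left = signal[max(i - 1, 0)]
        right = signal[min(i + 1, n - 1)]
        out.append(0.25 * left + 0.5 * signal[i] + 0.25 * right)
    return out

def detail_energy(signal):
    """Crude high-frequency measure: squared differences of neighbors."""
    return sum((b - a) ** 2 for a, b in zip(signal, signal[1:]))

# Second image (reference): in focus, wide signal bandwidth.
reference = [10, 90, 10, 90, 10, 90, 10, 90]
# First image (target): same content out of focus, narrow signal bandwidth.
target = blur(reference)

# Differential signal when the target is predicted directly from the reference.
residual = [t - r for t, r in zip(target, reference)]

print("target detail:  ", detail_energy(target))
print("residual detail:", detail_energy(residual))
```

In this example the target itself is almost flat, yet the differential signal oscillates strongly, because the fine detail of the reference has been carried into the difference.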
Another problem is that, in the conventional moving image encoding/decoding method, a moving image cannot be compressed efficiently if it contains images having very different contrasts, i.e., very different spatial frequency amplitudes. When a first image having low spatial frequency amplitudes is predicted with reference to a second image having high spatial frequency amplitudes, either the search for a prediction target does not work well, or the difference between the spatial frequency amplitudes of the two images is contained in the differential signal of the first image, so that the amount of information is increased and the compression rate is reduced. Similarly, when a third image having high spatial frequency amplitudes is predicted with reference to the first image having low spatial frequency amplitudes, either the search for a prediction target does not work well, or the difference in spatial frequency amplitudes is required as a differential signal, so that in this case as well the amount of information is increased and the compression rate is reduced.
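The contrast mismatch can be illustrated in the same spirit. The sketch below assumes that a low-contrast image is simply the high-contrast image with its deviation from the mean scaled by a gain of 0.25; the gain, the sample values and the function names are hypothetical.

```python
def ac_energy(signal):
    """Energy of the signal after removing its mean (a proxy for contrast)."""
    mean = sum(signal) / len(signal)
    return sum((s - mean) ** 2 for s in signal)

def scale_contrast(signal, gain):
    """Scale the deviation from the mean by gain, keeping the mean unchanged."""
    mean = sum(signal) / len(signal)
    return [mean + gain * (s - mean) for s in signal]

# Image with high spatial frequency amplitudes.
high_contrast = [20, 180, 60, 140, 100, 30, 170, 100]
# Image with low spatial frequency amplitudes (assumed gain of 0.25).
low_contrast = scale_contrast(high_contrast, 0.25)

# Low-contrast target predicted from the high-contrast reference.
residual_low_from_high = [t - p for t, p in zip(low_contrast, high_contrast)]
# High-contrast target predicted from the low-contrast reference.
residual_high_from_low = [t - p for t, p in zip(high_contrast, low_contrast)]

print("low-contrast image energy:", ac_energy(low_contrast))
print("residual (low from high): ", ac_energy(residual_low_from_high))
print("residual (high from low): ", ac_energy(residual_high_from_low))
```

In both prediction directions the amplitude difference appears in the differential signal. In the first direction the differential signal even carries more energy than the low-contrast target itself, so the prediction increases the amount of information instead of reducing it.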