MPEG-4 is described below as an example. In general, an image encoding method such as MPEG-4 compresses input image signal data by exploiting spatial and chronological correlations. The compressed data is then variable-length encoded in a given sequence to generate a bit stream.
In MPEG-4, a whole display image (composite image) can include images (objects) of plural image series. The image plane of each image series at each display time is therefore referred to as a video object plane (hereafter referred to as a “VOP”) and is distinguished from a frame of MPEG-1 or MPEG-2. If the whole display image is formed from images of a single image series, the VOP coincides with the frame.
A VOP has a luminance signal and a color-difference signal, and is composed of a plurality of macroblocks. A macroblock includes a 16 by 16 matrix of pixels for the luminance signal. In the image encoding by MPEG-4, the amount of information is compressed macroblock by macroblock through spatial compression, chronological compression, and other schemes. The spatial compression is performed by converting the signals from the spatial domain to the frequency domain using a discrete cosine transform (hereafter referred to as a “DCT”), which is a type of orthogonal transformation, and then quantizing the converted signal. The chronological compression uses motion compensation.
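The spatial compression step can be illustrated with a minimal sketch. It assumes the 8 by 8 block size on which MPEG-4 applies the DCT inside a macroblock, and it uses a plain uniform quantizer with a single step size, which is a simplification and not the quantization rule of the standard. The names dct_2d and quantize are hypothetical.

```python
import math

N = 8  # assumed DCT block size (8x8 blocks inside a 16x16 macroblock)


def dct_2d(block):
    """Orthonormal 2-D DCT-II: maps an N x N pixel block to N x N frequency coefficients."""
    def scale(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = scale(u) * scale(v) * acc
    return out


def quantize(coeffs, step):
    """Simplified uniform quantizer (illustrative only): divide by the step and round."""
    return [[int(round(c / step)) for c in row] for row in coeffs]


# A flat block has no spatial variation, so only the DC coefficient is nonzero.
flat_block = [[128] * N for _ in range(N)]
levels = quantize(dct_2d(flat_block), step=16)
nonzero_count = sum(1 for row in levels for level in row if level != 0)
```

For the flat block above, all of the energy ends up in the single DC coefficient, which is why smooth image areas need few bits after quantization and variable-length encoding.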
Further, there are two methods of compressing data VOP by VOP. Intra-picture coding (hereafter referred to as “intra-coding”) encodes a VOP using only spatial compression within the same picture. Inter-picture coding (hereafter referred to as “inter-coding”) encodes a VOP using chronological compression, which exploits the correlation between pictures.
An image encoding apparatus must output a bit stream of a designated code amount in accordance with a given encoding parameter. The image encoding apparatus must also control the amount of code generation in accordance with the estimated occupancy of a buffer of the decoding apparatus that receives the bit stream (a video buffering verifier buffer, hereafter referred to as a “VBV buffer”), so that the VBV buffer neither overflows nor underflows.
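The buffer constraint can be pictured with a simplified model, given here only as a sketch. It assumes that the buffer receives a constant number of bits per VOP interval and that each VOP is removed from the buffer in one piece at its decoding time. The function simulate_vbv and its parameters are hypothetical and do not reproduce the exact verifier defined by the standard.

```python
def simulate_vbv(vop_bits, buffer_size, bits_per_interval, initial_occupancy):
    """Return a list of (vop_index, event) for every overflow or underflow.

    vop_bits          : coded size of each VOP in bits
    buffer_size       : capacity of the modelled buffer in bits
    bits_per_interval : bits arriving from the channel per VOP interval
    initial_occupancy : bits already in the buffer before the first VOP is decoded
    """
    occupancy = initial_occupancy
    events = []
    for index, bits in enumerate(vop_bits):
        # The decoder removes the whole VOP; too few bits in the buffer means underflow.
        if bits > occupancy:
            events.append((index, "underflow"))
        occupancy = max(occupancy - bits, 0)

        # The channel keeps delivering bits; exceeding the capacity means overflow.
        occupancy += bits_per_interval
        if occupancy > buffer_size:
            events.append((index, "overflow"))
            occupancy = buffer_size
    return events


steady = simulate_vbv([4000, 4000, 4000], 10000, 4000, 5000)  # no events
too_large = simulate_vbv([9000], 10000, 4000, 5000)           # VOP too large: underflow
too_small = simulate_vbv([0, 0, 0], 10000, 4000, 5000)        # too few bits: overflow
```

In this model, VOPs that are too large drain the buffer (underflow) and VOPs that are too small let it fill up (overflow), which is why the encoder has to steer the generated code amount in both directions.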
The amount of code generation is controlled in accordance with a quantization parameter, which is used to quantize the DCT coefficients of each macroblock contained in a VOP. The amount of code generation can therefore be controlled VOP by VOP. Generally, as the quantization parameter increases, the amount of code generation decreases; and as the quantization parameter decreases, the amount of code generation increases. That is, the amount of code generation and the quantization parameter are inversely related, approximately in inverse proportion. Through the use of this property, the amount of code generation can be changed.
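As a minimal sketch of how this property can be used, the following assumes that the generated code amount is exactly inversely proportional to the quantization parameter, which is only a rough model. It also assumes a parameter range of 1 to 31. The function next_qp is hypothetical and is not taken from any encoder.

```python
QP_MIN = 1   # assumed lower limit of the quantization parameter
QP_MAX = 31  # assumed upper limit of the quantization parameter


def next_qp(current_qp, generated_bits, target_bits):
    """Pick the quantization parameter for the next VOP.

    Under the assumed model bits ~ k / qp, the parameter that would have hit
    the target is current_qp * generated_bits / target_bits. The result is
    clamped because the parameter range is limited.
    """
    ideal = current_qp * generated_bits / target_bits
    return max(QP_MIN, min(QP_MAX, int(round(ideal))))


raise_qp = next_qp(10, 20000, 10000)   # twice the target: parameter doubles to 20
lower_qp = next_qp(10, 5000, 10000)    # half the target: parameter halves to 5
clamped = next_qp(20, 100000, 10000)   # ideal value 200 is clamped to 31
```

The third call shows the limit of this approach: once the parameter reaches the end of its range, it can no longer reduce the code amount.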
However, since the possible range of the quantization parameter is limited, it may be difficult in some cases to control the amount of code generation appropriately through the quantization parameter alone. Therefore, if the amount of code generation is greater than a target value, the encoding processing is skipped for some of the VOPs. A VOP whose encoding processing is skipped is referred to as a skip VOP, and generating skip VOPs suppresses the total amount of code generation. On the other hand, if the amount of code generation is smaller than the target value, redundant bits are inserted into the bit stream to increase the amount of code generation. The above-mentioned technique for suppressing the amount of code generation by skipping part of the VOP encoding processing is described in document 1 (Japanese Patent Kokai (Laid-Open) Publication No. 2002-262297, pages 4 to 7 and FIG. 3), for instance.
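The skip and redundant-bit mechanisms can be sketched as follows. This is an illustrative model only and is not the method of document 1. It assumes a fixed target code amount per VOP, skips a VOP whenever the cumulative generated amount exceeds the cumulative target, and pads an encoded VOP with redundant bits whenever the cumulative amount falls short. The function run_rate_control is hypothetical.

```python
def run_rate_control(vop_costs, target_per_vop):
    """Return one (action, coded_bits, stuffing_bits) tuple per input VOP.

    vop_costs      : bits each VOP would need if it were encoded
    target_per_vop : assumed constant target code amount per VOP
    """
    generated = 0  # bits emitted so far, including redundant bits
    budget = 0     # cumulative target so far
    log = []
    for cost in vop_costs:
        if generated > budget:
            # Over the target: skip this VOP so that no code is generated for it.
            budget += target_per_vop
            log.append(("skip", 0, 0))
            continue
        generated += cost
        budget += target_per_vop
        # Under the target: insert redundant bits to make up the difference.
        stuffing = max(budget - generated, 0)
        generated += stuffing
        log.append(("encode", cost, stuffing))
    return log


log = run_rate_control([1000, 3000, 1000, 500, 500, 500], 1000)
```

In this run the 3000-bit VOP pushes the total over the target, so the next two VOPs are skipped, and the later 500-bit VOPs are each padded with 500 redundant bits.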
Further, document 2 (Japanese Patent Kokai (Laid-Open) Publication No. H6-54319, pages 4 to 5 and FIG. 2), for instance, describes a code amount control method for an apparatus that encodes an input image signal. The method detects a scene change from the input signal and assigns a large code amount to the image immediately after the scene change, so that image degradation at the scene change can be reduced.
An image encoding apparatus that performs the encoding processing in accordance with the conventional MPEG-4 controls the amount of code generation by skipping the encoding processing for some VOPs, as described in document 1. Therefore, if a skip VOP is a VOP in which a scene change is detected, the following problem arises.
Suppose that there are chronologically successive VOPs (respectively denoted as “VOP1”, “VOP2”, and “VOP3” in chronological order) and that a scene change is detected in VOP2, for instance. A greater code amount than usual must be assigned to VOP2. Otherwise, the image quality will be degraded. However, if VOP2 happens to be a skip VOP in order to suppress the total amount of code generation, the encoding processing of VOP2 is not performed. VOP3 then becomes the first VOP that is actually encoded after the scene change. When the encoding processing of VOP3 is performed, however, the information indicating that VOP2 has a scene change has been lost, and only a normal code amount is assigned to VOP3. This could degrade the image quality of VOP3.
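The VOP1 to VOP3 case can be reproduced with a small sketch of the conventional behavior. It assumes a hypothetical allocation rule in which a VOP with a detected scene change receives a multiple (boost) of the normal code amount. The function allocate, the value 1000 for the normal amount, and the boost factor of 2 are all assumptions made for illustration.

```python
def allocate(vops, normal_bits, boost):
    """Assign a code amount to each VOP in the conventional way.

    vops : list of (name, scene_change_detected, is_skip_vop) tuples
    A skip VOP is not encoded, and its scene-change flag is discarded with it.
    """
    assigned = {}
    for name, scene_change, skipped in vops:
        if skipped:
            assigned[name] = 0  # no encoding; the scene-change information is lost here
            continue
        assigned[name] = normal_bits * boost if scene_change else normal_bits
    return assigned


# VOP2 carries the scene change but is skipped to suppress the total code amount.
with_skip = allocate(
    [("VOP1", False, False), ("VOP2", True, True), ("VOP3", False, False)],
    normal_bits=1000, boost=2)

# For comparison: the same sequence when VOP2 is encoded.
without_skip = allocate(
    [("VOP1", False, False), ("VOP2", True, False), ("VOP3", False, False)],
    normal_bits=1000, boost=2)
```

In with_skip no VOP receives the boosted amount, even though VOP3 is the first encoded picture of the new scene; in without_skip the boost reaches VOP2 as intended.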
Further, if the encoding method described in document 2 is performed to avoid image degradation by detecting a scene change and assigning a greater code amount to the image immediately after the scene change, the following problem arises. When scene changes are detected in successive VOPs, a large code amount is assigned to each of them in succession. This can degrade the image quality of parts other than the scene changes or can cause dropped frames or the like. Therefore, an appropriate code amount cannot be assigned as a whole.
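The effect of successive scene changes on the remaining VOPs can be estimated with a rough sketch. It assumes a fixed total code amount for a group of VOPs and a fixed boosted amount for every VOP in which a scene change is detected; both assumptions, as well as the function remaining_per_vop and the numbers used, are illustrative only.

```python
def remaining_per_vop(total_budget, scene_flags, boosted_bits):
    """Code amount left for each VOP without a scene change.

    total_budget : assumed total code amount for the whole group of VOPs
    scene_flags  : one boolean per VOP, True where a scene change is detected
    boosted_bits : assumed code amount given to every scene-change VOP
    """
    boosted_count = sum(1 for flag in scene_flags if flag)
    other_count = len(scene_flags) - boosted_count
    if other_count == 0:
        return 0
    left = max(total_budget - boosted_count * boosted_bits, 0)
    return left // other_count


one_change = [False, True] + [False] * 8
three_changes = [False, True, True, True] + [False] * 6

single = remaining_per_vop(10000, one_change, 3000)         # 7000 bits shared by 9 VOPs
successive = remaining_per_vop(10000, three_changes, 3000)  # 1000 bits shared by 7 VOPs
```

With three successive scene changes, the VOPs without a scene change are left with a small fraction of what they receive in the single-change case, which corresponds to the degradation or dropped frames described above.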