A conventional technique will be described with reference to the accompanying drawings.
Hybrid moving image encoding, a conventional technique based on an orthogonal transformation device and a prediction device (intra-frame prediction/inter-frame prediction), will be described below with reference to FIG. 1.
According to the conventional technique, an image frame forming a moving image is divided into a plurality of areas called macroblocks (MBs), and each block obtained by further dividing each MB is encoded. FIG. 2 shows the arrangement of an AVC (Advanced Video Coding: ISO/IEC 14496-10) image frame as a concrete example of the arrangement of an image frame according to the conventional technique.
A prediction value supplied from an intra-frame prediction device 5108, which performs prediction from inside the same image frame reconstructed in the past, or an inter-frame prediction device 5109, which performs prediction from an image frame reconstructed in the past, is subtracted from the above MB. The MB signal from which the prediction value has been subtracted is called a prediction error signal.
The above prediction error signal is divided into smaller blocks (to be simply referred to as blocks hereinafter), and each block is transformed from a spatial domain into a frequency domain by an orthogonal transformation device 5101.
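The transformation from the spatial domain into the frequency domain can be sketched as a separable two-dimensional transform. Note that AVC actually uses a small integer transform; the floating-point DCT-II below, and its function names, are only an illustrative stand-in:

```python
import numpy as np

def dct_matrix(n):
    # Orthonormal DCT-II basis matrix (n x n).
    m = np.array([[np.cos(np.pi * k * (2 * i + 1) / (2 * n))
                   for i in range(n)] for k in range(n)])
    m[0, :] *= np.sqrt(1.0 / n)
    m[1:, :] *= np.sqrt(2.0 / n)
    return m

def forward_transform(block):
    # Spatial domain -> frequency domain (separable 2-D transform).
    c = dct_matrix(block.shape[0])
    return c @ block @ c.T

def inverse_transform(coeffs):
    # Frequency domain -> spatial domain.
    c = dct_matrix(coeffs.shape[0])
    return c.T @ coeffs @ c
```

Because the basis matrix is orthonormal, the inverse transform exactly recovers the block, so all information loss in the encoding loop comes from quantization, not from the transform itself.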
A quantization device 5102 quantizes the orthogonal transformation coefficients of each block transformed into the above frequency domain, using a quantization step size corresponding to the quantization parameter supplied for each MB from a quantization control device 5103.
In general, the quantization control device 5103 monitors the amount of codes generated. If the amount of codes generated is larger than a target code amount, the quantization control device 5103 increases the quantization parameter. If the amount of codes generated is smaller than the target code amount, the quantization control device 5103 decreases the quantization parameter. This makes it possible to encode a moving image with the target code amount.
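This feedback loop can be sketched as follows. In AVC the quantization step size approximately doubles for every increase of 6 in the quantization parameter; the constants and function names below are illustrative assumptions, not the exact tables of the standard:

```python
def quant_step(qp):
    # Step size roughly doubles every 6 QP increments (AVC-like behavior).
    return 0.625 * 2.0 ** (qp / 6.0)

def update_qp(qp, generated_bits, target_bits, qp_min=0, qp_max=51):
    # More bits than the target -> coarser quantization (larger QP);
    # fewer bits than the target -> finer quantization (smaller QP).
    if generated_bits > target_bits:
        qp += 1
    elif generated_bits < target_bits:
        qp -= 1
    return max(qp_min, min(qp_max, qp))
```

Clamping to the valid parameter range keeps the control loop stable even when the generated code amount deviates strongly from the target.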
The quantized orthogonal transformation coefficient is called a quantized transformation coefficient. This coefficient is entropy-encoded by a variable-length encoder 5104 and is output.
For subsequent encoding, the above quantized transformation coefficient is dequantized by a dequantization device 5105, and is further subjected to inverse orthogonal transformation by an inverse orthogonal transformation device 5106 to be restored to the original spatial domain.
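Quantization and dequantization in this loop amount to uniform scalar rounding; a minimal sketch, assuming a single scalar step size rather than the per-frequency scaling an actual codec applies:

```python
import numpy as np

def quantize(coeffs, step):
    # Map each transform coefficient to an integer level.
    return np.round(coeffs / step).astype(int)

def dequantize(levels, step):
    # Restore approximate coefficient values; the rounding error
    # introduced by quantize() is what the decoder can never recover.
    return levels * step
```

The reconstruction error per coefficient is bounded by half the step size, which is why a larger quantization parameter (larger step) lowers both the bit rate and the fidelity.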
The above prediction value is added to the block restored to the spatial domain, and the resultant data is stored in a frame memory 5107. An image frame reconstructed from the stored blocks will be referred to as a reference frame.
The intra-frame prediction device 5108 detects a prediction direction in which the prediction error signal of the current MB is minimized from the reference frame. The inter-frame prediction device 5109 detects a motion vector with which the prediction error signal of the current MB is minimized from the reference frame. A prediction determination switch 5110 compares the prediction error due to the above intra-frame prediction with the prediction error due to the inter-frame prediction, and selects the prediction with the smaller prediction error.
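The decision made by the prediction determination switch can be sketched as a comparison of prediction errors. The sum of absolute differences (SAD) used below is a common error measure, chosen here as an assumption since the source does not name one:

```python
import numpy as np

def sad(block, prediction):
    # Sum of absolute differences: a simple prediction-error measure.
    return int(np.abs(block.astype(int) - prediction.astype(int)).sum())

def select_prediction(block, intra_pred, inter_pred):
    # Keep whichever prediction leaves the smaller prediction error.
    if sad(block, intra_pred) <= sad(block, inter_pred):
        return "intra", intra_pred
    return "inter", inter_pred
```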
In order to maintain the subjective image quality of a moving image compressed by the above processing, the quantization control device 5103 monitors input image signals and prediction error signals in addition to the amount of codes generated. If the visual sensitivity of an MB to be quantized is high, the quantization control device 5103 decreases the quantization parameter (performs finer quantization). If the visual sensitivity is low, the quantization control device 5103 increases the quantization parameter (performs coarser quantization) (the finer the quantization, the higher the image quality).
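This visual-sensitivity adjustment can be sketched with a hypothetical activity measure. The use of pixel variance as a sensitivity proxy, and the thresholds and offsets below, are illustrative assumptions, since the source does not specify how sensitivity is measured:

```python
import numpy as np

def perceptual_qp(base_qp, mb, low_act=50.0, high_act=500.0):
    # Hypothetical sensitivity proxy: pixel variance of the MB.
    activity = float(np.var(mb))
    if activity < low_act:            # flat area: high visual sensitivity
        return max(0, base_qp - 2)    # finer quantization
    if activity > high_act:           # busy texture: low sensitivity
        return min(51, base_qp + 2)   # coarser quantization
    return base_qp
```

Flat areas, where quantization noise is most visible, receive a smaller quantization parameter; busy textured areas, where noise is masked, receive a larger one.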
In a conventional technique such as AVC, there is a restriction that only one quantization parameter is allowed to be transmitted for one MB in order to reduce the amount of quantization parameter information to be transmitted.
Owing to this restriction, all the orthogonal transformation coefficients (256 coefficients in the case of a luminance signal) of the blocks constituting an MB are quantized with the same quantization width, i.e., the same quantization characteristic.
The conventional technique therefore has the following three problems.
The first problem is that the respective blocks constituting an MB do not necessarily have the same pattern. In such a case, the conventional technique cannot perform quantization suitable for the pattern of each block constituting an MB.
The second problem is that in moving image encoding operation in which each block constituting an MB allows independent intra-frame prediction or each block constituting an MB allows inter-frame prediction using an independent vector, the performance of minimizing a prediction error (to be referred to as prediction performance hereinafter) varies for each block constituting an MB. In such a case, the conventional technique cannot perform quantization suitable for the prediction performance of each block constituting an MB.
The third problem is that the distribution of orthogonal transformation coefficients corresponding to the coordinates (to be referred to as spatial frequencies hereinafter) in a block varies due to the first and second reasons, and the respective blocks constituting an MB do not exhibit a uniform distribution. In such a case, the conventional technique cannot perform quantization suitable for the distribution of the orthogonal transformation coefficients of each block.
Owing to these problems, in the conventional technique, a quantization parameter for an MB must be determined in accordance with the transformation coefficient exhibiting the highest visual sensitivity in the frequency domain in the MB, or the block exhibiting the highest visual sensitivity in the spatial domain in the MB. As a consequence, other transformation coefficients exhibiting low visual sensitivity in the frequency domain, or blocks exhibiting low visual sensitivity in the spatial domain, are quantized more finely than necessary. That is, unnecessary information amounts are assigned to transformation coefficients exhibiting low visual sensitivity.
Japanese Patent Laid-Open No. 2003-230142 (reference 1) discloses a technique of improving the average subjective image quality of an entire image frame without transmitting any additional quantization characteristic information, by clipping, in intra-frame prediction, the high-frequency transformation coefficients of all the blocks constituting an MB more strongly than the low-frequency transformation coefficients, and by inhibiting such clipping in inter-frame prediction.