1. Field of the Invention
The present invention relates to an image encoding apparatus and an image encoding method for encoding moving image data, and a program thereof.
2. Related Background Art
MPEG (Moving Picture Experts Group) compression, presently adopted by a large number of products, is an encoding method that combines DCT (Discrete Cosine Transform), quantization and variable length encoding with forward motion-compensated inter-frame prediction and bidirectional motion-compensated inter-frame prediction. MPEG normally adopts a GOP (Group of Pictures) structure in which fifteen frames are grouped together. The GOP consists of I pictures (Intra-coded pictures), P pictures (Predictive-coded pictures) and B pictures (Bidirectionally predictive-coded pictures). The I picture is encoded using only information within the picture itself, the P picture is encoded by forward prediction from an I picture or another P picture, and the B picture is encoded by bidirectional prediction from I or P pictures.
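As an illustration of the GOP structure described above, the following minimal sketch lists the picture types of one GOP in display order. The fifteen-frame length comes from the text; the spacing of three frames between reference pictures and the function name gop_picture_types are assumptions made only for this example.

```python
def gop_picture_types(n=15, m=3):
    """Return the picture types of one GOP in display order.

    n: number of frames in the GOP (fifteen, as in the text).
    m: distance between reference (I or P) pictures; 3 is an assumed,
       commonly used value and is not stated in the text.
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")   # intra-coded, no prediction
        elif i % m == 0:
            types.append("P")   # forward prediction from the previous I or P
        else:
            types.append("B")   # bidirectional prediction from I or P pictures
    return "".join(types)


print(gop_picture_types())  # IBBPBBPBBPBBPBB
```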
An image encoding apparatus conforming to MPEG comprises a DCT circuit for performing a two-dimensional orthogonal transform on blocks obtained by dividing an input signal into units of a predetermined number of pixels (so-called DCT blocks), a quantization circuit for quantizing the DCT coefficients after the transform, and a rate control portion for controlling a quantization scale code to a proper value in consideration of an output buffer (the so-called VBV buffer).
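The transform and quantization stages can be sketched as follows. This is a simplified illustration, not the circuit of the apparatus: it assumes an 8x8 block, an orthonormal DCT-II, and a single uniform divisor qscale, whereas an actual MPEG encoder also applies a quantization matrix. The names dct_2d and quantize are hypothetical.

```python
import math

N = 8  # assumed DCT block size (8x8 pixels)


def dct_2d(block):
    """Orthonormal two-dimensional DCT-II of an N x N block (list of lists)."""
    def c(k):
        return math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            s = 0.0
            for x in range(N):
                for y in range(N):
                    s += (block[x][y]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = c(u) * c(v) * s
    return out


def quantize(coeffs, qscale):
    """Simplified uniform quantization: divide by one scale and round."""
    return [[int(round(cf / qscale)) for cf in row] for row in coeffs]


flat = [[100] * N for _ in range(N)]   # a block with no detail
levels = quantize(dct_2d(flat), 16)
print(levels[0][0])                     # 50: only the DC coefficient survives
```

A larger qscale yields smaller quantized levels and therefore fewer bits, which is why the rate control portion adjusts it.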
For instance, as shown in FIG. 4, an image encoding apparatus 400 conforming to MPEG2 comprises a frame buffer 402 for holding a video signal captured by a CCD 401 and rearranging the pictures, and a subtraction circuit 413 for calculating a difference between the video signal inputted from the frame buffer 402 and a signal representing a locally decoded image. The image encoding apparatus 400 further comprises a DCT circuit 403, a quantization circuit 404, a variable length encoding circuit 409, an inverse quantization circuit 405, an inverse DCT circuit 406, a motion estimation circuit 407, a motion compensation circuit 408, a recording medium 410, a controller 411 for performing rate control and an activity calculation circuit 412.
The I picture and the P picture are used as reference pictures in the motion estimation circuit 407 and the motion compensation circuit 408. Therefore, the output of the quantization circuit 404 is also inputted to the inverse quantization circuit 405, where it is inversely quantized, and is then subjected to an inverse DCT in the inverse DCT circuit 406. The output of the inverse DCT circuit 406 is inputted to the motion compensation circuit 408 to be sequentially processed. The motion estimation circuit 407 and the motion compensation circuit 408 perform forward prediction, backward prediction and bidirectional prediction, and output a locally decoded signal to the subtraction circuit 413.
The subtraction circuit 413 is a circuit for performing a subtraction process between the output of the frame buffer 402 and the output of the motion compensation circuit 408 so as to calculate a difference value. In the case of inputting the I picture to be intra-picture-encoded, the video signal simply passes from the frame buffer 402 with no subtraction process performed in the subtraction circuit 413.
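The behavior of the subtraction stage can be sketched as below. The function name residual and the representation of pictures as lists of pixel rows are assumptions made for illustration only.

```python
def residual(current, prediction, picture_type):
    """Return the signal passed on to the DCT stage.

    For an I picture the input passes through unchanged; for P and B
    pictures the motion-compensated prediction is subtracted pixel by pixel.
    """
    if picture_type == "I":
        return [row[:] for row in current]
    return [[c - p for c, p in zip(cur_row, pred_row)]
            for cur_row, pred_row in zip(current, prediction)]


print(residual([[10, 20]], [[7, 25]], "P"))  # [[3, -5]]
print(residual([[10, 20]], None, "I"))       # [[10, 20]]
```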
The I picture, or the P and B pictures represented by the difference values, undergo the DCT in the DCT circuit 403, and are then quantized in the quantization circuit 404, variable-length-encoded in the variable length encoding circuit 409 and recorded on the recording medium 410.
The quantization scale code to be used in the quantization circuit 404 is decided by using a reference value of the quantization scale code calculated by the controller 411 and an activity that reflects a visual characteristic of a macro block, the macro block being the unit of quantization. The method of deciding the quantization scale code consists of the following three steps.
In a first step, a bit amount is assigned to each picture in the GOP based on the bit amount available for the pictures in the GOP that have not yet been encoded, including the picture currently subject to assignment. This distribution is repeated in encoding order over the pictures in the GOP so as to set a picture target bit amount for each picture.
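The first step can be sketched as follows. This is a deliberately simplified illustration: the remaining GOP budget is shared among the not-yet-encoded pictures in proportion to a per-type weight. The weights, the function name picture_targets and the optional actual_bits argument are hypothetical; the text does not specify the exact allocation formula.

```python
def picture_targets(gop_types, gop_bits, weights, actual_bits=None):
    """Assign a target bit amount to each picture, in encoding order.

    gop_types:   picture types in encoding order, e.g. "IPB".
    gop_bits:    bit amount available for the whole GOP.
    weights:     assumed relative cost per picture type (hypothetical).
    actual_bits: bits actually generated per picture; if omitted, each
                 picture is assumed to hit its target exactly.
    """
    remaining = float(gop_bits)
    targets = []
    for i, ptype in enumerate(gop_types):
        # Weights of the pictures still to be encoded, including this one.
        pending_weight = sum(weights[t] for t in gop_types[i:])
        target = remaining * weights[ptype] / pending_weight
        targets.append(target)
        spent = target if actual_bits is None else actual_bits[i]
        remaining -= spent  # the rest is redistributed over later pictures
    return targets


w = {"I": 4, "P": 2, "B": 1}
print(picture_targets("IPB", 700, w))  # [400.0, 200.0, 100.0]
```

If an earlier picture overshoots its target, the targets of the later pictures shrink accordingly, which is the repeated redistribution described above.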
In a second step, a reference value of the quantization scale is set per macro block. More specifically, so that the bit amount assigned to each picture in the first step matches the actually generated bit amount, the reference value of the quantization scale is obtained by feedback control per macro block based on information on a virtual buffer capacity (VBV buffer capacity) obtained from the variable length encoding circuit 409.
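A minimal sketch of such feedback control is shown below. The formula follows the general shape of the virtual-buffer rate control in the MPEG-2 Test Model 5; that choice, the 1 to 31 range, and all names (reference_qscale, d0, reaction) are assumptions for illustration and are not taken from the text.

```python
def reference_qscale(d0, bits_so_far, picture_target, mbs_done, mb_count, reaction):
    """Reference quantization scale for the next macro block.

    d0:             virtual buffer fullness at the start of the picture.
    bits_so_far:    bits generated by the macro blocks already encoded.
    picture_target: target bit amount of the picture (from the first step).
    mbs_done:       number of macro blocks already encoded in the picture.
    mb_count:       total number of macro blocks in the picture.
    reaction:       assumed reaction parameter scaling fullness to a scale.
    """
    # Fullness grows when more bits were spent than the pro-rata target.
    fullness = d0 + bits_so_far - picture_target * mbs_done / mb_count
    q = fullness * 31.0 / reaction
    return max(1.0, min(31.0, q))  # clamp to the assumed 1..31 range


r = 40000.0
d0 = 10.0 * r / 31.0
print(reference_qscale(d0, 0, 120000, 0, 100, r))       # about 10: on budget
print(reference_qscale(d0, 80000, 120000, 50, 100, r))  # larger: overspending
```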
In a third step, the quantization scale value is corrected per macro block based on the activity of the macro block in order to reflect the visual characteristic. While the frame target bit amount is maintained, the quantization scale is corrected to be smaller than the reference value for macro blocks whose activity is low, and larger than the reference value for macro blocks whose activity is high. Thus, adaptive quantization is performed in consideration of the visual characteristic, in particular the activity.
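The activity-based correction can be sketched as follows. The normalization used here is the one commonly associated with the MPEG-2 Test Model 5, which maps the activity to a factor between 0.5 and 2; adopting it, the 1 to 31 clamp, and the name adaptive_qscale are assumptions for this example.

```python
def adaptive_qscale(q_ref, act, avg_act):
    """Correct the reference scale by the macro block activity.

    q_ref:   reference quantization scale from the second step.
    act:     activity of the current macro block.
    avg_act: average activity of the picture (assumed to be > 0).
    """
    # Normalized activity: 0.5 for flat blocks, approaching 2 for busy ones.
    n_act = (2.0 * act + avg_act) / (act + 2.0 * avg_act)
    return max(1, min(31, round(q_ref * n_act)))


print(adaptive_qscale(10, 100, 100))    # 10: average activity, no correction
print(adaptive_qscale(10, 0, 100))      # 5: flat macro block, finer quantization
print(adaptive_qscale(10, 10000, 100))  # 20: busy macro block, coarser quantization
```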
Such a basic technique of the MPEG is disclosed in “General Multimedia Selection: MPEG” (Ohm-sha) and “Information Compression Technology for Digital Broadcast and Internet” (Kyoritsu Shuppan Co., Ltd.) for instance. There is also a patent document (Japanese Patent Application Laid-Open No. 2004-194076) which discloses a technique of calculating the activity and applying it to rate control.
In the encoding processes of MPEG2 and the newly standardized MPEG4-AVC (also referred to as H.264), it is prescribed that quantization is performed on a macro block basis and the DCT is performed on a DCT block basis. Only one quantization scale code is decided for one macro block. Therefore, in the case of MPEG, the six DCT blocks included in one macro block (four luminance components and two color difference components) are all quantized with the same quantization scale code. Consequently, in the case where DCT blocks including edges and DCT blocks including no edge are mixed among the DCT blocks constituting one macro block, they are quantized with the same quantization scale code even though their power distributions differ. This is undesirable from the viewpoint of the visual characteristic.
For that reason, the quantization scale code of the DCT blocks including edges should be set small. However, if the edge information is calculated on a macro block basis and the area containing the edges is small, the detected edge information is weak relative to the size of the macro block. Consequently, it may be determined that no edge exists in the macro block. It is also conceivable that, when detection is performed per macro block, a noise component is falsely detected as an edge. In that case, the falsely detected noise component usually has no large signal difference, so the edge intensity is determined to be low and the quantization scale code is set to a high value, which degrades the DCT blocks including the edges. The degradation of those DCT blocks consequently appears as degradation of the macro block and lowers image quality.
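The dilution of edge information over a macro block can be illustrated with a small numerical sketch. The edge measure (mean absolute horizontal pixel difference), the 16x16 macro block with four 8x8 luminance blocks, the pixel values and the threshold of 15 are all assumptions chosen for illustration; the text does not define a particular edge detector.

```python
def edge_strength(pixels):
    """Mean absolute difference between horizontally adjacent pixels."""
    total, count = 0, 0
    for row in pixels:
        for a, b in zip(row, row[1:]):
            total += abs(b - a)
            count += 1
    return total / count


def split_into_blocks(mb, size=8):
    """Split a macro block into size x size blocks, in raster order."""
    return [[row[bx:bx + size] for row in mb[by:by + size]]
            for by in range(0, len(mb), size)
            for bx in range(0, len(mb[0]), size)]


# Flat 16x16 macro block with a bright stripe inside the top-left block only.
mb = [[50] * 16 for _ in range(16)]
for y in range(8):
    for x in range(4, 8):
        mb[y][x] = 200

THRESHOLD = 15  # hypothetical edge decision threshold
per_mb = edge_strength(mb)
per_block = [edge_strength(b) for b in split_into_blocks(mb)]
print(per_mb > THRESHOLD)                   # False: edge missed per macro block
print([s > THRESHOLD for s in per_block])   # [True, False, False, False]
```

With these values the macro block measure is 10, below the threshold, while the top-left block alone measures about 21.4, so the edge is found only when the evaluation is done per DCT block.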