(1) Field of the Invention
The invention relates to an image coding/recording apparatus which codes and records image data, and more particularly to an image coding/recording apparatus which codes image data using the H.264 image coding method and records the coded data onto a recording medium.
(2) Description of the Related Art
With the development of digital imaging technologies, data compression technologies for digital video data have been developed and put to use in order to cope with growing data amounts. One example is the development of data compression techniques specialized for image data, which take advantage of the characteristics of such data. Moreover, recent improvements in the data processing ability of computers allow complicated computations in compression technologies, so that compression rates of video data have been significantly enhanced. For example, the MPEG-2 standard is one such compression technology and is employed in satellite and terrestrial digital high definition (HD) broadcasting.
An image compression technology developed after the MPEG-2 standard is the H.264 standard, which realizes a compression rate two times higher than that of the MPEG-2 standard. The H.264 image coding method is similar to the conventional MPEG coding method in that both are hybrid image coding methods using orthogonal transformation and motion compensation. However, the H.264 standard offers higher flexibility in the coding tools that make up the coding process, and the cumulative effects of these coding tools enable high coding efficiency to be realized.
FIG. 1 is a functional block diagram showing an apparatus by which the H.264 image coding is realized (hereinafter, referred to also as an “H.264 image coding apparatus”). As shown in FIG. 1, the H.264 image coding apparatus includes an analog/digital (A/D) converter 11, a picture sorting buffer 12, a macroblock dividing unit 13, a subtraction operation unit 14, an orthogonal transformation unit 15, a quantization unit 16, an entropy coding unit 17, an accumulation buffer 18, an inverse-quantization unit 19, an inverse orthogonal transformation unit 20, an addition operation unit 21, a picture memory 22, an intra-picture (intra-frame) prediction unit 23, an inter-picture (inter-frame) prediction unit 24, a prediction selection unit 25, and a rate control unit 26.
Video signals inputted into the H.264 image coding apparatus are converted by the A/D converter 11, from analog signals to digital video signals. The digital video signals include a brightness signal Y and color-difference signals Cb and Cr. Pictures (frames) of the video signals are sorted by the picture sorting buffer 12, from an order of inputted (displaying) pictures to an order of coding pictures, as shown in FIG. 2. The order of coding pictures is determined depending on a structure of a Group of Pictures (GOP) which includes I, P, and B pictures to be coded.
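As an illustration only, the sorting from the order of displaying pictures to the order of coding pictures can be sketched as follows. The sketch assumes a simple GOP in which every B picture refers to the nearest preceding and the nearest following I or P picture, so that the following I or P picture has to be coded before those B pictures. The function name coding_order and the example GOP are hypothetical and are not taken from FIG. 2.

```python
def coding_order(display_types):
    """Return display indices rearranged into an assumed coding order.

    display_types: sequence of 'I', 'P' or 'B' in display order.
    Assumption: each B picture references the nearest I/P picture before
    and after it, so that later I/P picture is coded first.
    """
    order = []
    pending_b = []  # B pictures waiting for their following reference
    for index, picture_type in enumerate(display_types):
        if picture_type == "B":
            pending_b.append(index)
        else:
            order.append(index)      # code the I/P picture first
            order.extend(pending_b)  # then the B pictures that precede it
            pending_b = []
    order.extend(pending_b)  # trailing B pictures without a later reference
    return order


if __name__ == "__main__":
    print(coding_order("IBBPBBP"))
```

For the display order I B B P B B P, this sketch yields the coding order I P B B P B B. The actual order depends on the GOP structure in use.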
An I picture, which is one of the types of pictures to be coded, is a picture coded by intra-picture (intra-frame) prediction coding, using only the I picture itself without any reference pictures. A P picture is a picture coded by inter-picture (inter-frame) prediction coding, referring to a single coded picture. A B picture is a picture coded by inter-picture prediction coding, referring to two or more coded pictures at the same time.
Each picture in a series of moving pictures is one unit to be coded, and such a picture is equivalent to a frame or a field. In the case of 4:2:0 format, one picture includes one brightness signal (Y signal 31) and two color-difference signals (Cb signal 32 and Cr signal 33) as shown in FIG. 3. An image size of a color-difference signal is vertically and horizontally ½ of a brightness signal.
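The relationship between the signal sizes in the 4:2:0 format can be checked with a short calculation. The following is a minimal sketch; the function name plane_sizes_420 and the 1920×1080 example are assumptions chosen for illustration.

```python
def plane_sizes_420(width, height):
    """Return (width, height) of the Y, Cb and Cr signals for 4:2:0.

    Each color-difference signal is half the size of the brightness
    signal both horizontally and vertically.
    """
    chroma = (width // 2, height // 2)
    return {"Y": (width, height), "Cb": chroma, "Cr": chroma}


def samples_per_picture(width, height):
    """Total number of samples in one 4:2:0 picture."""
    sizes = plane_sizes_420(width, height)
    return sum(w * h for w, h in sizes.values())


if __name__ == "__main__":
    print(plane_sizes_420(1920, 1080))
    print(samples_per_picture(1920, 1080))
```

Because each color-difference signal has one quarter of the samples of the brightness signal, one 4:2:0 picture holds 1.5 times as many samples as the brightness signal alone.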
Moreover, each picture in moving pictures is divided into blocks called macroblocks, and then coded on a macroblock-by-macroblock basis. As shown in FIG. 4, one macroblock includes: a Y signal block 41 of 16×16 pixels; and a Cb signal block 42 and a Cr signal block 43, each of which is an 8×8-pixel block spatially corresponding to the Y signal block 41 (see ITU-T Recommendation H.264, for example).
Each of the inputted pictures is divided by the macroblock dividing unit 13 into input macroblocks. The macroblocks are inputted to the subtraction operation unit 14. The subtraction operation unit 14 performs difference processing to detect difference between: each of pixels in the input macroblock; and a spatially corresponding pixel in a prediction macroblock which is generated by the intra-picture prediction unit 23 or the inter-picture prediction unit 24. Thereby, the subtraction operation unit 14 outputs a difference macroblock.
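The macroblock division and the difference processing described above can be sketched as follows for the brightness signal only. This is a simplified illustration: the function names are hypothetical, the picture is represented as a list of pixel rows, and the picture dimensions are assumed to be multiples of 16.

```python
MB_SIZE = 16  # brightness (Y) macroblock size in pixels


def split_into_macroblocks(plane):
    """Divide a Y plane (list of rows) into 16x16 blocks in raster order.

    Assumption: width and height are multiples of 16.
    """
    height, width = len(plane), len(plane[0])
    blocks = []
    for y in range(0, height, MB_SIZE):
        for x in range(0, width, MB_SIZE):
            blocks.append([row[x:x + MB_SIZE] for row in plane[y:y + MB_SIZE]])
    return blocks


def difference_macroblock(input_mb, prediction_mb):
    """Subtract each prediction pixel from the spatially corresponding input pixel."""
    return [
        [a - b for a, b in zip(input_row, prediction_row)]
        for input_row, prediction_row in zip(input_mb, prediction_mb)
    ]


if __name__ == "__main__":
    plane = [[x for x in range(32)] for _ in range(16)]  # 32x16 test picture
    macroblocks = split_into_macroblocks(plane)
    flat_prediction = [[16] * MB_SIZE for _ in range(MB_SIZE)]
    print(difference_macroblock(macroblocks[1], flat_prediction)[0])
```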
The difference macroblock is provided to the orthogonal transformation unit 15, where frequency conversion is applied and the difference macroblock is converted into a plurality of orthogonal transformation blocks. The size of one orthogonal transformation block is 8×8 pixels in the conventional MPEG method, whereas the basic size in the H.264 standard is 4×4 pixels.
The orthogonal transformation unit 15 divides, as shown in FIG. 5, the difference macroblock into 24 blocks (51-0 to 51-15, 52-0 to 52-3, and 53-0 to 53-3) each having 4×4 pixels. The orthogonal transformation unit 15 performs orthogonal transformation for each of the blocks.
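The division into 24 blocks and the transformation of each block can be sketched as follows. The matrix CF is the forward 4×4 integer core transform matrix commonly given for the H.264 standard. The raster ordering of the blocks, the function names, and the omission of the additional transformation applied to DC coefficients and of the scaling merged into quantization are simplifying assumptions of this sketch.

```python
# Forward 4x4 integer core transform matrix commonly given for H.264.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def split_4x4(block):
    """Divide a square block (16x16 or 8x8) into 4x4 blocks in raster order."""
    size = len(block)
    return [
        [row[x:x + 4] for row in block[y:y + 4]]
        for y in range(0, size, 4)
        for x in range(0, size, 4)
    ]


def matmul(a, b):
    """Multiply two 4x4 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(column) for column in zip(*m)]


def core_transform(block4):
    """Compute CF * X * CF^T for one 4x4 block X."""
    return matmul(matmul(CF, block4), transpose(CF))


def transform_macroblock(y_block, cb_block, cr_block):
    """Transform the 16 Y, 4 Cb and 4 Cr blocks of one difference macroblock."""
    blocks = split_4x4(y_block) + split_4x4(cb_block) + split_4x4(cr_block)
    return [core_transform(b) for b in blocks]


if __name__ == "__main__":
    y = [[1] * 16 for _ in range(16)]
    c = [[0] * 8 for _ in range(8)]
    coefficients = transform_macroblock(y, c, c)
    print(len(coefficients), coefficients[0])
```

For a flat 4×4 block, all of the energy ends up in the single DC coefficient at the top-left position, which is what makes the subsequent quantization and entropy coding effective.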
The quantization unit 16 quantizes orthogonal transformation coefficients in each orthogonal transformation block, according to quantization parameters obtained from the rate control unit 26. The quantized orthogonal transformation coefficients are provided to the entropy coding unit 17 which codes the coefficients.
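As a rough illustration of the quantization, the sketch below maps a quantization parameter to a step size and divides each coefficient by it. The step size table holds the nominal values commonly cited for the H.264 standard, in which the step size doubles for every increase of 6 in the quantization parameter. An actual encoder uses integer multiplications and shifts instead of the floating-point division shown here, and the function names are hypothetical.

```python
# Nominal step sizes commonly cited for quantization parameters 0 to 5;
# the step size doubles every time the parameter increases by 6.
QSTEP_BASE = [0.625, 0.6875, 0.8125, 0.875, 1.0, 1.125]


def qstep(qp):
    """Nominal quantization step size for a quantization parameter 0..51."""
    if not 0 <= qp <= 51:
        raise ValueError("qp must be in the range 0..51")
    return QSTEP_BASE[qp % 6] * (2 ** (qp // 6))


def quantize(coefficients, qp):
    """Quantize a block of coefficients with rounding to the nearest level."""
    step = qstep(qp)
    result = []
    for row in coefficients:
        quantized_row = []
        for c in row:
            level = int(abs(c) / step + 0.5)  # round half away from zero
            quantized_row.append(level if c >= 0 else -level)
        result.append(quantized_row)
    return result


if __name__ == "__main__":
    print(qstep(28))
    print(quantize([[16, -7], [3, 0]], 28))
```

A larger quantization parameter gives a larger step size, so more coefficients become zero and the coding amount decreases at the cost of image quality. This is the lever that the rate control unit 26 operates.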
The entropy coding unit 17 codes the quantized orthogonal transformation coefficients and prediction information which is selected by the prediction selection unit 25 as described further below, and provides the coded data to the accumulation buffer 18. The accumulation buffer 18 outputs the accumulated coded data as a stream.
The quantized orthogonal transformation coefficients are provided also to the inverse-quantization unit 19, as well as the entropy coding unit 17. According to the quantization parameters obtained from the rate control unit 26, the inverse-quantization unit 19 performs inverse quantization for the quantized orthogonal transformation coefficients. Thereby, the orthogonal transformation block is reconstructed from the quantized orthogonal transformation block. The inverse orthogonal transformation unit 20 re-constructs the difference macroblock from the reconstructed orthogonal transformation block. The reconstructed difference macroblock as well as the prediction macroblock are provided to the addition operation unit 21.
The addition operation unit 21 performs addition processing for pixels of the reconstructed difference macroblock and the prediction macroblock, thereby generating a reproduction macroblock. This reproduction macroblock is accumulated into the picture memory 22 to be further utilized for prediction processing.
The above-explained series of processing performed by the inverse-quantization unit 19, the inverse orthogonal transformation unit 20, and the addition operation unit 21 is called local decoding. The local decoding must generate a reproduction macroblock identical to the data that a decoding apparatus will obtain by decoding the macroblock.
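The point of the local decoding is that the encoder builds its reference data from the quantized values, exactly as a decoder would, and not from the original pixels. The following simplified sketch illustrates this. The inverse orthogonal transformation is omitted (treated as an identity operation), the step size is an arbitrary example value, and the function names are hypothetical.

```python
def dequantize(levels, step):
    """Inverse quantization: scale each quantized level by the step size."""
    return [[level * step for level in row] for row in levels]


def local_decode(levels, step, prediction_block):
    """Rebuild a reproduction block from quantized levels and a prediction.

    Simplification: the inverse orthogonal transformation is omitted, so the
    dequantized values are used directly as the reconstructed difference.
    """
    difference = dequantize(levels, step)
    return [
        [min(255, max(0, p + d)) for p, d in zip(pred_row, diff_row)]  # clip to 8 bits
        for pred_row, diff_row in zip(prediction_block, difference)
    ]


if __name__ == "__main__":
    prediction = [[100, 250], [30, 4]]
    levels = [[1, 2], [0, -1]]
    print(local_decode(levels, 8, prediction))
```

Because a decoding apparatus receives only the quantized levels and the prediction information, running the same reconstruction in the encoder keeps the reference pictures on both sides identical.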
There are two different methods for generating a prediction macroblock: intra-picture prediction and inter-picture prediction. The intra-picture prediction is a method for predicting pixels in the macroblock, using coded pixels in the picture that contains the macroblock. In the H.264 standard, this prediction is applied in two different units: a 4×4 block and a 16×16 block.
On the other hand, the inter-picture (inter-frame) prediction is a method for predicting pixels in the macroblock, using pixels in a different coded picture. Pictures to which this inter-picture prediction is applied are P pictures and B pictures. Note that the pixels in the coded picture are read out from the picture memory 22. Note also that a target macroblock which is currently to be coded is a macroblock outputted from the macroblock dividing unit 13. More specifically, this inter-picture prediction includes motion estimation and motion compensation. In the motion estimation, a motion vector is calculated by detecting, in the coded picture (reference picture), a portion similar to a portion of the target macroblock. In the motion compensation, a prediction block is generated using the calculated motion vector and the reference picture. In the motion compensation of the H.264 standard, there are various block sizes for motion vector calculation, so that it is possible to select a block size which results in minimum difference from the coded reference picture.
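The motion estimation and the motion compensation can be sketched as a full search over a small window. This is one common approach, not a method prescribed by the H.264 standard. The sum of absolute differences (SAD) as the similarity measure, the integer-pixel search, the search range, and all function names are assumptions of this sketch.

```python
def sad(current, reference, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and a displaced block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(current[by + y][bx + x]
                         - reference[by + y + dy][bx + x + dx])
    return total


def motion_estimate(current, reference, bx, by, size, search_range):
    """Full search: return (cost, dx, dy) of the best matching displacement."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip displacements that point outside the reference picture.
            if by + dy < 0 or bx + dx < 0:
                continue
            if by + dy + size > height or bx + dx + size > width:
                continue
            cost = sad(current, reference, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def motion_compensate(reference, bx, by, dx, dy, size):
    """Generate the prediction block addressed by the motion vector."""
    return [row[bx + dx:bx + dx + size]
            for row in reference[by + dy:by + dy + size]]


if __name__ == "__main__":
    ref = [[(y * 31 + x * 17) % 251 for x in range(12)] for y in range(12)]
    # Current picture: reference content displaced by (dx, dy) = (2, 1).
    cur = [[ref[y + 1][x + 2] if y + 1 < 12 and x + 2 < 12 else 0
            for x in range(12)] for y in range(12)]
    print(motion_estimate(cur, ref, 4, 4, 4, 2))
```

Repeating the search for several block sizes and keeping the size with the smallest cost would correspond to the block size selection mentioned above.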
The prediction selection unit 25 compares a macroblock of an original image with the prediction images (prediction macroblocks) predicted by the intra-picture prediction unit 23 and the inter-picture prediction unit 24, and then selects the prediction macroblock having the minimum difference from the macroblock of the original image.
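The selection can be sketched as choosing the candidate with the smallest difference measure. The text above does not specify the measure, so the sum of absolute differences used here, together with the function names and the candidate labels, is an assumption.

```python
def block_difference(original, prediction):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for original_row, prediction_row in zip(original, prediction)
        for a, b in zip(original_row, prediction_row)
    )


def select_prediction(original, candidates):
    """Return (label, cost) of the candidate closest to the original block.

    candidates: dict mapping a label such as 'intra' or 'inter'
    to a prediction block.
    """
    costs = {label: block_difference(original, block)
             for label, block in candidates.items()}
    best_label = min(costs, key=costs.get)
    return best_label, costs[best_label]


if __name__ == "__main__":
    original = [[10, 10], [10, 10]]
    candidates = {
        "intra": [[9, 9], [9, 9]],      # difference 4
        "inter": [[10, 10], [10, 12]],  # difference 2
    }
    print(select_prediction(original, candidates))
```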
The selected prediction macroblock is provided to the subtraction operation unit 14 and the addition operation unit 21. Furthermore, prediction information such as the selected prediction method (a prediction macroblock, a motion vector, and a reference picture number which are selected by intra-picture prediction or inter-picture prediction) is provided to the entropy coding unit 17.
Meanwhile, in the H.264 standard, Context-based Adaptive Binary Arithmetic Coding (CABAC) is adopted as entropy coding in order to achieve a higher compression rate than conventional variable length coding. The CABAC processing mainly includes: conversion of multi-value data to binary data (binarization); and arithmetic coding of the binary data based on a calculated context.
The arithmetic coding extracts a context for each code to be compressed. The context is switched depending on the current target image and its surrounding circumstances. The occurrence probabilities of the binarized symbols 0 and 1 differ for each extracted context, and an occurrence probability table is updated depending on the arithmetic-coded values. Because the coding of each symbol thus depends on the result of coding the preceding symbols, pipeline processing and speculative processing become difficult, and it is necessary to increase the speed (clock) of the CABAC processing itself in order to speed up the processing.
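The conversion of multi-value data to binary data can be illustrated with unary and truncated unary binarization, two of the simpler schemes used in CABAC. The sketch below is illustrative only: the function names are hypothetical, and which scheme applies to which syntax element is not covered here.

```python
def unary_binarize(value):
    """Unary binarization: 'value' ones followed by a terminating zero."""
    if value < 0:
        raise ValueError("value must be non-negative")
    return [1] * value + [0]


def truncated_unary_binarize(value, max_value):
    """Truncated unary: as unary, but the zero is dropped at max_value."""
    if not 0 <= value <= max_value:
        raise ValueError("value must be in the range 0..max_value")
    if value == max_value:
        return [1] * value
    return unary_binarize(value)


if __name__ == "__main__":
    print(unary_binarize(3))
    print(truncated_unary_binarize(3, 3))
```

Each resulting binary digit is then passed to the arithmetic coding together with its context.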
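The dependency between consecutive symbols can be illustrated with a toy adaptive model. The sketch below is not the probability state machine of the H.264 standard: it uses a simple count-based estimate per context and reports the ideal code length of −log2(p) bits per symbol. The class and function names are hypothetical.

```python
import math


class ContextModel:
    """Toy adaptive probability estimate for one context (count based)."""

    def __init__(self):
        self.counts = [1, 1]  # occurrences of symbol 0 and symbol 1

    def probability(self, bit):
        return self.counts[bit] / sum(self.counts)

    def update(self, bit):
        self.counts[bit] += 1


def estimated_bits(bins):
    """Ideal coding amount in bits for a list of (context_id, bit) pairs.

    Each symbol is charged -log2(p) with the probability held by its context
    at that moment, and the context is updated before the next symbol.
    """
    contexts = {}
    total = 0.0
    for context_id, bit in bins:
        model = contexts.setdefault(context_id, ContextModel())
        total += -math.log2(model.probability(bit))
        model.update(bit)  # the next symbol depends on this update
    return total


if __name__ == "__main__":
    print(estimated_bits([(0, 1)] * 4))
    print(estimated_bits([(0, 0), (0, 1), (0, 0), (0, 1)]))
```

Since the cost of each symbol depends on the updates made by all earlier symbols in the same context, the symbols cannot be processed independently, and the total coding amount is only known once every symbol has been processed.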
Here, in the CABAC of the H.264 standard, a maximum coding amount per macroblock (hereinafter, referred to as a “maximum MB coding amount”) is limited to 3200 bits in the case of the 4:2:0 format and a bit depth of 8 bits. Therefore, when a coding amount of a macroblock exceeds 3200 bits, it is necessary to change the conditions for coding the macroblock and re-code the macroblock, so that the coding amount becomes equal to or less than 3200 bits. However, a coding amount in the CABAC changes depending on context changes and on updates of the symbol probability table, so that the exact coding amount is not known until actual coding is completed. Therefore, whether or not the macroblock is to be re-coded is determined by checking a coding amount of the macroblock after the CABAC processing.
For the above reasons, a method has been proposed to monitor input and output data of the arithmetic coding unit and, if a coding amount of a macroblock is about to exceed 3200 bits, switch the data to Intra Pulse Code Modulation (I_PCM) data, which is non-compressed digital image data, as disclosed in Japanese Unexamined Patent Application Publication No. 2004-135251, for example. In this method, by estimating a generated coding amount prior to completion of arithmetic coding, it is possible to prevent occurrence of a macroblock exceeding 3200 bits, so that re-coding of such a macroblock can be prevented.
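The figure of 3200 bits can be reproduced by a short calculation, on the assumption that the limit is the raw (non-compressed) size of one macroblock plus 128 bits. This decomposition is an interpretation added for illustration; the text above states only the resulting value. The function names are hypothetical.

```python
def raw_mb_bits(bit_depth_luma=8, bit_depth_chroma=8,
                chroma_width=8, chroma_height=8):
    """Raw size in bits of one macroblock (defaults: 4:2:0, 8-bit samples)."""
    luma_bits = 16 * 16 * bit_depth_luma
    chroma_bits = 2 * chroma_width * chroma_height * bit_depth_chroma
    return luma_bits + chroma_bits


def max_mb_bits(**kwargs):
    """Assumed maximum MB coding amount: raw macroblock size plus 128 bits."""
    return raw_mb_bits(**kwargs) + 128


def needs_recoding(coded_bits, limit):
    """True when the coding amount measured after CABAC exceeds the limit."""
    return coded_bits > limit


if __name__ == "__main__":
    limit = max_mb_bits()
    print(raw_mb_bits(), limit)
    print(needs_recoding(3300, limit), needs_recoding(3200, limit))
```

With 256 brightness samples and 2 × 64 color-difference samples of 8 bits each, the raw size is 3072 bits, and adding 128 bits gives 3200 bits.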
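The general idea of switching to I_PCM data before the limit is exceeded can be sketched as follows. This is not the method of the cited publication, whose details are not given here. The estimate of the remaining coding amount, the safety margin, and all names in the sketch are hypothetical.

```python
MAX_MB_BITS = 3200        # maximum MB coding amount (4:2:0, 8 bits)
I_PCM_SAMPLE_BITS = 3072  # 384 non-compressed samples of 8 bits each


def choose_macroblock_mode(bits_so_far, estimated_remaining_bits, margin=0):
    """Decide whether to keep CABAC-coded data or switch to I_PCM data.

    bits_so_far: coding amount already produced for the macroblock.
    estimated_remaining_bits: hypothetical estimate for the uncoded part.
    margin: hypothetical safety margin in bits.
    """
    projected = bits_so_far + estimated_remaining_bits + margin
    if projected > MAX_MB_BITS:
        return "I_PCM"
    return "CABAC"


if __name__ == "__main__":
    print(choose_macroblock_mode(1200, 900))
    print(choose_macroblock_mode(2900, 400))
    print(choose_macroblock_mode(3000, 150, margin=100))
```

Because the non-compressed samples of one macroblock occupy 3072 bits, the I_PCM sample data always fits within the limit, which is why it serves as a fallback.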