The amount of data transmitted in the form of moving pictures is increasing day by day. Consider, for example, the amount of data in an analog television signal. When Japanese standard television broadcasting is digitized, the picture has 720 pixels in the horizontal direction and 480 pixels in the vertical direction. Each pixel has a luminance component of 8 bits and two chrominance components of 8 bits each. A moving picture has 30 frames per second. Since the chrominance components are subsampled so that together they carry 1/2 as much data as the luminance component (each chrominance component carries 1/4), the amount of data for one second is 720×480×(8+8×1/4+8×1/4)×30=124,416,000 bits, and a transmission rate of about 120 Mbps is required.
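The arithmetic above can be checked with a short sketch. It assumes that the two chrominance components together carry half as much data as the luminance component, i.e. 12 bits per pixel on average, which is the reading that reproduces the stated total of 124,416,000 bits. The function and constant names are illustrative only.

```python
# Raw (uncompressed) bit rate of digitized standard-definition video.
# Assumption: 8 luminance bits per pixel plus 4 chrominance bits per pixel
# on average (both chrominance components together are half of luminance).
WIDTH, HEIGHT = 720, 480
FRAMES_PER_SECOND = 30
LUMA_BITS = 8
CHROMA_BITS_TOTAL = LUMA_BITS // 2  # both chrominance components combined


def raw_bits_per_second(width, height, fps, bits_per_pixel):
    """Return the number of bits needed for one second of raw video."""
    return width * height * bits_per_pixel * fps


bits = raw_bits_per_second(
    WIDTH, HEIGHT, FRAMES_PER_SECOND, LUMA_BITS + CHROMA_BITS_TOTAL
)
print(bits)  # 124416000
```

The result is slightly above 120 Mbps, which already exceeds a 100 Mbps line before any protocol overhead is counted.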
Further, an optical fiber line currently supplied as home broadband has a transmission rate of about 100 Mbps, and thus such a picture cannot be transmitted without compression. The amount of data of terrestrial digital television broadcasting, which is to replace analog broadcasting in 2011, is known to be 1.5 Gbps. Accordingly, a highly efficient compression technology may be regarded as one of the technologies required in the future. Currently, H.264/AVC (hereinafter referred to as H.264) has been proposed as the standard for such highly efficient compression. H.264 is the latest international standard for moving picture coding, developed by the Joint Video Team (JVT), which was jointly established in December 2001 by the Video Coding Experts Group (VCEG) of the International Telecommunication Union Telecommunication Standardization Sector (ITU-T) and the Moving Picture Experts Group (MPEG) of the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC).
The ITU-T recommendation was approved in May 2003. In addition, it was standardized by ISO/IEC Joint Technical Committee (JTC) 1 as MPEG-4 Part 10 Advanced Video Coding (AVC) in 2003.
H.264 is characterized in that it realizes the same picture quality with a coding efficiency about twice as high as that of the conventional MPEG-2 and MPEG-4, that it adopts inter-frame prediction, quantization, and entropy coding as its compression algorithm, and that it can be widely used not only at low bit rates, such as for mobile telephones, but also at high bit rates, such as for high-definition (Hi-Vision) television.
In addition, the ITU-T recommendation can be downloaded from the URL stated in the following Non-Patent Document 1.
[Non-Patent Document 1] “ITU-T Recommendation H.264 Advanced video coding for generic audiovisual services”, [online], November 2007, TELECOMMUNICATION STANDARDIZATION SECTOR OF ITU [searched on Dec. 12, 2008], the Internet <URL: http://www.itu.int/rec/T-REC-H.264-200711-I/en>
In order to describe the problems to be solved by the present invention, a prediction method of H.264 will be briefly described with reference to FIGS. 1 to 3B.
In H.264, intra-prediction 104 for generating an intra-prediction image predicted by using correlations within a picture and inter-prediction 105 for generating an inter-prediction image predicted by using correlations between pictures are performed. A difference between the generated prediction image and an input picture 101 is obtained, and orthogonal transform, e.g., discrete cosine transform (DCT), 102 and quantization (Q) 103 are performed on the differential data. Then, coding 110 is performed on the quantized data. In H.264, only the differential data is coded and transmitted, thereby realizing high coding efficiency.
Here, the reference numeral 107 indicates a deblocking filter standardized in H.264, and the reference numeral 108 indicates inverse orthogonal transform, e.g., inverse discrete cosine transform (IDCT), which performs the inverse of the processing of the orthogonal transform 102. Further, the reference numeral 109 indicates inverse quantization (IQ), which performs the inverse of the processing of the quantization 103. The filter 107, the inverse orthogonal transform 108 and the inverse quantization 109 perform the processing for obtaining reconstructed pictures in the encoder. The reconstructed pictures for a plurality of previous frames are stored in a frame memory 106 and are supplied to the inter-prediction 105.
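The differential coding loop described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions and not the processing defined in H.264: the orthogonal transform and the deblocking filter are omitted, quantization is a plain uniform quantizer, and the names `encode_block`, `reconstruct_block` and `qstep` are hypothetical.

```python
# Simplified differential coding of one block (no transform, no filter).
# Assumption: a uniform quantizer with step size `qstep` stands in for the
# transform and quantization stages; all names here are illustrative only.

def encode_block(block, prediction, qstep):
    """Quantize the difference between the input block and its prediction."""
    return [
        [round((x - p) / qstep) for x, p in zip(row_b, row_p)]
        for row_b, row_p in zip(block, prediction)
    ]


def reconstruct_block(levels, prediction, qstep):
    """Inverse-quantize the levels and add the prediction back (clipped to 8 bits)."""
    return [
        [max(0, min(255, p + lv * qstep)) for lv, p in zip(row_l, row_p)]
        for row_l, row_p in zip(levels, prediction)
    ]


block = [[53, 55], [61, 59]]
prediction = [[50, 50], [60, 60]]
levels = encode_block(block, prediction, qstep=4)
reconstructed = reconstruct_block(levels, prediction, qstep=4)
print(levels)         # [[1, 1], [0, 0]]
print(reconstructed)  # [[54, 54], [60, 60]]
```

Only the small quantized levels are passed on to the coding stage. The encoder reconstructs the block in the same way a decoder would, so that later predictions are formed from pictures the decoder can also obtain.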
The intra-prediction generates the prediction picture by using correlations between a pixel to be predicted and its adjacent pixels, wherein the pixels in the left column and the upper row of the block to be predicted are used. For example, FIG. 2 illustrates the reference pixels used for generating a prediction picture in 4×4 intra-prediction.
In H.264/AVC, it is possible to generate prediction pictures on the basis of blocks of 4×4 pixels (hereinafter referred to as 4×4 blocks), 8×8 pixels (hereinafter referred to as 8×8 blocks), or 16×16 pixels (hereinafter referred to as 16×16 blocks). A total of 22 modes are available (9 modes for 4×4 blocks, 9 modes for 8×8 blocks and 4 modes for 16×16 blocks).
The intra-prediction modes of H.264/AVC in the respective blocks are illustrated in the following Table 1.
TABLE 1  Intra-prediction Modes

Intra 4×4 / Intra 8×8        Intra 16×16
0  Vertical                  0  Vertical
1  Horizontal                1  Horizontal
2  DC                        2  DC
3  Diagonal Down Left        3  Plane
4  Diagonal Down Right
5  Vertical Right
6  Horizontal Down
7  Vertical Left
8  Horizontal Up
In the modes 0 and 1, prediction is performed by extending the adjacent pixels in the vertical and horizontal directions, respectively. It is possible to obtain high prediction efficiency for blocks including vertical edges and horizontal edges. In the mode 2, an average value of the adjacent pixels is used. In the modes 3 to 8, a weighted average is obtained from every 2 to 3 adjacent pixels and is used as a prediction value. Letting the vertically downward direction be 0 degrees, it is possible to obtain a high prediction effect for images including edges of 45 degrees to the left, 45 degrees to the right, 22.5 degrees to the right, 67.5 degrees to the right, 22.5 degrees to the left, and 112.5 degrees to the right, respectively. In H.264, it is possible to realize highly efficient coding by selecting a mode suited to the image from among the intra-prediction modes. In general, rough intra-prediction is performed to select an optimal intra-prediction mode.
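As an illustration of the modes 0 to 2 for a 4×4 block, the following sketch generates the three prediction pictures and selects the one closest to the input block. The directional modes 3 to 8 are omitted. Selecting the mode by the sum of absolute differences (SAD) is an assumption made for this example, since the text does not state the selection criterion, and the function names are hypothetical. The DC rounding shown applies when both the upper and left pixels are available.

```python
# 4x4 intra-prediction, modes 0 (Vertical), 1 (Horizontal) and 2 (DC) only.
# `upper` holds the 4 pixels above the block, `left` the 4 pixels to its left.

def predict_4x4(mode, upper, left):
    """Return a 4x4 prediction picture for the given mode."""
    if mode == 0:  # Vertical: each column repeats the pixel above it
        return [list(upper) for _ in range(4)]
    if mode == 1:  # Horizontal: each row repeats the pixel to its left
        return [[left[y]] * 4 for y in range(4)]
    if mode == 2:  # DC: rounded average of the 8 adjacent pixels
        dc = (sum(upper) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("only modes 0 to 2 are sketched here")


def sad(block, prediction):
    """Sum of absolute differences (assumed cost measure for this example)."""
    return sum(
        abs(x - p)
        for row_b, row_p in zip(block, prediction)
        for x, p in zip(row_b, row_p)
    )


def best_mode(block, upper, left):
    """Pick the mode whose prediction picture has the smallest SAD."""
    return min((0, 1, 2), key=lambda m: sad(block, predict_4x4(m, upper, left)))


upper = [10, 20, 30, 40]
left = [12, 12, 12, 12]
block = [[10, 20, 30, 40] for _ in range(4)]  # vertical stripes
print(best_mode(block, upper, left))  # 0
```

For a block of vertical stripes, the vertical mode reproduces the block exactly, so the differential data to be coded is zero.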
In addition, although not described in detail herein, in the inter-prediction that is defined in H.264/AVC, a motion vector of a pixel to be predicted is calculated from previous and future pictures to thereby generate a prediction picture.
The adjacent pixels referred to in the intra-prediction are the pixels A to M illustrated in FIG. 2. However, at a picture edge or a slice boundary, or when the reference pixels are coded by the inter-prediction, usable reference pixels do not exist. Further, since reference beyond a slice boundary is prohibited, the available modes are limited. In addition, in H.264, the intra-prediction is performed in the order of the numbers illustrated in FIGS. 3A and 3B.
The reference pixels used in the respective prediction modes are illustrated in the following Table 2.
TABLE 2  Prediction Modes and Available Reference Pixels

Intra 4×4 / Intra 8×8     Available Reference Pixels    Intra 16×16      Available Reference Pixels
0  Vertical               Upper                         0  Vertical      Upper
1  Horizontal             Left                          1  Horizontal    Left
2  DC                     Upper/Left                    2  DC            Upper/Left
3  Diagonal Down Left     Upper/Upper Right             3  Plane         Upper/Left/Upper Left
4  Diagonal Down Right    Upper/Left/Upper Left
5  Vertical Right         Upper/Left/Upper Left
6  Horizontal Down        Upper/Left/Upper Left
7  Vertical Left          Upper/Upper Right
8  Horizontal Up          Left
As can be seen from the reference pixels listed in Table 2, in the case of the 4×4 intra-prediction, since the pixels on the left and upper left do not exist at the left edge of the picture, the modes 1, 4, 5, 6 and 8 cannot be used. Further, when the upper end of the block to be predicted is a slice boundary, the modes 0, 3, 4, 5, 6 and 7 cannot be used since the reference pixels on the upper and upper right sides are outside the slice boundary. In the case of the 8×8 intra-prediction, 9 intra-prediction modes are defined in the same way as in the 4×4 intra-prediction, and the mode limitations due to pixels that cannot be referred to are the same as those in the 4×4 intra-prediction. In the case of the 16×16 intra-prediction, four modes are available; here too, reference pixels do not exist at a picture edge or a slice boundary, and reference beyond the slice boundary is prohibited.
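The mode limitations for the 4×4 intra-prediction can be derived mechanically from Table 2, as in the following sketch. It is a simplified model: the DC mode is treated as always usable, on the assumption that it falls back to whichever adjacent pixels exist, which matches the lists of unusable modes given above. The names `REQUIRED` and `usable_modes` are hypothetical.

```python
# Reference pixels required by each 4x4 intra-prediction mode (from Table 2).
# Assumption: mode 2 (DC) is modeled as requiring nothing, because it is not
# among the modes listed as unusable at a picture edge or slice boundary.
REQUIRED = {
    0: {"upper"},                        # Vertical
    1: {"left"},                         # Horizontal
    2: set(),                            # DC (falls back, see assumption)
    3: {"upper", "upper_right"},         # Diagonal Down Left
    4: {"upper", "left", "upper_left"},  # Diagonal Down Right
    5: {"upper", "left", "upper_left"},  # Vertical Right
    6: {"upper", "left", "upper_left"},  # Horizontal Down
    7: {"upper", "upper_right"},         # Vertical Left
    8: {"left"},                         # Horizontal Up
}


def usable_modes(available):
    """Return the modes whose required reference pixels are all available."""
    return [mode for mode, needed in REQUIRED.items() if needed <= available]


# Block at the left picture edge: no left or upper-left pixels.
print(usable_modes({"upper", "upper_right"}))  # [0, 2, 3, 7]

# Block whose upper end is a slice boundary: only left pixels.
print(usable_modes({"left"}))  # [1, 2, 8]
```

In both cases more than half of the nine modes are excluded, which is the limitation discussed in this section.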
Further, in cases other than the above, when the reference pixels required for generating the prediction picture of the pixel block to be predicted, i.e., the adjacent pixel blocks, are coded by the inter-prediction (when constrained_intra_pred_flag is ‘1’ in H.264), it is defined that an intra-prediction picture cannot be generated with reference to such adjacent blocks.
As described above, when coding is performed based on a conventional method, the available modes are limited, which deteriorates the accuracy of the generated prediction picture. Further, the difference value between the prediction picture and the input picture increases due to the deterioration of the accuracy of the prediction picture. As a result, in the coding 110 of FIG. 1, the amount of codes required for coding the blocks to be predicted, for which the modes are limited, increases.
Where the transmission band is limited, particularly in low bit rate transmission, an increase in the amount of generated codes affects the entire coding.