In the international video compression standard H.264/AVC and in the next-generation High Efficiency Video Coding (HEVC) standard currently under development, two types of processes, prediction and transform, are performed to efficiently compress the enormous amount of information in video. Prediction is classified into two types: inter-prediction encoding (also referred to as inter-frame prediction encoding or inter-screen encoding) and intra-prediction encoding (also referred to as intra-frame prediction encoding or in-screen encoding). Inter-frame prediction encoding is a scheme that achieves information compression using correlation in the time domain within a video. Inter-frame prediction using motion compensation is a representative example thereof.
On the other hand, intra-frame prediction encoding is a scheme of generating a prediction signal using decoded pixels around an encoding target block, as adopted in, for example, H.264/AVC; that is, a scheme of achieving information compression using correlation within a frame. For the transform, a scheme known as the discrete cosine transform (DCT) is adopted in JPEG, the international standard compression scheme for still images, and in MPEG-2, an international standard compression scheme for video. In JPEG2000, the standard subsequent to JPEG, a scheme known as the discrete wavelet transform (DWT) is adopted. After prediction is performed between frames or within a frame, the prediction residual signal (also referred to as a prediction error signal) is subjected to a transform and quantization and is finally converted to a binary signal (bit stream) through entropy encoding.
In H.264/AVC, a macroblock (hereinafter referred to as an MB) is set as the unit of the encoding process, and its size is 16×16. FIG. 20 is a diagram showing the prediction modes of a 4×4 block. In the example of prediction mode 0, the pixel value of M is used for prediction of pixels N0, N1, N2, and N3. Similarly, in the other modes, a prediction signal is generated along a prediction direction from decoded pixels. FIG. 21 is a diagram showing the prediction modes of a 16×16 block. As shown in FIGS. 20 and 21, there are two prediction sizes in the intra-prediction encoding of H.264/AVC. Nine prediction modes (prediction modes 0 to 8) are prepared for the 4×4 block, and four prediction modes (prediction modes 0 to 3) are prepared for the 16×16 block. In FRExt, an extended standard of H.264/AVC, an 8×8 prediction size is added, with nine prediction modes defined in the same way as for 4×4.
FIGS. 22, 23 and 24 are diagrams showing a relationship between a prediction mode and a prediction number. According to the prediction mode defined in FIGS. 22, 23 and 24, a prediction image is generated, and a residual signal thereof is encoded through a transform and quantization. In the case of decoding, similarly, the prediction signal is generated from the decoded pixel using a decoded prediction mode number, and added to the decoded prediction residual signal to obtain the decoded signal.
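The relationship between prediction, residual, and decoded signal described above can be sketched as follows. This is a minimal illustration, assuming the vertical copy behavior of prediction mode 0 for a 4×4 block; the function names, the sample values, and the 8-bit clipping range are assumptions made for the example. The transform and quantization stages are omitted, so the reconstruction here is lossless.

```python
def predict_vertical_4x4(above):
    """Fill each column of a 4x4 block with the decoded pixel directly above it."""
    return [[above[col] for col in range(4)] for _row in range(4)]


def residual_4x4(block, pred):
    """Encoder side: prediction residual = original block minus prediction."""
    return [[block[r][c] - pred[r][c] for c in range(4)] for r in range(4)]


def reconstruct_4x4(pred, resid, max_val=255):
    """Decoder side: prediction plus decoded residual, clipped to an assumed 8-bit range."""
    return [[min(max(pred[r][c] + resid[r][c], 0), max_val) for c in range(4)]
            for r in range(4)]


# Hypothetical decoded pixels above the block and a hypothetical original block.
above = [100, 110, 120, 130]
block = [
    [101, 109, 121, 131],
    [102, 110, 122, 129],
    [100, 111, 120, 130],
    [99, 112, 119, 132],
]

pred = predict_vertical_4x4(above)
resid = residual_4x4(block, pred)
decoded = reconstruct_4x4(pred, resid)
```

In a real codec the residual passes through the transform and quantization before it reaches the decoder, so `decoded` would only approximate `block`.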
On the other hand, in HEVC, the unit of the encoding process is not the MB; encoding is performed in units of blocks called coding units (CUs). The largest unit is called the largest coding unit (LCU), and a size of 64×64 is usually set in the reference software of HEVC (HEVC Test Model: HM). A 64×64 LCU is partitioned on a quadtree basis down to 8×8 CUs as the smallest units. Each CU is partitioned into blocks in which prediction is performed, called prediction units (PUs), and blocks in which the transform is performed, called transform units (TUs). The PU and the TU are defined independently within the CU. The PU and the TU are basically partitioned on a quadtree basis, similar to the CU, but there are tools that apply non-square partitioning. The tool that permits non-square partitioning of the PU is referred to as asymmetric motion partition (AMP), and the tool that permits non-square partitioning of the TU is referred to as a non-square quadtree transform (NSQT).
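The quadtree partitioning of an LCU into CUs can be sketched as a short recursion. The function name `split_cu` and the decision callback `should_split` are hypothetical; an actual encoder would choose the split based on coding cost, which is not modeled here.

```python
def split_cu(x, y, size, should_split, min_size=8):
    """Return the leaf CUs of one LCU as a list of (x, y, size) tuples.

    should_split(x, y, size) is a hypothetical decision callback; a block of
    min_size is never split further.
    """
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += split_cu(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]


# Illustrative rule: keep splitting only the block that contains the LCU origin.
leaves = split_cu(0, 0, 64, lambda x, y, size: x == 0 and y == 0)

# Splitting everywhere yields a uniform grid of 8x8 CUs.
full = split_cu(0, 0, 64, lambda x, y, size: True)
```

With the illustrative rule, the LCU ends up as three 32×32 CUs, three 16×16 CUs, and four 8×8 CUs, which together cover the 64×64 area exactly once.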
FIG. 25 is a diagram showing a definition of each processing unit in the HEVC. In H.264/AVC, encoding is performed in a 16×16 block, but in HEVC, there is a characteristic in that the encoding is performed in units of blocks having a greater size such as a 64×64 block. Encoding, prediction, and transform processes for a greater size as described above greatly contribute to improvement of coding efficiency, particularly at a high resolution.
In the intra-prediction encoding of HEVC, new prediction modes are added to improve on the performance of intra-prediction encoding in H.264/AVC. One is a prediction mode called planar prediction, and the other is a prediction mode realizing finer prediction directions (angular prediction). FIG. 26 is a diagram showing the correspondence between the intra-prediction mode number and the intra-prediction mode in a prediction unit of HEVC (common to all block sizes). FIGS. 27 and 28 are diagrams showing the correspondence between an angular prediction mode and an angle, and illustrate the difference in prediction directions: eight directions are prepared in H.264/AVC (FIG. 27), whereas 33 directions are prepared in HEVC (FIG. 28).
In the planar prediction, when a block size is nT and a reference signal is p[x][y] (x and y indicate a position of a reference pixel, and the upper left pixel position of the prediction target block corresponds to x=0 and y=0), the prediction signal predSamples[x][y] (x and y are coordinates in the prediction target block, and the upper left pixel position corresponds to x=0 and y=0) is defined as in Equation (1).
predSamples[x][y]=((nT−1−x)×p[−1][y]+(x+1)×p[nT][−1]+(nT−1−y)×p[x][−1]+(y+1)×p[−1][nT]+nT)>>(Log2(nT)+1)  (1)
The prediction signal can be generated flexibly, in particular by using the pixels located at the upper right (p[nT][−1]) and the lower left (p[−1][nT]), and this mode is characterized by being selected at a high rate in the intra-prediction encoding of HEVC.
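Equation (1) can be evaluated directly, as in the following sketch. The function name `planar_predict`, the dictionary representation of p[x][y] keyed by (x, y), and the sample values are assumptions made for the example; nT is assumed to be a power of two.

```python
def planar_predict(p, nT):
    """Evaluate Equation (1); p maps (x, y) to a decoded reference pixel.

    Returns pred indexed as pred[x][y], matching predSamples[x][y].
    """
    shift = nT.bit_length()  # equals Log2(nT) + 1 when nT is a power of two
    pred = [[0] * nT for _ in range(nT)]
    for x in range(nT):
        for y in range(nT):
            pred[x][y] = (
                (nT - 1 - x) * p[(-1, y)]      # left neighbor, weighted toward the left edge
                + (x + 1) * p[(nT, -1)]        # upper right pixel
                + (nT - 1 - y) * p[(x, -1)]    # top neighbor, weighted toward the top edge
                + (y + 1) * p[(-1, nT)]        # lower left pixel
                + nT                           # rounding offset
            ) >> shift
    return pred


nT = 4
# Hypothetical reference pixels: top row (including upper right) = 10,
# left column (including lower left) = 50.
p = {(x, -1): 10 for x in range(nT + 1)}
p.update({(-1, y): 50 for y in range(nT + 1)})
pred = planar_predict(p, nT)

# A flat reference yields a flat prediction.
flat = {(x, -1): 100 for x in range(nT + 1)}
flat.update({(-1, y): 100 for y in range(nT + 1)})
pred_flat = planar_predict(flat, nT)
```

With these sample values, the prediction blends smoothly from the darker top edge toward the brighter left and lower left pixels.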
In the angular prediction, a prediction signal predSamples[x][y] (x and y are coordinates in the prediction target block, and the upper left pixel position is x=0 and y=0) is generated as follows.
(A) A prediction mode number is equal to or more than 18
1. Arrangement of reference pixels ref[x] (x=−nT, . . . , 2×nT; nT is a block size)
ref[x]=p[−1+x][−1] (x=0, . . . , nT)
When an angle (intraPredAngle) corresponding to the prediction mode number defined in FIG. 29 is smaller than 0 and (nT×intraPredAngle)>>5 is smaller than −1:
ref[x]=p[−1][−1+((x×invAngle+128)>>8)] (x=(nT×intraPredAngle)>>5, . . . , −1)
Here, a definition of invAngle is as shown in FIG. 30.
FIG. 29 is a diagram showing a correspondence relationship between a prediction mode and an angle. Further, FIG. 30 is a diagram showing a correspondence relationship between the prediction mode and a parameter.
In the other cases:
ref[x]=p[−1+x][−1] (x=nT+1, . . . , 2×nT)
2. Generation of the prediction signal predSamples[x][y] (x, y=0, . . . , nT−1)
(a) An index (iIdx) and a multiplier parameter (iFact) are defined as follows.
iIdx=((y+1)×intraPredAngle)>>5
iFact=((y+1)×intraPredAngle)&31
(b) The following process is performed according to a value of iFact.
When iFact is not 0:
predSamples[x][y]=((32−iFact)×ref[x+iIdx+1]+iFact×ref[x+iIdx+2]+16)>>5
When iFact is 0:
predSamples[x][y]=ref[x+iIdx+1]
(c) The prediction mode number is 26 (vertical prediction): (x=0, y=0, . . . , nT−1)
predSamples[x][y]=Clip1Y(p[x][−1]+((p[−1][y]−p[−1][−1])>>1))
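Procedure (A) can be sketched as follows. The function names, the dictionary representation of p[x][y] keyed by (x, y), the 8-bit range assumed for Clip1Y, and the sample values are assumptions made for the example. The angle and invAngle values passed in the usage lines are assumed stand-ins for the entries of FIG. 29 and FIG. 30, which are not reproduced here.

```python
def clip1y(v, bit_depth=8):
    """Clip to the valid sample range; the 8-bit depth is an assumption."""
    return min(max(v, 0), (1 << bit_depth) - 1)


def build_ref_a(p, nT, angle, inv_angle):
    """Step 1 of (A): arrange the reference pixels ref[x] from the row above."""
    ref = {x: p[(-1 + x, -1)] for x in range(nT + 1)}
    first = (nT * angle) >> 5
    if angle < 0 and first < -1:
        # Extend to negative indices by projecting the left column onto the top row.
        for x in range(first, 0):
            ref[x] = p[(-1, -1 + ((x * inv_angle + 128) >> 8))]
    else:
        for x in range(nT + 1, 2 * nT + 1):
            ref[x] = p[(-1 + x, -1)]
    return ref


def predict_angular_a(p, nT, angle, inv_angle=0, mode=None):
    """Step 2 of (A): returns pred[x][y] for a prediction mode number of 18 or more."""
    ref = build_ref_a(p, nT, angle, inv_angle)
    pred = [[0] * nT for _ in range(nT)]
    for y in range(nT):
        i_idx = ((y + 1) * angle) >> 5   # whole-pixel displacement for this row
        i_fact = ((y + 1) * angle) & 31  # fractional part in 1/32 pixel units
        for x in range(nT):
            if i_fact:
                pred[x][y] = ((32 - i_fact) * ref[x + i_idx + 1]
                              + i_fact * ref[x + i_idx + 2] + 16) >> 5
            else:
                pred[x][y] = ref[x + i_idx + 1]
    if mode == 26:
        # (c): adjust the first column using the gradient of the left column.
        for y in range(nT):
            pred[0][y] = clip1y(p[(0, -1)] + ((p[(-1, y)] - p[(-1, -1)]) >> 1))
    return pred


nT = 4
# Hypothetical reference pixels: p[-1][-1] = 0, top row rises by 16, left column is 200 + y.
p = {(x, -1): 16 * (x + 1) for x in range(-1, 2 * nT)}
p.update({(-1, y): 200 + y for y in range(2 * nT)})

pred_vertical = predict_angular_a(p, nT, angle=0)              # pure vertical copy
pred_fraction = predict_angular_a(p, nT, angle=2)              # fractional interpolation
pred_diagonal = predict_angular_a(p, nT, angle=-32, inv_angle=-256)
pred_mode_26 = predict_angular_a(p, nT, angle=0, mode=26)
```

Python's `>>` and `&` on negative integers behave as arithmetic shift and two's complement masking, which is what the formulas above rely on for negative angles.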
(B) The prediction mode number is less than 18
1. Arrangement of reference pixels ref[x] (x=−nT, . . . , 2×nT)
ref[x]=p[−1][−1+x] (x=0, . . . , nT)
When intraPredAngle defined in FIG. 29 is smaller than 0 and (nT×intraPredAngle)>>5 is smaller than −1:
ref[x]=p[−1+((x×invAngle+128)>>8)][−1] (x=(nT×intraPredAngle)>>5, . . . , −1)
In the other cases:
ref[x]=p[−1][−1+x] (x=nT+1, . . . , 2×nT)
2. Generation of the prediction signal predSamples[x][y] (x, y=0, . . . , nT−1)
(a) An index (iIdx) and a multiplier parameter (iFact) are defined as follows.
iIdx=((x+1)×intraPredAngle)>>5
iFact=((x+1)×intraPredAngle)&31
(b) The following process is performed according to a value of iFact.
When iFact is not 0:
predSamples[x][y]=((32−iFact)×ref[y+iIdx+1]+iFact×ref[y+iIdx+2]+16)>>5
When iFact is 0:
predSamples[x][y]=ref[y+iIdx+1]
(c) The prediction mode number is 10 (horizontal prediction): (x=0, . . . , nT−1 and y=0)
predSamples[x][y]=Clip1Y(p[−1][y]+((p[x][−1]−p[−1][−1])>>1))
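Procedure (B) mirrors procedure (A) with the roles of x and y exchanged, and can be sketched in the same way. The function names, the dictionary representation of p[x][y] keyed by (x, y), the 8-bit range assumed for Clip1Y, and the sample values are assumptions made for the example. The angle and invAngle values in the usage lines are assumed stand-ins for the entries of FIG. 29 and FIG. 30.

```python
def clip1y(v, bit_depth=8):
    """Clip to the valid sample range; the 8-bit depth is an assumption."""
    return min(max(v, 0), (1 << bit_depth) - 1)


def predict_angular_b(p, nT, angle, inv_angle=0, mode=None):
    """Returns pred[x][y] for a prediction mode number of less than 18."""
    # Step 1: arrange the reference pixels ref[x] from the left column.
    ref = {x: p[(-1, -1 + x)] for x in range(nT + 1)}
    first = (nT * angle) >> 5
    if angle < 0 and first < -1:
        # Extend to negative indices by projecting the top row onto the left column.
        for x in range(first, 0):
            ref[x] = p[(-1 + ((x * inv_angle + 128) >> 8), -1)]
    else:
        for x in range(nT + 1, 2 * nT + 1):
            ref[x] = p[(-1, -1 + x)]

    # Step 2: generate the prediction signal column by column.
    pred = [[0] * nT for _ in range(nT)]
    for x in range(nT):
        i_idx = ((x + 1) * angle) >> 5   # whole-pixel displacement for this column
        i_fact = ((x + 1) * angle) & 31  # fractional part in 1/32 pixel units
        for y in range(nT):
            if i_fact:
                pred[x][y] = ((32 - i_fact) * ref[y + i_idx + 1]
                              + i_fact * ref[y + i_idx + 2] + 16) >> 5
            else:
                pred[x][y] = ref[y + i_idx + 1]
    if mode == 10:
        # (c): adjust the first row using the gradient of the top row.
        for x in range(nT):
            pred[x][0] = clip1y(p[(-1, 0)] + ((p[(x, -1)] - p[(-1, -1)]) >> 1))
    return pred


nT = 4
# Hypothetical reference pixels: p[-1][-1] = 0, left column rises by 20, top row is 100 + 2x.
p = {(-1, y): 20 * (y + 1) for y in range(-1, 2 * nT)}
p.update({(x, -1): 100 + 2 * x for x in range(2 * nT)})

pred_horizontal = predict_angular_b(p, nT, angle=0)            # pure horizontal copy
pred_mode_10 = predict_angular_b(p, nT, angle=0, mode=10)
pred_negative = predict_angular_b(p, nT, angle=-26, inv_angle=-315)
```

In the pure horizontal case every row repeats its left neighbor, and the mode 10 adjustment changes only the first row.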
Since such a process enables generation of a fine prediction signal in 33 directions and generation of a flexible prediction signal, performance of the intra-prediction encoding is improved relative to H.264/AVC. The matters described above have been described in detail in Non-Patent Literatures 1, 2 and 3.