High efficiency video coding (HEVC), developed as the successor to the conventional H.264 video coding standard, is a coding scheme that achieves higher coding efficiency than H.264. However, HEVC involves complicated processes and a large computational complexity, so the computational cost of coding must be reduced significantly before HEVC can be employed in actual products. In particular, because HEVC offers more options for the size of a block, which is the unit of coding, the number of processes required for coding optimization, in which an optimal block size and an optimal mode are determined, increases, and it is therefore necessary to reduce the computational complexity of these processes.
In video coding, pixels are generally coded in units of blocks. In the HEVC, concepts of a largest coding unit (LCU) and a coding unit (CU) are introduced as blocks for which coding is performed, and a concept of a prediction unit (PU) is introduced as a unit of prediction. Hereinafter, the LCU, the CU, and the PU will be described. In the HEVC, one picture (image) is equally divided into squares having a given size, as illustrated in FIG. 11. FIG. 11 is a diagram illustrating an example of division of a picture. The square blocks illustrated in FIG. 11 are referred to as LCUs. The sizes of the LCUs can be selected from among 16×16 pixels, 32×32 pixels, and 64×64 pixels, but are all the same within one picture.
A CU, which is a unit for performing actual coding, is a square block that exists in one LCU, and a prediction mode is determined using the CU as a unit. The CU is a block having the same size as the LCU, or, as illustrated in FIG. 11, the CU is a square block obtained by equally dividing the LCU in four or a square block obtained by repeatedly applying such a quadrisection process to the LCU. When the size of the LCU is 2^N×2^N, the size of the CU can be selected from among 2^M×2^M (3≤M≤N, where N and M are integers).
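The relation between the LCU size and the selectable CU sizes can be illustrated with a minimal sketch. It assumes the LCU size is a power of two and that the smallest CU is 8×8 (M = 3); the function name `cu_sizes` and its parameters are hypothetical and are not taken from any reference software.

```python
def cu_sizes(lcu_size: int, min_log2: int = 3) -> list[int]:
    """Return the CU edge lengths selectable inside an LCU of the given edge length.

    Assumes lcu_size = 2**N and CU sizes 2**M with min_log2 <= M <= N.
    """
    n = lcu_size.bit_length() - 1
    if lcu_size <= 0 or lcu_size != 1 << n:
        raise ValueError("LCU size must be a positive power of two")
    # Each quadrisection halves the edge length, from 2**N down to 2**min_log2.
    return [1 << m for m in range(n, min_log2 - 1, -1)]


if __name__ == "__main__":
    for lcu in (16, 32, 64):
        print(lcu, cu_sizes(lcu))
```

For a 64×64 LCU this yields four candidate CU sizes, whereas a 16×16 LCU yields only two, which is why the search space grows with the LCU size.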
Further, the PU is a unit for setting a prediction direction in the CU. FIGS. 12A and 12B are diagrams illustrating examples of options for division into PUs (PU division types). As illustrated in FIG. 12A, in inter-prediction, selection from eight division methods is possible. Further, as illustrated in FIG. 12B, in intra-prediction, selection from two division methods is possible only when a CU size is 8×8. In other cases, the PU is the same as the CU. For improvement of coding efficiency, it is necessary to divide each LCU into CUs and to set a prediction mode of each CU in a manner suitable for that LCU. In particular, as the size of the LCU increases, the degree of freedom in selecting the division method of the CU and the prediction mode increases, and improvement of coding efficiency is expected from appropriate division into CUs and appropriate setting of the prediction mode. However, there is a problem in that a computation must be performed on every CU that can be an option, and the computational complexity increases accordingly.
An actual process regarding selection of the CU size in one LCU will be described using an example of an HEVC test model (HM). An operation of a coding cost calculation process for a determination of shapes of divided PUs in one CU will first be described with reference to FIG. 13. FIG. 13 is a flowchart illustrating an operation of a coding cost calculation process for a determination of shapes of divided PUs in one CU. First, inter-prediction is performed for each PU division type (step S71). Then, if the resulting coding cost is the lowest so far, that coding cost is stored together with the shapes of the divided PUs and the inter-prediction direction that realize it (step S72).
Then, intra-prediction is performed for each PU division type (step S73). Then, if the resulting coding cost is the lowest so far, that coding cost is stored together with the shapes of the divided PUs and the intra-prediction direction that realize it (step S74).
Next, an operation of a coding cost calculation process for a determination of the CU division size in one LCU will be described with reference to FIG. 14. FIG. 14 is a flowchart illustrating the operation of the coding cost calculation process for a selection of the CU division size in one LCU. First, steps S71 to S74 illustrated in FIG. 13 are repeated in one CU to calculate coding costs for CUs of all sizes and determine CU sizes (step S75). For an LCU, the coding cost when a CU is divided into four is compared with the coding cost when the CU is not divided into four, the division size of the CU in which the coding cost is the lowest is selected, and the selected division size is stored (step S76).
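The comparison in step S76 between a CU that is divided into four and one that is not can be sketched as a recursive search. This is only an illustration under stated assumptions: `cu_cost` is a hypothetical callback standing in for steps S71 to S74 (it returns the lowest coding cost of one undivided CU), the QP loop is omitted, and any bits needed to signal the division are ignored.

```python
from typing import Callable, Union

# A leaf is (x, y, size); a divided CU is a list of four sub-trees.
Tree = Union[tuple, list]


def best_partition(x: int, y: int, size: int,
                   cu_cost: Callable[[int, int, int], float],
                   min_size: int = 8) -> tuple[float, Tree]:
    """Return (lowest cost, CU division) for the square block at (x, y)."""
    whole = cu_cost(x, y, size)           # cost when the CU is not divided
    if size <= min_size:
        return whole, (x, y, size)        # smallest CU: no further division

    half = size // 2
    split_cost = 0.0
    children = []
    for dy in (0, half):                  # visit the four sub-blocks in raster order
        for dx in (0, half):
            cost, tree = best_partition(x + dx, y + dy, half, cu_cost, min_size)
            split_cost += cost
            children.append(tree)

    # Keep the division only when it is strictly cheaper than the undivided CU.
    if split_cost < whole:
        return split_cost, children
    return whole, (x, y, size)


if __name__ == "__main__":
    # Hypothetical cost model: a 16x16 CU costs 100, every 8x8 CU costs 10.
    toy_cost = lambda x, y, s: 100.0 if s == 16 else 10.0
    print(best_partition(0, 0, 16, toy_cost))
```

Note that the callback is evaluated for every CU size at every position, including sizes that are finally discarded, which is the source of the computational complexity discussed in this section.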
It is to be noted that a QP loop, which is not described above, is illustrated in FIG. 14. In the HEVC, a quantization parameter QP that determines a quantization step size can be set for each CU if encoding is not performed using a fixed quantization parameter QP. In order to realize this, a structure of calculating an optimal coding cost while changing the value of the quantization parameter QP is included in reference software. Therefore, the QP loop is illustrated in FIG. 14.
A process of selecting an optimal mode, division shape, and division size to obtain the lowest coding cost is referred to as a coding optimization process. It is to be noted that while representations “LCU” and “CU” are used hereinafter as concepts of units of processing in the HEVC, units corresponding to the LCU and the CU are more generally represented as a block and a sub-block, respectively, taking other coding technologies into consideration.
Next, an operation of the coding optimization process will be described. FIG. 15 is a flowchart illustrating the operation of the coding optimization process. First, coding costs of an inter-prediction mode, a skip mode, and an intra-prediction mode are calculated (step S77). Subsequently, a mode in which the coding cost is the lowest is used as a prediction mode for each CU size, and an optimal mode is set and stored (step S78). This process is repeated for all the CU sizes. Then, a combination of divided CUs in which the coding cost is the lowest is determined (step S79).
Here, the coding cost means an RD cost as represented by Equation (1), and the optimization means a determination of coding by which the coding cost is the lowest.
RD cost = D + λR  (1)
In Equation (1), D indicates a sum of squared errors between decoded pixels and pixels of an original image, R is a generated bit amount, and λ is a Lagrangian parameter. Further, in order to speed up the coding optimization process, there is also a technique in which a pseudo RD cost in which D is replaced with a sum D′ of absolute differences between predicted pixels and the pixels of the original image and R is replaced with a generated bit amount R′ other than a bit amount of coefficients is used as the coding cost. Hereinafter, the coding cost is assumed to mean a cost as represented by the RD cost or the pseudo RD cost.
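As an illustration of the two cost definitions, the following minimal sketch computes the RD cost of Equation (1) and the pseudo RD cost D′ + λR′ for small blocks given as lists of pixel rows. The function names, the sample pixel values, and the bit counts are assumptions made for the example only.

```python
Block = list[list[int]]


def sse(a: Block, b: Block) -> int:
    """Sum of squared errors between two blocks of equal shape."""
    return sum((p - q) ** 2 for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def sad(a: Block, b: Block) -> int:
    """Sum of absolute differences between two blocks of equal shape."""
    return sum(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))


def rd_cost(original: Block, decoded: Block, bits: float, lam: float) -> float:
    """RD cost = D + lambda * R, with D the SSE against the decoded pixels."""
    return sse(original, decoded) + lam * bits


def pseudo_rd_cost(original: Block, predicted: Block,
                   bits_without_coeffs: float, lam: float) -> float:
    """Pseudo RD cost = D' + lambda * R', with D' the SAD against the predicted
    pixels and R' the generated bit amount excluding coefficient bits."""
    return sad(original, predicted) + lam * bits_without_coeffs


if __name__ == "__main__":
    original = [[10, 12], [14, 16]]
    decoded = [[11, 12], [13, 18]]
    predicted = [[10, 10], [10, 10]]
    print(rd_cost(original, decoded, bits=20, lam=0.5))
    print(pseudo_rd_cost(original, predicted, bits_without_coeffs=4, lam=0.5))
```

The pseudo RD cost needs only the predicted block, so the transform, quantization, and entropy coding of the prediction error can be skipped when it is used.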
Next, an operation of a process for determining an optimal prediction direction of intra-prediction will be described with reference to FIG. 16. FIG. 16 is a flowchart illustrating the operation of the process for determining the optimal prediction direction of the intra-prediction. Steps S82 to S85 described below are performed repeatedly, each time with one prediction direction selected (step S81). First, an intra-predicted image in the prediction direction is generated using neighboring encoded pixels (step S82). Subsequently, an error between pixels of the intra-predicted image and pixels of the original image is calculated (step S83).
Then, a generated bit amount is calculated (step S84), and a coding cost is calculated from the error and the generated bit amount (step S85). Then, steps S82 to S85 are repeated for each prediction direction. Finally, a direction in which the calculated coding cost is the lowest is set as an intra-prediction direction to thereby set an optimal intra-prediction direction (step S86).
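The loop of steps S81 to S86 can be sketched as follows. This is a simplified illustration, not the HM implementation: only three prediction directions (vertical, horizontal, and DC) are modeled instead of the full set of HEVC intra modes, the error is a sum of absolute differences, and the per-mode bit amounts in `mode_bits` are hypothetical inputs rather than values produced by an entropy coder.

```python
Block = list[list[int]]


def predict(mode: str, top: list[int], left: list[int], size: int) -> Block:
    """Generate a size x size intra-predicted block from neighboring pixels."""
    if mode == "vertical":      # copy the row above into every row
        return [list(top[:size]) for _ in range(size)]
    if mode == "horizontal":    # copy the left column into every column
        return [[left[y]] * size for y in range(size)]
    if mode == "dc":            # rounded mean of the neighboring pixels
        dc = (sum(top[:size]) + sum(left[:size]) + size) // (2 * size)
        return [[dc] * size for _ in range(size)]
    raise ValueError(f"unknown mode: {mode}")


def best_intra_direction(original: Block, top: list[int], left: list[int],
                         mode_bits: dict[str, float], lam: float) -> tuple[str, float]:
    """Return the prediction direction with the lowest coding cost and that cost."""
    size = len(original)
    best_mode, best_cost = "", float("inf")
    for mode, bits in mode_bits.items():               # S81: select a direction
        pred = predict(mode, top, left, size)          # S82: predicted image
        err = sum(abs(o - p)                           # S83: prediction error
                  for ro, rp in zip(original, pred)
                  for o, p in zip(ro, rp))
        cost = err + lam * bits                        # S84, S85: coding cost
        if cost < best_cost:
            best_mode, best_cost = mode, cost
    return best_mode, best_cost                        # S86: optimal direction


if __name__ == "__main__":
    original = [[10, 20], [10, 20]]
    print(best_intra_direction(original, top=[10, 20], left=[15, 15],
                               mode_bits={"dc": 1, "horizontal": 2, "vertical": 2},
                               lam=1.0))
```

In this sketch `top` and `left` are passed in directly; in an actual encoder they are decoded neighboring pixels, so obtaining them requires the neighboring blocks to have been encoded and decoded first.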
In the coding optimization process, a block having a certain fixed size is divided into sub-blocks having an optimal sub-block size, and an optimal prediction mode is determined for each sub-block. In order to determine the optimal sub-block size and the optimal prediction mode in units of blocks, it is necessary to obtain a coding cost in each prediction mode for each sub-block size, as described above. Particularly, for a determination of the optimal intra-prediction direction, it is necessary to perform decoding on neighboring encoded pixels necessary for generation of a predicted image for each intra-prediction direction. These processes are also performed on a sub-block that does not have an optimal size. However, in this case, a result of decoding the neighboring pixels is used only for comparison of the coding costs.
Particularly, in the case of a determination using a pseudo RD cost, which is one of techniques aiming at speeding up the coding optimization process, it is not necessary to encode a prediction error when calculating R′; however, because a predicted image is used to calculate D′, it is necessary to encode and decode neighboring pixels used for the calculation. Therefore, when an optimal coding cost of each sub-block size is determined through only speeding-up using a pseudo RD cost, an actual reduction of the computation process is achieved in only an arithmetic coding process of a prediction error coefficient, and it is necessary to perform a decoding process in which the computational complexity is large for each sub-block size.
It is to be noted that, hereinafter, the term "generated bit amount" covers both a case in which the generated bit amount of coefficients is included and a case in which the generated bit amount of the coefficients is not included.
Non-Patent Document 1 is an example in which speeding-up using the conventional art is performed. Non-Patent Document 1 introduces a technique in which pixels of an original image or pixels obtained by applying a filter to the pixels of the original image are used as pixels of a pseudo intra-predicted image. With this technique, reference pixels used for calculation of an optimal intra-prediction mode of each size of a sub-block are only the pixels of the original image or the pixels obtained by applying the filter to the pixels of the original image, and thus decoding of neighboring pixels is not necessary and the computational complexity can be reduced.