The statements in this section merely provide background information related to the present disclosure and may not constitute prior art.
Video data must be compressed by encoding so that it can be stored or transmitted efficiently. Known video compression technologies include H.261, H.263, H.264, MPEG-2, and MPEG-4. These technologies separate the image data of each video frame into luminance (luma) components and chrominance (chroma) components, which are then subdivided into macroblocks that serve as the unit of encoding.
H.264/AVC, the most recent of these video compression technologies, was jointly developed by MPEG (Moving Picture Experts Group) and VCEG (Video Coding Experts Group). It improves on the existing MPEG-4 video compression method by providing a better compression ratio while maintaining video quality.
FIG. 1 is a block diagram showing an existing H.264/AVC video encoder. The existing H.264/AVC video encoder shown in FIG. 1 performs inter-prediction, intra-prediction, transform, quantization, entropy coding, or the like, to encode input image data. Data with the redundancy removed is compressed through transform and quantization processes. The compressed data is entropy-encoded into a bitstream.
As shown in FIG. 1, H.264 predicts the pixel values of a current block of an input current image 110 in one of two ways: by inter-prediction at 130 with reference to a reference image 120, followed by motion compensation at 140, or by intra-prediction at 150 within the current image, which removes spatial redundancy. The prediction errors, calculated from the differences between the current image and the predicted values, are transformed and quantized at 160 and entropy-encoded at 170 into a bitstream for transmission. At the same time, the transformed and quantized values undergo inverse quantization and inverse transform at 180 and are added to the predicted values, and the sum is filtered at 190 to reduce blocking artifacts before forming a reconstructed image at 195.
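The prediction, quantization, and reconstruction path described above can be illustrated with a minimal sketch. It is not the H.264 algorithm: the transform stage is omitted, quantization is reduced to uniform scalar rounding, and the function name `encode_block` and the sample pixel values are assumptions made for illustration only.

```python
def encode_block(current, predicted, qstep):
    """Return (levels, reconstructed) for one block given as flat lists of pixels.

    Simplified stand-in for stages 160 and 180 of FIG. 1: no transform,
    no entropy coding, no clipping, no deblocking filter.
    """
    # Prediction error (residual): current pixel minus predicted pixel.
    residual = [c - p for c, p in zip(current, predicted)]
    # Lossy step: uniform scalar quantization of the residual.
    levels = [round(r / qstep) for r in residual]
    # Reconstruction path: dequantize and add the prediction back, so the
    # encoder holds the same reference data that a decoder would have.
    reconstructed = [p + lv * qstep for p, lv in zip(predicted, levels)]
    return levels, reconstructed


current = [52, 55, 61, 66]      # hypothetical pixel values of the current block
predicted = [50, 50, 60, 60]    # hypothetical prediction for the same block
levels, recon = encode_block(current, predicted, qstep=4)
```

Only `levels` would be passed on to entropy coding. The reconstructed values differ from the current values by at most half the quantization step, which is the loss introduced by quantization in this simplified model.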
Here, the inter-prediction process removes temporal redundancy: the macroblock is partitioned into blocks of size 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, or 4×4 for motion prediction and motion compensation, and up to sixteen reference frames may be used. In addition, H.264/AVC may utilize ¼-pixel interpolation to improve the compression ratio while maintaining image quality in comparison with the existing MPEG-4 video compression technology.
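Motion prediction for one block can be sketched as an exhaustive block-matching search. The sketch below is an assumption-laden illustration, not the search used by any particular encoder: it uses a single reference frame, integer-pixel displacements only (no ¼-pixel interpolation), and the sum of absolute differences (SAD) as the matching criterion. The names `sad` and `full_search` and the synthetic frames are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n-by-n block of cur at (bx, by) and ref displaced by (dx, dy)."""
    return sum(
        abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
        for y in range(n)
        for x in range(n)
    )


def full_search(cur, ref, bx, by, n, search_range):
    """Return (best_sad, dx, dy) over all integer displacements within the range."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + n > width or y0 + n > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


# Synthetic 8x8 reference frame and a current frame whose content is the
# reference content displaced by 2 pixels horizontally and 1 pixel vertically.
ref = [[x * x + 10 * y * y for x in range(8)] for y in range(8)]
cur = [[ref[y + 1][x + 2] if y + 1 < 8 and x + 2 < 8 else 0 for x in range(8)]
       for y in range(8)]
best = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
```

For this synthetic input the search finds a displacement with zero SAD, so motion compensation with that displacement would reproduce the block exactly and leave no prediction error to encode.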
The intra-prediction process removes spatial redundancy: the macroblock is intra-predicted in units of 16×16, 8×8, or 4×4 blocks. More specifically, intra-prediction carried out on the macroblock as a single 16×16 block is referred to as 16×16 intra-prediction. Intra-prediction carried out by dividing the macroblock into four 8×8 blocks is referred to as 8×8 intra-prediction. Intra-prediction carried out by dividing the macroblock into sixteen 4×4 blocks is referred to as 4×4 intra-prediction.
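The three intra-prediction block sizes correspond to one, four, and sixteen blocks per macroblock. A minimal sketch of that partitioning follows; the helper name `partition_macroblock` and the raster ordering of the blocks are assumptions for illustration and do not reflect the block scan order defined by the standard.

```python
def partition_macroblock(block_size, mb_size=16):
    """Return the top-left (x, y) offsets of each block inside one macroblock."""
    if mb_size % block_size != 0:
        raise ValueError("block_size must divide the macroblock size")
    return [
        (x, y)
        for y in range(0, mb_size, block_size)
        for x in range(0, mb_size, block_size)
    ]


blocks_16 = partition_macroblock(16)  # 16×16 intra-prediction: one block
blocks_8 = partition_macroblock(8)    # 8×8 intra-prediction: four blocks
blocks_4 = partition_macroblock(4)    # 4×4 intra-prediction: sixteen blocks
```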
In the case of I and SI pictures, to which only intra-prediction is applicable, the optimal intra-prediction mode is the optimal mode of the current macroblock. On the other hand, in the case of P, SP, and B pictures, to which both intra-prediction and inter-prediction are applicable, the mode having the lowest cost among the intra-prediction and inter-prediction modes is selected as the optimal mode. Generally, the cost of each mode is calculated using the H.264 rate-distortion cost function expressed in Equation 1.

$$J(i) = D(i) + \lambda \times R(i) \qquad \text{(Equation 1)}$$

In Equation 1, D(i) represents the error (distortion) occurring in mode i, R(i) represents the number of bits generated when encoding in mode i, and λ represents a Lagrange multiplier.
An optimal mode is determined among the possible modes by using Equation 2.
$$i^{*} = \arg\min_{i \in C} \{ J(i) \} \qquad \text{(Equation 2)}$$
In Equation 2, C represents the set of available modes. If the picture currently being coded is a P picture, the set of available modes is C = {SKIP, 16×16, 16×8, 8×16, P8×8, I4, I8, I16}.
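The mode decision of Equations 1 and 2 can be sketched as follows. The distortion and bit counts are made-up numbers chosen for illustration, not measurements, and the function names `rd_cost` and `select_mode` are hypothetical; only a subset of the P-picture modes is listed to keep the example short.

```python
def rd_cost(distortion, rate_bits, lam):
    """Equation 1: J(i) = D(i) + lambda * R(i)."""
    return distortion + lam * rate_bits


def select_mode(candidates, lam):
    """Equation 2: return the mode i in C that minimizes J(i).

    candidates maps a mode name to a (D, R) pair.
    """
    return min(candidates, key=lambda mode: rd_cost(*candidates[mode], lam))


# Hypothetical (distortion, bits) pairs for some modes of a P-picture macroblock.
candidates = {
    "SKIP": (900.0, 1),
    "16x16": (400.0, 40),
    "P8x8": (250.0, 95),
    "I4": (300.0, 120),
}

mode_low_lambda = select_mode(candidates, lam=0.5)
mode_mid_lambda = select_mode(candidates, lam=5.0)
mode_high_lambda = select_mode(candidates, lam=20.0)
```

With these assumed numbers, a small λ favors the low-distortion mode with many bits, while a large λ favors the mode that costs the fewest bits, which shows how the Lagrange multiplier trades distortion against rate.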
When the optimal prediction mode is determined, the existing H.264/AVC video encoder transforms and quantizes the difference value between the block to be encoded and the predicted block, and the result is then entropy-coded. That is, the difference value is entropy-coded in the selected optimal prediction mode.
The next-generation encoding standard, which aims to encode high-resolution images efficiently, is planned to increase the coding unit block size to 32×32 or 64×64, which is larger than the existing coding unit block size of 16×16. In encoding high-resolution videos, if the coding unit block size increases, the spatial redundancy within a block increases. Therefore, there is a need for a method that can effectively remove the spatial redundancy.