This application claims the priority of Korean Patent Application No. 2003-25528, filed on Apr. 22, 2003, in the Korean Intellectual Property Office, the disclosure of which is incorporated herein in its entirety by reference.
1. Field of the Invention
The present invention relates to a codec for video data, and more particularly, to an apparatus and method for determining a prediction mode, which are used in a video codec.
2. Description of the Related Art
Broadcast television and home entertainment have been revolutionized by the advent of digital TV and DVD-video. These applications and many more were made possible by the standardization of video compression technology. The next standard in the MPEG series, MPEG4-visual, is enabling a new generation of internet-based video applications whilst the ITU-T H.263 standard for video compression is now widely used in videoconferencing systems.
MPEG4-visual and H.263 are video compression standards. The groups responsible for these standards, the Moving Picture Experts Group (MPEG) and the Video Coding Experts Group (VCEG), are in the final stages of developing a new standard that promises to significantly outperform MPEG4 and H.263, providing better compression of video images together with a range of features supporting high-quality, low bit-rate streaming video.
After finalizing the original H.263 standard, the ITU-T Video Coding Experts Group (VCEG) started work on two further development areas: short-term efforts to add extra features to H.263 (resulting in Version 2 of the standard) and long-term efforts to develop a new standard for low bit-rate visual communications. The long-term effort led to the draft H.26L standard, offering significantly better video compression efficiency than previous ITU-T standards. The ISO Moving Picture Experts Group (MPEG) recognized the potential benefits of H.26L, and the Joint Video Team (JVT) was formed, including experts from MPEG and VCEG. The main task of the JVT is to develop the draft H.26L model into a full International Standard. In fact, the outcome will be two identical standards: ISO MPEG4 Part 10 and ITU-T H.264. The title of the new standard is Advanced Video Coding (AVC); however, it is widely known by its old working title, H.26L, and by its ITU-T designation, H.264.
FIG. 1 is a block diagram of an H.264 encoder.
The H.264 encoder includes a prediction unit 110, a transform and quantization unit 120, and an entropy coding unit 130.
The prediction unit 110 performs inter prediction and intra prediction. Inter prediction predicts a block of the present picture from a reference picture that has been decoded, deblocking-filtered, and stored in a buffer. That is, inter prediction uses data from several pictures. To perform such inter prediction, the prediction unit 110 includes a motion estimator 111 and a motion compensator 112. Intra prediction predicts a predetermined block using pixel data of its already decoded adjacent blocks in the same picture.
The transform and quantization unit 120 transforms and quantizes a prediction sample obtained from the prediction unit 110. The entropy coding unit 130 encodes the quantized result into an H.264 bit stream according to a predetermined format.
FIG. 2 is a block diagram of an H.264 decoder.
The H.264 decoder receives and entropy-decodes a bit stream encoded by the H.264 encoder, performs dequantization and inverse-transformation of the decoded result, and then decodes the result using reference picture information subjected to motion compensation or intra prediction.
FIG. 3 shows a luminance block P to be predicted and its adjacent blocks to be used for prediction of the luminance block P.
If blocks or macroblocks on a picture have been encoded in an intra mode, a block P (310) to be predicted can be predicted using its adjacent decoded blocks A through L. Prediction is performed for chrominance blocks Cb and Cr as well as for luminance (hereinafter, briefly referred to as “luma”) blocks; however, for convenience of description, only prediction for luma blocks is described in the present invention. The luma prediction block P (310) is a 16×16 block consisting of several 4×4 blocks. In FIG. 3, small letters a through p are 4×4 blocks to be predicted and capital letters A, B, C, D and I, J, K, L are adjacent blocks to be used for prediction of the 4×4 blocks a through p.
Intra prediction is classified into 4×4 prediction and 16×16 prediction according to the size of a block to be predicted. 4×4 prediction has nine modes and 16×16 prediction has four modes, according to different directions of predictions. When the block P (310) is predicted, prediction samples are obtained in the nine 4×4 prediction modes according to the different directions of predictions, using pixel values of the blocks (A, B, C, D and I, J, K, L) adjacent to the 4×4 blocks to be predicted.
FIG. 4 is a table listing types of intra 4×4 luminance prediction modes.
Referring to FIG. 4, the 4×4 intra luma prediction modes include a vertical mode, a horizontal mode, a DC mode, a diagonal_down_left mode, a diagonal_down_right mode, a vertical_right mode, a horizontal_down mode, a vertical_left mode, and a horizontal_up mode. Directions in which predictions are performed in the respective prediction modes will be described with reference to FIG. 5. Predictions of blocks in the respective modes will be described with reference to FIGS. 6A through 6I.
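For illustration only, the nine modes listed above can be written as an enumeration. This description numbers only mode 0 (vertical) explicitly; the remaining numbers 1 through 8 are an assumption that follows the listing order, which matches the numbering used in the H.264 standard. The class name Intra4x4Mode is hypothetical.

```python
from enum import IntEnum


class Intra4x4Mode(IntEnum):
    """4x4 intra luma prediction modes (numbering assumed from listing order)."""
    VERTICAL = 0
    HORIZONTAL = 1
    DC = 2
    DIAGONAL_DOWN_LEFT = 3
    DIAGONAL_DOWN_RIGHT = 4
    VERTICAL_RIGHT = 5
    HORIZONTAL_DOWN = 6
    VERTICAL_LEFT = 7
    HORIZONTAL_UP = 8


# All candidate modes, in numerical order.
ALL_MODES = list(Intra4x4Mode)
```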
FIG. 5 shows nine prediction directions for H.264 4×4 intra luminance prediction.
Referring to FIG. 5, a block is predicted in a vertical direction, a horizontal direction, a diagonal direction, etc., each corresponding to a mode type.
FIGS. 6A through 6I are views for describing predictions according to the 4×4 intra luminance prediction modes.
For example, in mode 0 (vertical mode), 4×4 blocks a, e, i, and m are predicted using a pixel value of block A; 4×4 blocks b, f, j, and n are predicted using a pixel value of block B; 4×4 blocks c, g, k, and o are predicted using a pixel value of block C; and 4×4 blocks d, h, l, and p are predicted using a pixel value of block D. Predictions according to the other modes are disclosed in detail in the H.264 standard.
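The vertical mode can be sketched as follows. This is a minimal illustration, not an implementation taken from the standard: it assumes each of a through p is represented by a single sample value laid out row by row (a b c d / e f g h / i j k l / m n o p), and that A through D are the four values directly above the columns. The function name predict_vertical is hypothetical.

```python
def predict_vertical(above):
    """Return a 4x4 prediction in which every column repeats its upper neighbour.

    above: [A, B, C, D], the four values above the block (assumed layout).
    Result rows are [a, b, c, d], [e, f, g, h], [i, j, k, l], [m, n, o, p].
    """
    if len(above) != 4:
        raise ValueError("expected four neighbours A, B, C, D")
    # Each row is a copy of the neighbours, so column 0 is all A, column 1 all B, etc.
    return [list(above) for _ in range(4)]


pred = predict_vertical([10, 20, 30, 40])
```

With this layout, positions a, e, i, and m all take the value of A, and likewise for the other three columns.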
When H.264 encoding is performed, an optimal mode among the 4×4 intra luma prediction modes is selected and prediction is performed in the optimal mode. Compression efficiency varies according to the mode in which luma prediction for a 4×4 block is performed. To select the optimal mode, a block is predicted in all modes, costs are calculated using a predetermined cost function, and the mode with the smallest cost is selected as the optimal mode. Accordingly, since a block must be predicted in all nine modes and a cost must be calculated for each of the nine modes, the encoder becomes complicated.
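The exhaustive search described above can be sketched as follows. Several points are assumptions made for illustration: the cost function is taken to be the sum of absolute differences (SAD), since the description says only that a predetermined cost function is used; only three of the nine modes (vertical, horizontal, and DC) are implemented to keep the sketch short; and I through L are assumed to be the four values to the left of the rows. All function names are hypothetical.

```python
def predict_vertical(above):
    # Every column repeats the value above it (A, B, C, D).
    return [list(above) for _ in range(4)]


def predict_horizontal(left):
    # Every row repeats the value to its left (I, J, K, L).
    return [[v] * 4 for v in left]


def predict_dc(above, left):
    # Rounded mean of the eight neighbouring values.
    dc = (sum(above) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


def sad(block, pred):
    # Assumed cost function: sum of absolute differences.
    return sum(abs(o - p) for ro, rp in zip(block, pred) for o, p in zip(ro, rp))


def select_mode(block, above, left):
    """Predict the block in every candidate mode and return (best_mode, costs)."""
    candidates = {
        0: predict_vertical(above),
        1: predict_horizontal(left),
        2: predict_dc(above, left),
    }
    costs = {mode: sad(block, pred) for mode, pred in candidates.items()}
    best = min(costs, key=costs.get)  # mode with the smallest cost
    return best, costs


block = [[10, 20, 30, 40] for _ in range(4)]
best, costs = select_mode(block, above=[10, 20, 30, 40], left=[25, 25, 25, 25])
```

In this example the block has identical rows, so the vertical mode reproduces it exactly and is selected. A full encoder would evaluate all nine modes in the same way, which is the source of the complexity noted above.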