1. Technical Field
Example embodiments of the present invention relate to a video compression method and, more particularly, to a method and apparatus for improving compression efficiency in directional intra-prediction.
2. Related Art
With the development of telecommunication technologies including the Internet, video communication is increasing alongside text communication and voice communication. Existing text-based communication is insufficient to satisfy the varied demands of consumers. Thus, multimedia services capable of carrying various types of information such as text, images, and music are increasing. Multimedia data requires a high-capacity storage medium because of its enormous volume, and a wide bandwidth when transmitted. Thus, to transmit multimedia data including text, video, and audio, it is essential to use a compression coding technique.
The fundamental principle of data compression is the elimination of redundancy from the data. Data can be compressed by eliminating spatial redundancy, such as repetition of the same color or object in an image; temporal redundancy, such as little or no variation between neighboring frames of a moving picture or successive repetition of the same sound in audio; or psycho-visual redundancy, which arises because human vision and perception are insensitive to high frequencies.
As a method of compressing moving pictures, H.264, or advanced video coding (AVC), which further improves compression efficiency compared to Moving Picture Experts Group-4 (MPEG-4), has recently attracted growing interest. As a scheme for improving the compression efficiency, H.264 employs directional intra-prediction (hereinafter shortened simply to “intra-prediction”) to eliminate spatial redundancy within a frame.
Intra-prediction refers to a method of predicting the values of a current sub-block by copying, in a designated direction, the neighboring pixels located above and to the left of the sub-block, and encoding only the difference between the original values and the predicted values of the sub-block. By contrast, inter-prediction or temporal prediction refers to a method of performing prediction with reference to an area 40 of a frame 20 at a different time, as shown in FIG. 1. Intra-prediction is complementary to inter-prediction. In other words, whichever of the two prediction methods is more favorable for the image to be encoded is selectively used.
In the intra-prediction technique complying with the existing H.264 standard, a prediction block is generated for a current block on the basis of another block that precedes it in encoding order. The residual obtained by subtracting the prediction block from the current block is then encoded. For the luminance component, the prediction block is generated in units of 4×4 blocks or 16×16 macroblocks. Nine prediction modes can be selected for each 4×4 block, whereas four prediction modes can be selected for each 16×16 macroblock. A video encoder based on H.264 selects, for each block, the prediction mode that minimizes the difference between the current block and the prediction block.
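The mode selection described above can be sketched as follows. This is a minimal illustration, not the method of any particular encoder: the sum of absolute differences (SAD) is assumed as the measure of "difference", and the names `sad` and `select_mode` are hypothetical. Practical encoders may use other cost measures.

```python
def sad(block, pred):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(b - p)
               for row_b, row_p in zip(block, pred)
               for b, p in zip(row_b, row_p))


def select_mode(block, candidates):
    """Pick the mode whose prediction block is closest to `block`.

    `candidates` maps a mode number to its prediction block.
    SAD is an assumed cost measure for this sketch.
    Returns the chosen mode and the residual (block minus prediction).
    """
    best = min(candidates, key=lambda mode: sad(block, candidates[mode]))
    residual = [[b - p for b, p in zip(row_b, row_p)]
                for row_b, row_p in zip(block, candidates[best])]
    return best, residual


# Toy 2x2 example with two candidate prediction blocks.
current = [[10, 10], [20, 20]]
candidates = {0: [[10, 10], [10, 10]], 1: [[10, 10], [20, 20]]}
mode, residual = select_mode(current, candidates)
```

In this toy example the second candidate matches the current block exactly, so it is selected and the residual is all zeros.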
As the prediction modes for the 4×4 block, H.264 uses nine prediction modes, namely eight modes having directionality (modes 0, 1, and 3 through 8) and a direct current (DC) mode (mode 2) that uses the mean value of eight neighboring pixels, as shown in FIG. 2.
FIG. 3 shows an example of labeling for explaining the nine prediction modes. In this case, a prediction block (the area including a through p) is generated for the current block using samples A through M that have been previously decoded. Here, if E, F, G and H have not been previously decoded and are therefore unavailable, D is copied to their positions, so that E, F, G and H are virtually generated.
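The substitution of D for unavailable samples E through H can be sketched as below. The dictionary layout, the use of `None` to mark an unavailable sample, and the function name `fill_upper_right` are assumptions made for illustration only. E through H are treated here as a group, following the description above.

```python
def fill_upper_right(samples):
    """Return a copy of `samples` in which E, F, G and H are usable.

    `samples` maps the labels of FIG. 3 ('A'..'M') to pixel values.
    A value of None marks a sample that has not been decoded
    (an assumed convention for this sketch).
    """
    out = dict(samples)
    if any(out.get(k) is None for k in "EFGH"):
        # Copy D into all four upper-right positions.
        for k in "EFGH":
            out[k] = out["D"]
    return out


refs = {"A": 1, "B": 2, "C": 3, "D": 7,
        "E": None, "F": None, "G": None, "H": None}
filled = fill_upper_right(refs)
```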
The nine prediction modes will be described in detail with reference to FIG. 4. In the case of mode 0, pixels of the prediction block are estimated by extrapolation in a vertical direction using upper samples A, B, C and D. In the case of mode 1, the pixels of the prediction block are estimated by extrapolation in a horizontal direction using left samples I, J, K and L. Further, in the case of mode 2, the pixels of the prediction block are all set to the same value, namely the average of the upper samples A, B, C and D and the left samples I, J, K and L.
Further, in the case of mode 3, the pixels of the prediction block are estimated by interpolation at an angle of 45° in an upper-right to lower-left (diagonal down-left) direction. In the case of mode 4, the pixels of the prediction block are estimated by extrapolation at an angle of 45° in an upper-left to lower-right (diagonal down-right) direction. Further, in the case of mode 5, the pixels of the prediction block are estimated by extrapolation at an angle of about 26.6° (width/height=½) in a vertical-right direction.
In addition, in the case of mode 6, the pixels of the prediction block are estimated by extrapolation at an angle of about 26.6° in a horizontal-down direction. In the case of mode 7, the pixels of the prediction block are estimated by extrapolation at an angle of about 26.6° in a vertical-left direction. Finally, in the case of mode 8, the pixels of the prediction block are estimated by interpolation at an angle of about 26.6° in a horizontal-up direction.
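Of the nine modes described above, modes 0, 1 and 2 are the simplest to illustrate. The following is a minimal sketch, assuming that `top` holds samples A through D and `left` holds samples I through L; the function names are hypothetical, and the DC value is assumed to be rounded to the nearest integer.

```python
def predict_vertical(top):
    """Mode 0: each column repeats the sample directly above it."""
    return [list(top) for _ in range(4)]


def predict_horizontal(left):
    """Mode 1: each row repeats the sample directly to its left."""
    return [[v] * 4 for v in left]


def predict_dc(top, left):
    """Mode 2: every pixel is the mean of the eight neighbours.

    Adding 4 before the shift by 3 rounds the division by 8 to the
    nearest integer (an assumed rounding rule for this sketch).
    """
    dc = (sum(top) + sum(left) + 4) >> 3
    return [[dc] * 4 for _ in range(4)]


top = [1, 2, 3, 4]    # samples A, B, C, D
left = [5, 6, 7, 8]   # samples I, J, K, L
vertical = predict_vertical(top)
horizontal = predict_horizontal(left)
dc_block = predict_dc(top, left)
```

With these inputs the eight neighbours sum to 36, so the mean of 4.5 is rounded to 5 and every pixel of the DC prediction block is 5.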
Arrows of FIG. 4 indicate prediction directions in each mode. The samples of the prediction block in modes 3 through 8 can be generated from a weighted average of previously-decoded reference samples A through M. For example, in the case of mode 4, the sample d located at the upper right edge of the prediction block can be estimated as in Equation 1 below. Here, round( ) is the integer round-off function.
d = round(B/4 + C/2 + D/4)  [Equation 1]
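Equation 1 can be evaluated with integer arithmetic only. The sketch below assumes that round( ) rounds halves upward, in which case round(B/4 + C/2 + D/4) equals (B + 2C + D + 2) shifted right by two bits. The function name is hypothetical. Python's built-in `round` is deliberately not used, because it rounds halves to the nearest even integer.

```python
def predict_d_mode4(b, c, d):
    """Sample d of the mode 4 prediction block, per Equation 1.

    (b + 2*c + d) / 4 is the weighted average B/4 + C/2 + D/4;
    adding 2 before the shift rounds halves upward (assumed rule).
    """
    return (b + 2 * c + d + 2) >> 2


smooth = predict_d_mode4(100, 110, 120)   # weighted average is exactly 110
uneven = predict_d_mode4(1, 1, 4)         # weighted average is 1.75
half = predict_d_mode4(0, 1, 0)           # weighted average is 0.5
```

Here `smooth` is 110, `uneven` is 2, and `half` is 1 under the round-half-up assumption.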
Meanwhile, there are four modes 0, 1, 2, and 3 in the 16×16 prediction for a luminance component. In the case of mode 0, the pixels of the prediction block are estimated from upper samples by extrapolation. In the case of mode 1, the pixels of the prediction block are estimated from left samples by extrapolation. Further, in the case of mode 2, the pixels of the prediction block are calculated by averaging the upper samples and left samples. Finally, in the case of mode 3, the pixels of the prediction block are computed using a linear “plane” function that is fitted to the upper and left samples. This mode works well in areas of smoothly-varying luminance.
In this manner, the video encoder based on the H.264 standard performs intra-prediction on each 4×4 block using a total of nine modes including eight modes having directionality (hereinafter, referred to as “directional modes”) and one DC mode. Intra mode information representing the one of the nine modes selected in this way is transmitted to a video decoder. The video decoder obtains a prediction block for the current block by the same method as the video encoder on the basis of the intra mode information, and reconstructs the current block from the obtained prediction block.
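The "plane" function of mode 3 can be sketched as follows. The constants and shifts follow the commonly published formulation of the H.264 16×16 plane prediction and should be treated as an assumption to be checked against the standard. The function names and the argument layout (`top` and `left` as 16-sample lists, `corner` as the sample above and to the left of the macroblock) are hypothetical, and 8-bit samples are assumed.

```python
def clip255(v):
    """Clamp a value to the 8-bit sample range (8-bit video assumed)."""
    return max(0, min(255, v))


def predict_plane_16x16(top, left, corner):
    """Mode 3 of the 16x16 prediction: fit a plane to the neighbours."""
    # Prepend the corner so that index k holds the sample at position k - 1.
    t = [corner] + list(top)
    l = [corner] + list(left)
    # Weighted horizontal and vertical gradients of the neighbours.
    h = sum((i + 1) * (t[9 + i] - t[7 - i]) for i in range(8))
    v = sum((i + 1) * (l[9 + i] - l[7 - i]) for i in range(8))
    a = 16 * (left[15] + top[15])
    b = (5 * h + 32) >> 6
    c = (5 * v + 32) >> 6
    return [[clip255((a + b * (x - 7) + c * (y - 7) + 16) >> 5)
             for x in range(16)]
            for y in range(16)]


# A horizontal ramp: the upper samples rise by 2 per pixel.
ramp_top = [100 + 2 * x for x in range(16)]
ramp = predict_plane_16x16(ramp_top, [98] * 16, 98)
flat = predict_plane_16x16([100] * 16, [100] * 16, 100)
```

For the horizontal ramp, each row of the prediction block reproduces the upper samples, which illustrates why this mode suits smoothly-varying luminance.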
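The exchange between the encoder and the decoder can be sketched as a round trip. This is a simplified illustration under stated assumptions: only modes 0, 1 and 2 are implemented, the residual is passed without transform or quantization, the decoder is assumed to receive the residual together with the intra mode information, and the names `predict`, `encode` and `decode` are hypothetical.

```python
def predict(mode, top, left):
    """Build a 4x4 prediction block for modes 0, 1 and 2 only."""
    if mode == 0:                       # vertical
        return [list(top) for _ in range(4)]
    if mode == 1:                       # horizontal
        return [[v] * 4 for v in left]
    if mode == 2:                       # DC, rounded mean of 8 neighbours
        dc = (sum(top) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError("mode not covered by this sketch")


def encode(block, mode, top, left):
    """Encoder side: return the mode and the residual block."""
    pred = predict(mode, top, left)
    residual = [[b - p for b, p in zip(rb, rp)]
                for rb, rp in zip(block, pred)]
    return mode, residual


def decode(mode, residual, top, left):
    """Decoder side: rebuild the same prediction and add the residual."""
    pred = predict(mode, top, left)
    return [[p + r for p, r in zip(rp, rr)]
            for rp, rr in zip(pred, residual)]


top, left = [10, 20, 30, 40], [12, 14, 16, 18]
block = [[11, 20, 29, 41], [10, 21, 30, 40],
         [10, 20, 31, 39], [9, 20, 30, 40]]
mode, residual = encode(block, 0, top, left)
rebuilt = decode(mode, residual, top, left)
```

Because both sides derive the prediction block from the same previously decoded neighbours and the same mode, the decoder recovers the block exactly in this lossless sketch.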