1. Field of the Invention
Apparatuses and methods consistent with the present invention relate to a video compression technique and, more particularly, to a method and an apparatus for enhancing the performance of entropy coding in a multilayer-based codec.
2. Description of the Related Art
As information and communication technology, including the Internet, develops, image communication is increasing. Existing text-based communication does not satisfy consumer demand, so multimedia services that can deliver text, images, music, and other content are increasing. Multimedia data is large, so it requires high-capacity storage media and a wide bandwidth for transmission. Therefore, a compression coding technique is essential for multimedia data transmission.
A basic principle of data compression is to remove redundancy. Data may be compressed by removing spatial redundancy such as a repetition of colors or objects, temporal redundancy such as the repetition of adjacent frames in a moving picture or a repetition of sounds in an audio file, or psychovisual redundancy considering the fact that the visual and perceptive abilities of human beings are insensitive to high frequencies. In a general video coding method, temporal redundancy is removed by temporal filtering based on motion compensation, and spatial redundancy is removed by a spatial transformation.
The result of removing redundancy is then quantized, which is a lossy step. In the final step, the quantized result is coded without loss through entropy coding.
Currently, a multilayer-based coding technology based on H.264 has been proposed in the draft scalable video coding (hereinafter called SVC) specification. This draft is in progress in the Joint Video Team (JVT), a group of video professionals of the International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) and the International Telecommunication Union (ITU).
Entropy coding technologies currently used in the H.264 standard include Context-Adaptive Variable Length Coding (CAVLC), Context-Adaptive Binary Arithmetic Coding (CABAC), and others.
Table 1 shows the parameters that are encoded in each entropy coding technique in the H.264 standard.
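TABLE 1
Parameter Values

PARAMETER TO BE CODED    | entropy_coding_mode = 0 | entropy_coding_mode = 1
MACROBLOCK TYPE          | Exp_Golomb              | CABAC
MACROBLOCK PATTERN       | Exp_Golomb              | CABAC
QUANTIZATION PARAMETER   | Exp_Golomb              | CABAC
REFERENCE FRAME INDEX    | Exp_Golomb              | CABAC
MOTION VECTOR            | Exp_Golomb              | CABAC
RESIDUAL DATA            | CAVLC                   | CABAC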
According to Table 1, if the entropy_coding_mode flag is 0, the following parameters are coded by Exp_Golomb: a macroblock type, showing whether the macroblock is in an inter-prediction mode or an intra-prediction mode; a macroblock pattern, showing the form of the subblocks that constitute a macroblock; a quantization parameter, which is an index determining a quantization step; a reference frame index, showing the number of the frame referred to in an inter-prediction mode; and a motion vector. Residual data, showing the difference between an original image and a prediction image, is coded by CAVLC.
On the other hand, if the entropy_coding_mode flag is 1, all the parameters will be coded by CABAC.
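As an illustration of the Exp_Golomb coding named in Table 1, the following is a minimal sketch of the unsigned Exp-Golomb code. The function names are hypothetical, and the sketch covers only non-negative integers; how each parameter (for example, a signed motion vector component) is mapped to such an integer before coding is not described here and is left out.

```python
def exp_golomb_ue(code_num: int) -> str:
    """Return the unsigned Exp-Golomb codeword for code_num as a bit string."""
    if code_num < 0:
        raise ValueError("code_num must be non-negative")
    info = bin(code_num + 1)[2:]        # binary form of code_num + 1, MSB first
    prefix = "0" * (len(info) - 1)      # as many leading zeros as info bits after the first
    return prefix + info


def decode_exp_golomb_ue(bits: str) -> int:
    """Decode a single unsigned Exp-Golomb codeword produced by exp_golomb_ue."""
    leading_zeros = len(bits) - len(bits.lstrip("0"))
    # The codeword body is (leading_zeros + 1) bits long, starting at the first 1.
    value = int(bits[leading_zeros:2 * leading_zeros + 1], 2)
    return value - 1
```

Small values receive short codewords (0 maps to a single bit), so the code suits parameters whose small values are the most frequent.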
CABAC achieves higher compression performance than CAVLC, but has high complexity. Therefore, entropy coding based on Variable Length Coding (VLC), such as CAVLC, is set as a basic profile.
“Variable length code for SVC” (JVT-P056, Poznan, 16th JVT meeting; hereinafter, called JVT-P056), a document submitted by J. Ridge and M. Karczewicz in the 16th JVT meeting, presents a CAVLC technique that considers characteristics of SVC. JVT-P056 follows the same procedure as the existing H.264 in a discrete layer, but uses a VLC technique according to separate statistical characteristics in a Fine Granular Scalable (FGS) layer.
Currently, in the Joint Scalable Video Model (JSVM), three scanning passes are supported for FGS encoding: a significant pass, a refinement pass, and a remainder pass. Different methods are applied to each scanning pass according to its statistical characteristics. For example, the refinement pass uses one VLC table, designed on the basis that zero (“0”) is the favored value in entropy coding.
In short, JVT-P056 presents a VLC technique for an FGS layer: it keeps the existing CAVLC technique in a discrete layer, but applies a separate technique, based on the statistical characteristics of the FGS layer, in the FGS layer.
JVT-P056 suggests the following technique for the significant pass. A codeword is characterized by a cut-off parameter “m”. If the symbol to be coded, “C”, does not exceed m, the symbol is encoded using an Exp_Golomb code. If the symbol C is greater than m, it is divided into two parts, a length and a suffix, according to Formula 1, and is then encoded.
$$P = \left\lfloor \frac{C - m}{3} \right\rfloor + m \qquad (1)$$
“P” is an encoded codeword, and consists of a length and a suffix (00, 01, or 10).
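The split described for the significant pass can be sketched as follows. This is only an illustration under stated assumptions: the function name and return format are hypothetical, and the choice of suffix as the remainder of (C - m) divided by 3 is an assumption, since the text gives only the floor term of Formula 1 and a list of possible suffixes. How the length part is written to the bitstream is not modeled.

```python
# Assumption: one 2-bit suffix per possible remainder of (C - m) / 3.
SUFFIXES = ("00", "01", "10")


def split_significance_symbol(c: int, m: int):
    """Classify symbol c against cut-off m and, above the cut-off, apply Formula 1.

    Returns a tuple (mode, value, suffix):
      - ("exp_golomb", c, None) when c does not exceed m
      - ("length_suffix", P, suffix) when c is greater than m
    """
    if c < 0 or m < 0:
        raise ValueError("c and m must be non-negative")
    if c <= m:
        return ("exp_golomb", c, None)
    p = (c - m) // 3 + m                 # Formula 1: floor((C - m) / 3) + m
    suffix = SUFFIXES[(c - m) % 3]       # assumed mapping of the remainder
    return ("length_suffix", p, suffix)
```

For example, with m = 3 and C = 10, the difference C - m is 7, so P = 2 + 3 = 5 and the assumed suffix is "01".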
Because there is a high possibility that zero (“0”) will be generated in the refinement pass, JVT-P056 uses one VLC table to allocate codewords of different lengths according to the number of zeros included in each refinement bit group. A refinement bit group collects a predetermined number of refinement bits. For example, four refinement bits may be considered as one refinement bit group.
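The grouping step described above can be sketched as follows. The function name is hypothetical, and the handling of a final group with fewer bits than the group size is an assumption (it is simply kept as a shorter group), since the text does not address it. The VLC table itself is not reproduced here.

```python
def refinement_groups_with_zero_counts(bits, group_size=4):
    """Split refinement bits into groups and count the zeros in each group.

    bits is a sequence of 0/1 integers. Returns a list of (group, zero_count)
    pairs, where group is a tuple of bits. A trailing partial group is kept
    as-is (an assumption made for this sketch).
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    result = []
    for start in range(0, len(bits), group_size):
        group = tuple(bits[start:start + group_size])
        result.append((group, group.count(0)))
    return result
```

The zero count of each group is the quantity that a table of this kind would use to choose a codeword length, with groups containing more zeros receiving shorter codewords.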
Because the current SVC draft has extended a single layer coding algorithm of the conventional H.264 to a multilayer coding algorithm, there is some overhead in the coding of each layer.
In the current SVC draft, as in the conventional H.264, entropy coding refers to the characteristics of surrounding blocks in the same layer. In a video consisting of multiple layers, however, the characteristics of the lower-layer blocks corresponding to the blocks of the current layer may additionally be used. Therefore, a method that uses the characteristics of lower layers in SVC-based entropy coding needs to be designed.
JVT-P056 presents a technique that uses one fixed VLC table when coding refinement bits in the refinement pass. However, considering that the distribution of zeros (“0”) differs among frames, slices, macroblocks, and transformation blocks (blocks generated after the discrete cosine transformation (DCT)), one VLC table is not enough.
FIG. 1 shows the ratio of non-zero bits in the FGS layers in a case where a single VLC table is used in a refinement pass. Referring to FIG. 1, as the number of FGS layers increases, the proportion of non-zeros among the blocks to be coded increases by a maximum of 15% (zeros decrease by a maximum of 15%). Hence, even though a single VLC table, which assumes a high number of zeros, is efficient in the first FGS layer, it is not guaranteed to be efficient in the upper FGS layers. Rather, it would be more efficient to use different VLC tables for each FGS layer. Therefore, there is a need for a technique that uses different tables for each FGS layer in a refinement pass.
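The cost of reusing one table across layers with different zero distributions can be illustrated with a small sketch. It is not the JVT-P056 table: a Huffman code over four-bit groups is used here as a stand-in for a VLC table, refinement bits are assumed to be independent, and the zero probabilities 0.90 and 0.75 are hypothetical values chosen only to mirror a 15% drop in zeros. All function names are illustrative.

```python
import heapq
from itertools import product


def group_probabilities(p_zero, group_size=4):
    """Probability of each bit group, assuming independent bits (an assumption)."""
    probs = {}
    for group in product((0, 1), repeat=group_size):
        zeros = group.count(0)
        probs[group] = p_zero ** zeros * (1 - p_zero) ** (group_size - zeros)
    return probs


def huffman_code_lengths(probs):
    """Return the Huffman codeword length of each symbol in probs."""
    # Heap entries: (probability, unique tie-breaker, symbols in the subtree).
    heap = [(p, i, [s]) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    lengths = {s: 0 for s in probs}
    counter = len(heap)
    while len(heap) > 1:
        p1, _, syms1 = heapq.heappop(heap)
        p2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1              # each merge adds one bit to these symbols
        heapq.heappush(heap, (p1 + p2, counter, syms1 + syms2))
        counter += 1
    return lengths


def average_length(lengths, probs):
    """Expected codeword length in bits per group."""
    return sum(probs[s] * lengths[s] for s in probs)


# Table tuned for a zero-heavy layer (hypothetical p_zero = 0.90) ...
first_layer_table = huffman_code_lengths(group_probabilities(0.90))
# ... applied to a layer with fewer zeros (hypothetical p_zero = 0.75),
upper_probs = group_probabilities(0.75)
mismatched = average_length(first_layer_table, upper_probs)
# compared with a table built for that layer's own statistics.
retuned = average_length(huffman_code_lengths(upper_probs), upper_probs)
print(f"fixed table: {mismatched:.3f} bits/group, per-layer table: {retuned:.3f} bits/group")
```

Because a Huffman code is optimal for the distribution it is built from, the per-layer table can never be longer on average than the fixed one, which is the motivation for selecting tables per FGS layer.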