1. Field of the Invention
Apparatuses and methods consistent with the present invention relate to a multi-layer video coding technique, and more particularly, to predecoding a hybrid bitstream generated by a plurality of coding schemes.
2. Description of the Related Art
The development of information and communication technologies, including the Internet, has led to an increase in video communication. Consumers, however, are no longer satisfied with existing text-based communication schemes. To satisfy them, multimedia data containing a variety of information, including text, pictures, music and the like, has been increasingly provided. Multimedia data is usually voluminous, so it requires a storage medium having a large capacity. A wide bandwidth is also required for transmitting the multimedia data. For example, a 24-bit true color picture having a resolution of 640×480 needs a capacity of 640×480×24 bits per frame, namely, approximately 7.37 Mbits. Accordingly, a bandwidth of approximately 221 Mbits/sec is needed to transmit this data at 30 frames/second, and a storage space of approximately 1200 Gbits is needed to store a movie having a length of 90 minutes. Taking this into consideration, it is necessary to use a compressed coding scheme when transmitting multimedia data including text, pictures or sound.
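The per-frame, per-second, and 90-minute figures for uncompressed video can be checked with a short calculation. The following is a minimal sketch; the variable names are illustrative only and do not come from the source.

```python
# Uncompressed size of 640x480, 24-bit color video at 30 frames/second.
WIDTH, HEIGHT = 640, 480
BITS_PER_PIXEL = 24
FRAMES_PER_SECOND = 30
MOVIE_SECONDS = 90 * 60  # a 90-minute movie

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL
bits_per_second = bits_per_frame * FRAMES_PER_SECOND
bits_per_movie = bits_per_second * MOVIE_SECONDS

print(f"per frame : {bits_per_frame / 1e6:.2f} Mbits")     # about 7.37 Mbits
print(f"per second: {bits_per_second / 1e6:.1f} Mbits/s")  # about 221 Mbits/s
print(f"90 minutes: {bits_per_movie / 1e9:.0f} Gbits")     # roughly 1200 Gbits
```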
A basic principle of data compression is to eliminate redundancy in the data. There are three types of data redundancy: spatial redundancy, temporal redundancy, and perceptual-visual redundancy. Spatial redundancy refers to the duplication of identical colors or objects in an image. Temporal redundancy refers to little or no variation between adjacent frames in a moving picture, or the successive repetition of the same sounds in audio. Perceptual-visual redundancy refers to the insensitivity of human vision and perception to high frequencies. By eliminating these redundancies, data can be compressed.
FIG. 1 shows an environment in which video compression is applied. Original video data is compressed by a video encoder 1. Currently known Discrete Cosine Transform (DCT)-based video compression algorithms include MPEG-2, MPEG-4, H.263, and H.264. In recent years, research into wavelet-based scalable video coding has been actively conducted. Compressed video data is sent to a video decoder 3 via a network 2. The video decoder 3 decodes the compressed video data to reconstruct the original video data.
The video encoder 1 compresses the original video data so that it does not exceed the available bandwidth of the network 2, which allows the video decoder 3 to decode the compressed data. However, communication bandwidth may vary depending on the type of the network 2. For example, the available communication bandwidth of an Ethernet is different from that of a wireless local area network (WLAN). A cellular communication network may have a very narrow bandwidth. Thus, research is being actively conducted into methods for generating video data compressed at various bit-rates from the same compressed video data, in particular, scalable video coding.
Scalable video coding is a video compression technique that gives video data scalability. Scalability is the ability to generate video sequences at different resolutions, frame rates, and qualities from the same compressed bitstream. Temporal scalability can be provided using the Motion Compensated Temporal Filtering (MCTF), Unconstrained MCTF (UMCTF), or Successive Temporal Approximation and Referencing (STAR) algorithm. Spatial scalability can be achieved by a wavelet transform algorithm or by multi-layer coding, which has been actively studied in recent years. Signal-to-Noise Ratio (SNR) scalability can be obtained using Embedded Zerotree Wavelet (EZW), Set Partitioning in Hierarchical Trees (SPIHT), Embedded ZeroBlock Coding (EZBC), or Embedded Block Coding with Optimized Truncation (EBCOT).
Multi-layer video coding algorithms have recently been adopted for scalable video coding. While conventional multi-layer video coding usually uses a single video coding algorithm, increasing attention has been recently directed to multi-layer video coding using a plurality of video coding algorithms.
FIGS. 2 and 3 illustrate the structures of bitstreams generated by conventional multi-layer video coding schemes. FIG. 2 illustrates a method of generating and arranging a plurality of Advanced Video Coding (AVC) layers at different resolutions, frame rates, and bit-rates. Each layer is efficiently predicted and compressed using information from another layer. Referring to FIG. 2, multiple AVC layers are encoded at different resolutions of QCIF to SD, different frame rates of 15 Hz to 60 Hz, and different bit-rates of 32 Kbps to 3.0 Mbps, thereby achieving a wide variety of visual qualities. The method shown in FIG. 2 may reduce redundancy to some extent through interlayer prediction, but it suffers from an increase in bitstream size because an AVC layer is generated for each visual quality.
FIG. 3 shows an example of a bitstream including an AVC base layer and a wavelet enhancement layer. Here, the wavelet enhancement layer has different resolutions from QCIF to SD because the wavelet transform supports decomposition of an original image at various resolutions. The wavelet enhancement layer, which is subjected to embedded quantization, can also be encoded at bit-rates of 32 Kbps to 3.0 Mbps by arbitrarily truncating a bitstream from the tail. Further, when a hierarchical method such as MCTF is used for temporal transformation, the structure shown in FIG. 3 can provide various frame rates from 15 Hz to 60 Hz. The use of only two layers can achieve various visual qualities but does not provide high video coding performance at each visual quality.
FIG. 4 is a graph illustrating Peak Signal-to-Noise Ratio (PSNR) with respect to bit-rate for AVC and wavelet coding. As evident from FIG. 4, wavelet coding exhibits high performance at a high bit-rate or resolution while providing low performance at a low bit-rate or resolution. Conversely, AVC provides good performance at a low bit-rate. Thus, the use of a bitstream including two layers for each resolution (hereinafter referred to as an ‘AVC-wavelet hybrid bitstream’) is proposed. That is, an upper layer (‘wavelet layer’) is encoded using wavelet coding at a specific resolution while a lower layer (‘AVC layer’) is encoded using AVC. Thus, the AVC layer is used for a low bit-rate while the wavelet layer is used for a high bit-rate. Because the wavelet layer is quantized using embedded quantization, it can be encoded at various bit-rates by arbitrarily truncating a bitstream from the tail. A bit-rate must be suitably allocated to the lower layer, i.e., the AVC layer, to ensure the minimum data rate required under the given circumstances. Alternatively, as shown in FIG. 4, a critical bit-rate Bc can be allocated to provide optimum performance of an AVC-wavelet hybrid bitstream.
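The division of a target bit-rate between the two layers can be illustrated with a small calculation. This is only a sketch under stated assumptions: the function name `wavelet_budget_bps` and the numeric value used for Bc are hypothetical and are not taken from the source. The sketch assumes that the AVC layer is sent at a fixed rate and that whatever remains of the target is spent on a truncated wavelet layer.

```python
def wavelet_budget_bps(target_bps: int, avc_bps: int) -> int:
    """Return the bits per second left for the wavelet layer.

    The AVC layer is treated as non-scalable, so a target below its
    rate cannot be met by truncation alone.
    """
    if target_bps < avc_bps:
        raise ValueError("target bit-rate is below the fixed AVC layer rate")
    return target_bps - avc_bps


# Hypothetical example: the AVC layer is allocated a critical rate Bc.
BC_BPS = 256_000  # assumed value, for illustration only
for target_bps in (256_000, 512_000, 3_000_000):
    print(target_bps, wavelet_budget_bps(target_bps, BC_BPS))
```

At a target equal to the AVC rate, the wavelet layer receives no budget, and each additional bit of target rate goes to the embedded wavelet data.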
FIG. 5 illustrates a multi-layer coding method using two different coding algorithms for each resolution. Here, a video encoder uses both an AVC coding algorithm offering excellent coding efficiency and a wavelet coding technique providing excellent scalability. While the bitstream shown in FIG. 3 has only two layers, i.e., a wavelet layer and an AVC layer, the bitstream shown in FIG. 5 has a more complex layer structure, i.e., a wavelet layer and an AVC layer for each resolution. In this arrangement, the wavelet layer is used not for resolution scalability but for SNR scalability. To provide temporal scalability, MCTF or UMCTF may be used.
To adjust the bit-rate of an AVC-wavelet hybrid bitstream, the texture data in a wavelet layer bitstream, which contains both texture data and motion data, can be truncated from the tail. When there is no more texture data to truncate, the entire motion data should be truncated because the motion data is not scalable. However, when implementing SNR scalability, it is not desirable to retain the motion data when little texture data remains. Therefore, there is a need to develop a method for adjusting an SNR scale that is suitable for an AVC-wavelet hybrid bitstream.
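The conventional truncation behavior described above can be sketched as follows. This is a minimal illustration and not a method taken from the source: the `WaveletLayer` type, the `truncate_wavelet_layer` function, and the byte-level budget are hypothetical. The sketch assumes that any prefix of the texture data is decodable because of embedded quantization, and that the motion data is kept whole or dropped whole.

```python
from dataclasses import dataclass


@dataclass
class WaveletLayer:
    motion: bytes   # non-scalable: kept whole or dropped whole
    texture: bytes  # embedded-quantized: any prefix is assumed decodable


def truncate_wavelet_layer(layer: WaveletLayer, budget_bytes: int) -> WaveletLayer:
    """Fit a wavelet layer into budget_bytes by truncating from the tail."""
    if budget_bytes < 0:
        raise ValueError("budget must be non-negative")
    if budget_bytes >= len(layer.motion):
        # Keep all motion data and as much leading texture as still fits.
        texture_bytes = budget_bytes - len(layer.motion)
        return WaveletLayer(layer.motion, layer.texture[:texture_bytes])
    # No texture is left and the motion data cannot be cut partially,
    # so the whole wavelet layer is dropped.
    return WaveletLayer(b"", b"")


# Hypothetical layer: 4 bytes of motion data and 10 bytes of texture data.
layer = WaveletLayer(motion=b"M" * 4, texture=b"T" * 10)
for budget in (14, 9, 4, 3):
    kept = truncate_wavelet_layer(layer, budget)
    print(budget, len(kept.motion), len(kept.texture))
```

With a budget of 4 bytes in this example, the output keeps all of the motion data but no texture data, which is the situation the passage identifies as undesirable for SNR scalability.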