The present invention generally relates to video compression, and more particularly to a method for encoding enhancement layer video data in an embedded fashion in order to achieve fine granular scalable video.
Scalable video coding is a desirable feature for many multimedia applications and services that are used in systems employing decoders with a wide range of processing power. Several types of video scalability schemes have been proposed such as temporal, spatial and quality scalability. All of these types consist of a base layer and an enhancement layer. The base layer is the minimum amount of data required to decode the video stream, while the enhancement layer is the additional data required to provide an enhanced video signal.
For each type of video scalability scheme, a particular scalability structure is defined. One type of structure is known as fine granularity scalability (FGS), which has been proposed and will soon become part of the MPEG-4 multimedia standard. The use of FGS primarily targets applications where video is streamed over heterogeneous networks in real time. Further, FGS enables the bandwidth to be adapted by encoding content once for a range of different bit rates, which enables a video transmission server to change the transmission rate dynamically without in-depth knowledge of the video stream and without parsing it.
Currently, there is an implementation of the proposed FGS structure in MPEG-4 as a reference for the core experiment on this standardization activity. This particular implementation uses the current MPEG-4 coding standard as the base layer encoding scheme. The MPEG-4 implementation also encodes the enhancement layer as the difference between the discrete cosine transform (DCT) coefficients of the original picture and the reconstructed base layer DCT coefficients. Further, the enhancement coding scans through the difference (or residual) DCT coefficients bit-plane by bit-plane to encode a series of 1's and 0's as a refinement of the base layer DCT coefficients.
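The residual and bit-plane decomposition described above can be illustrated with a minimal sketch. It is not the MPEG-4 reference implementation: the function name `residual_bit_planes`, the flat list representation of a block, and the separate sign list are assumptions made for illustration, and entropy coding of the resulting bits is omitted.

```python
def residual_bit_planes(original, base_recon):
    """Split the residual DCT magnitudes into bit planes, most significant first.

    original, base_recon: equal-length lists of integer DCT coefficients
    (hypothetical flat representation of one block).
    """
    # Enhancement-layer residual: original minus reconstructed base layer.
    residual = [o - b for o, b in zip(original, base_recon)]
    magnitudes = [abs(r) for r in residual]
    # Sign handling is an assumption here: 1 marks a negative residual.
    signs = [1 if r < 0 else 0 for r in residual]
    # Number of planes needed to represent the largest magnitude.
    num_planes = max(magnitudes, default=0).bit_length()
    planes = [
        [(m >> p) & 1 for m in magnitudes]
        for p in range(num_planes - 1, -1, -1)
    ]
    return signs, planes


signs, planes = residual_bit_planes([45, -12, 7, 0], [40, -8, 8, 0])
# Residual is [5, -4, -1, 0], so three bit planes are needed.
print(signs)   # [0, 1, 1, 0]
print(planes)  # [[1, 1, 0, 0], [0, 0, 0, 0], [1, 0, 1, 0]]
```

Each inner list is one bit plane; sending the planes in this order refines the base layer coefficients progressively.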
One major limitation of the above-described implementation is that the enhancement layer encoder scans each individual bit plane of the residual DCT coefficients from the most significant to the least significant bit, block by block. In other words, for each bit plane, a whole DCT coefficient block is scanned before subsequent blocks are scanned. Thus, this requires coding of one bit-plane of all of the DCT coefficients for the whole image in order to refine the entire picture. Therefore, the enhancement layer bit stream generated by this implementation contains only a limited number of scalability layers.
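The scan order just described can be made concrete with a short sketch. The function name `block_by_block_order` and the (plane, block, coefficient) index triples are illustrative assumptions; the sketch only enumerates the visiting order and does not encode anything.

```python
def block_by_block_order(num_planes, num_blocks, coeffs_per_block):
    """List (plane, block, coeff) indices in the block-by-block scan order."""
    order = []
    # Most significant bit plane first.
    for plane in range(num_planes - 1, -1, -1):
        # Within a plane, finish one whole block before the next one starts.
        for block in range(num_blocks):
            for coeff in range(coeffs_per_block):
                order.append((plane, block, coeff))
    return order


order = block_by_block_order(num_planes=2, num_blocks=2, coeffs_per_block=3)
print(order[:6])
# [(1, 0, 0), (1, 0, 1), (1, 0, 2), (1, 1, 0), (1, 1, 1), (1, 1, 2)]
```

In this ordering the last block receives no refinement from a given bit plane until every coefficient of every earlier block has been sent for that plane, so the whole picture is refined only once a complete bit plane has been coded.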
Embedded or progressive coding of still images was first utilized for wavelet image coding, which was later extended to DCT image coding. Thus, embedded DCT coding algorithms have been proposed in the past. These coding algorithms retained high compression efficiency while achieving high scalability in the resulting bit streams. Therefore, these algorithms may be alternatives for the FGS encoding structure.
The present invention is directed to a method for encoding video data in an embedded fashion in order to achieve fine granular scalable video. The method according to the present invention still scans the DCT coefficients bit-plane by bit-plane. However, the present invention differs in that it incorporates DCT frequency domain scanning in addition to spatial and bit-plane scanning.
The method according to the present invention includes transforming the video data into a plurality of DCT coefficients. Further, the DCT coefficients are arranged into sub-groups and are scanned according to the sub-groups. Scanning the DCT coefficients by sub-group enables a higher level of scalability to be achieved.
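One possible reading of sub-group scanning is sketched below. The loop nesting (bit plane, then sub-group, then block), the function name `subgroup_order`, and the example sub-group boundaries are all assumptions made for illustration; this passage does not specify them.

```python
def subgroup_order(num_planes, num_blocks, subgroups):
    """List (plane, block, coeff) indices when scanning by frequency sub-group.

    subgroups: list of lists of coefficient indices, e.g. low frequencies first
    (hypothetical partition of each block's DCT coefficients).
    """
    order = []
    for plane in range(num_planes - 1, -1, -1):
        # Frequency-domain scan: one sub-group at a time.
        for group in subgroups:
            # Spatial scan: visit every block for the current sub-group.
            for block in range(num_blocks):
                for coeff in group:
                    order.append((plane, block, coeff))
    return order


order = subgroup_order(num_planes=1, num_blocks=2, subgroups=[[0], [1, 2]])
print(order)
# [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 2), (0, 1, 1), (0, 1, 2)]
```

Under these assumptions every block receives its lowest-frequency bits before any block receives higher-frequency bits, so the bit stream gains a natural truncation point after each sub-group rather than only after each full bit plane.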