The delivery of multimedia information to mobile devices over wireless channels and/or the Internet is a challenging problem because multimedia transport suffers from bandwidth fluctuation, random errors, burst errors and packet losses. The MPEG-4 committee has therefore adopted various techniques to address error-resilient delivery of video information for multimedia communications. It is even more challenging, however, to simultaneously stream or multicast video over the Internet or wireless channels to a wide variety of devices, because the video quality cannot be optimized for any one particular device, bit-rate or channel condition. The compressed video information is often lost due to congestion, channel errors and transport jitter. In addition, the temporally predictive nature of most compression technologies causes the undesirable effect of error propagation.
To address broadcast and Internet multicast applications, the MPEG-4 committee further developed the fine granularity scalability (FGS) profile, which provides a scalable approach for streaming video applications. The MPEG-4 FGS representation starts by separating the video frames into two layers with identical spatial resolutions, which are referred to as the base layer and the enhancement layer. The bit-stream of the base layer is coded by a non-scalable MPEG-4 advanced simple profile (ASP), while the enhancement layer is obtained by coding, bit-plane by bit-plane, the difference between the original DCT (discrete cosine transform) coefficients and the coarsely quantized coefficients of the base layer. The FGS enhancement layer can be truncated at any location, which provides fine granularity of reconstructed video quality proportional to the number of bits actually decoded. There is no temporal prediction in the FGS enhancement layer, which gives the decoder an inherent robustness in recovering from errors. However, the lack of temporal dependency in the FGS enhancement layer decreases the coding efficiency as compared to that of the single-layer non-scalable scheme defined by the MPEG Video Group.
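The bit-plane coding of the enhancement layer described above can be illustrated with a minimal sketch. It is not the MPEG-4 reference implementation: the coefficient values, the quantizer step of 16, the truncate-toward-zero quantizer and all function names are assumptions chosen for illustration, and the run-length and variable-length coding that a real FGS coder applies to each bit-plane is omitted.

```python
def quantize(coeffs, step):
    """Coarse base-layer quantization (assumed: truncate toward zero)."""
    return [int(c / step) * step for c in coeffs]


def to_bit_planes(residual):
    """Split residual magnitudes into bit-planes, most significant plane first."""
    n_planes = max((abs(r) for r in residual), default=0).bit_length()
    planes = [[(abs(r) >> p) & 1 for r in residual]
              for p in range(n_planes - 1, -1, -1)]
    signs = [-1 if r < 0 else 1 for r in residual]
    return planes, signs


def from_bit_planes(planes, signs, n_received):
    """Rebuild the residual from only the first n_received bit-planes."""
    n_total = len(planes)
    mags = [0] * len(signs)
    for i, plane in enumerate(planes[:n_received]):
        weight = 1 << (n_total - 1 - i)
        mags = [m + bit * weight for m, bit in zip(mags, plane)]
    return [s * m for s, m in zip(signs, mags)]


dct = [93, -47, 18, -6, 3, 0, -1, 2]        # made-up DCT coefficients
base = quantize(dct, step=16)                # base-layer reconstruction
residual = [c - b for c, b in zip(dct, base)]
planes, signs = to_bit_planes(residual)

# Total absolute reconstruction error after decoding 0, 1, 2, ... bit-planes.
errors = []
for n in range(len(planes) + 1):
    enh = from_bit_planes(planes, signs, n)
    recon = [b + e for b, e in zip(base, enh)]
    errors.append(sum(abs(c - r) for c, r in zip(dct, recon)))
print(errors)  # [42, 26, 14, 4, 0]
```

In this example each additional bit-plane that is decoded reduces the reconstruction error, and no plane depends on any other frame, which is the property that allows the enhancement layer to be cut at any point.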
FIGS. 1a and 1b show the overall FGS encoder and decoder structures used in MPEG-4. A detailed description of the technique used in FGS can be found in the paper “Overview of Fine Granularity Scalability in MPEG-4 Video Standard” published by W. Li in IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11, No. 3, March 2001. The base layer uses non-scalable coding to reach the lower bound of the bit-rate range. The enhancement layer codes the difference between the original picture and the reconstructed picture using bit-plane coding of the DCT coefficients.
In FIG. 1a, the functional block labeled “Find Maximum” finds the maximum number of bit-planes in a frame. The FGS decoder structure shown in FIG. 1b is the one standardized in the amendment of MPEG-4. The bit-stream of the enhancement layer may be truncated to any number of bits per picture after coding is completed. The decoder should be able to reconstruct an enhancement layer video from the bit-streams of the base layer and the truncated enhancement layer. The quality of the enhancement layer video is proportional to the number of bits decoded by the decoder for each picture.
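The role of the “Find Maximum” block and the effect of truncating the enhancement bit-stream at an arbitrary bit position can be sketched as follows. This is a simplified illustration under stated assumptions: the residual values are made up, signs are ignored, the bits are written uncompressed plane by plane, and the function names are hypothetical rather than taken from the standard.

```python
def max_bit_planes(blocks):
    """Number of bit-planes needed for the largest residual magnitude in a frame."""
    peak = max((abs(v) for block in blocks for v in block), default=0)
    return peak.bit_length()


def serialize(blocks, n_planes):
    """Emit magnitude bits plane by plane, most significant plane first."""
    flat = [abs(v) for block in blocks for v in block]
    return [(m >> p) & 1 for p in range(n_planes - 1, -1, -1) for m in flat]


def decode_truncated(bits, n_values, n_planes):
    """Rebuild magnitudes from a possibly truncated bit list; missing bits count as 0."""
    mags = [0] * n_values
    for i, bit in enumerate(bits):
        plane, idx = divmod(i, n_values)
        mags[idx] += bit << (n_planes - 1 - plane)
    return mags


blocks = [[5, 0, 2, 1], [12, 3, 0, 0]]      # made-up residuals for two blocks
n_planes = max_bit_planes(blocks)            # largest value is 12, so 4 planes
bits = serialize(blocks, n_planes)           # 4 planes x 8 values = 32 bits

full = decode_truncated(bits, 8, n_planes)          # all bits received
partial = decode_truncated(bits[:13], 8, n_planes)  # cut in the middle of plane 2
print(n_planes, full, partial)
```

Because the most significant planes are sent first, a stream cut after 13 of the 32 bits still yields a coarse approximation of the residuals instead of an undecodable picture.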
To improve the MPEG-4 FGS framework, a motion compensation based FGS (MC-FGS) technique with a high-quality reference frame was disclosed to remove the temporal redundancy in both the base and enhancement layers. The advantage of the conventional MC-FGS technique is that it can achieve high compression efficiency, close to that of the non-scalable approach, in an error-free transport environment. However, the MC-FGS technique suffers from the disadvantage of error propagation, or drift, when part of the enhancement layer is corrupted or lost.
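The drift problem can be illustrated with a toy model. The sketch below is not the MC-FGS algorithm itself: it assumes a one-dimensional signal with no motion, a truncate-toward-zero quantizer with a made-up step of 8, and a decoder that loses the entire enhancement layer. It compares an encoder whose prediction reference uses the base layer only (FGS-like) with one whose reference also includes the enhancement layer (MC-FGS-like).

```python
def quantize(value, step):
    """Coarse base-layer quantizer (assumed: truncate toward zero)."""
    return int(value / step) * step


def base_only_errors(signal, step, enh_in_reference):
    """Per-frame error at a decoder that receives the base layer only."""
    enc_ref = 0   # encoder's prediction reference
    dec_ref = 0   # decoder's prediction reference
    errors = []
    for x in signal:
        residual = x - enc_ref
        base = quantize(residual, step)     # sent in the base layer
        enh = residual - base               # sent in the enhancement layer
        # Encoder reference: base only (FGS-like) or base + enhancement (MC-FGS-like).
        enc_ref += base + (enh if enh_in_reference else 0)
        # The decoder never sees enh, so it can only add the base residual.
        dec_ref += base
        errors.append(abs(x - dec_ref))
    return errors


signal = [13, 24, 37, 50, 61, 75]           # made-up sample values
step = 8
fgs_like = base_only_errors(signal, step, enh_in_reference=False)
mc_fgs_like = base_only_errors(signal, step, enh_in_reference=True)
print(fgs_like)     # [5, 0, 5, 2, 5, 3]
print(mc_fgs_like)  # [5, 8, 13, 18, 21, 27]
```

In the FGS-like case the encoder and decoder references stay identical, so the error remains below one quantizer step. In the MC-FGS-like case the references diverge and the error accumulates from frame to frame, which is the drift described above.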
Similarly, another conventional technique, PFGS (progressive fine granularity scalable) coding, improves the coding efficiency of FGS while also providing means to alleviate the error drift problem. To remove the temporal redundancy, the PFGS technique adopts a separate prediction loop that contains a high-quality reference frame, in which a partial temporal dependency is used to encode the enhancement layer video. Thus, the PFGS technique trades coding efficiency for a certain level of error robustness. To address the drift problem, the PFGS technique keeps a prediction path from the base layer to the highest bit-planes of the enhancement layer across several frames, so that the coding scheme can gracefully recover from errors over a few frames. The PFGS technique suffers from a loss of coding efficiency whenever a lower-quality reference frame is used. This disadvantageous situation occurs when only a limited number of bit-planes are used or when a reset of the reference frame is invoked.