The present invention generally relates to video coding, and more particularly to rearranging the transmission order of enhancement layer frames.
In MPEG-4 base-layer decoders as well as MPEG-2 decoders for that matter, the transmission order of the various frames differs from the display order. An example of this is shown in FIG. 1. As can be seen, the transmission order of both the base layer frames and corresponding enhancement layer frames differs from the display order.
The reason for the rearrangement of the frames of FIG. 1 is that the bi-directional motion compensation (MC) employed for the B-frames requires the anchor frames (I- and P-frames) on which the prediction is based to be already available in memory at the encoder/decoder side when the B-frames are encoded/decoded. This requires the I- and P-frames to be transmitted to the decoder prior to the B-frames. However, since the B-frames are typically displayed between the I- and P-frames, the transmission order and the display order of the frames differ due to the MC-prediction.
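The reordering described above can be illustrated with a minimal sketch. The frame labels (such as "I1", "B2", "P3") and the function name are hypothetical and are not taken from FIG. 1; the sketch merely assumes that each anchor frame is sent ahead of the B-frames that precede it in the display order.

```python
def transmission_order(display_order):
    """Return a transmission order in which every anchor (I or P) frame
    is sent before the B-frames that precede it in display order.

    Frame labels are hypothetical strings whose first character is the
    frame type, e.g. "I1", "B2", "P3".
    """
    transmitted = []
    pending_b = []  # B-frames waiting for their following anchor frame
    for frame in display_order:
        if frame[0] == "B":
            pending_b.append(frame)
        else:
            # Send the anchor first, then the B-frames that depend on it.
            transmitted.append(frame)
            transmitted.extend(pending_b)
            pending_b = []
    # Assumption: trailing B-frames with no later anchor are sent last.
    transmitted.extend(pending_b)
    return transmitted


print(transmission_order(["I1", "B2", "P3", "B4", "P5"]))
```

For the display order I1, B2, P3, B4, P5, this sketch yields the transmission order I1, P3, B2, P5, B4.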
A block diagram of one example of a scalable (layered) decoder is shown in FIG. 2. During operation, the decoder 2 receives the encoded base and enhancement layer frames in the transmission order shown in FIG. 1. Further, the decoder 2 will decode and reorder these frames into the display order shown in FIG. 1.
As can be seen, the decoder 2 includes two separate paths for decoding the base layer and enhancement layer bit-streams. Since these two paths are separate, the decoding processes of the two streams do not need to be synchronized.
The path for the base layer stream includes a variable length decoder 4, an inverse quantization block 6 and an inverse discrete cosine transform (IDCT) block 8 to convert the base layer bit-stream into picture frames. A motion compensation block 12 is also included for performing motion compensation on picture frames previously stored in a frame memory 14 based on the received motion vectors. Further, an adder 10 is also included to combine the outputs of the IDCT block 8 and the motion compensation block 12.
The path for the enhancement layer stream includes a variable length decoder (VLD) 15, a bit-plane decoding block 17 and another IDCT block 18 to convert the enhancement layer bit-stream into picture frames. During operation, the bit-plane decoding block 17 will decode the output of the variable length decoder 15 into individual bit planes using any suitable fine granular scalable decoding technique.
As can be further seen, a bit plane memory 16 is also included to store the individual bit planes until all of the bit planes for a current frame are decoded. Further, after the IDCT block 18 a frame memory 22 is included. The frame memory 22 is used to compensate for the encoded frames being received in a transmission order different from the display order, as shown in FIG. 1.
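As an illustration only, the accumulation of bit planes into coefficient values might be sketched as follows. The function name, the most-significant-plane-first ordering and the `total_planes` parameter are assumptions made for this sketch; sign handling and the variable length decoding itself are omitted, and no particular fine granular scalable technique is implied.

```python
def assemble_bit_planes(planes, total_planes):
    """Rebuild coefficient magnitudes from the bit planes received so far.

    planes       -- list of bit planes, most significant plane first;
                    each plane is a list of 0/1 values, one per coefficient
    total_planes -- number of planes the encoder produced for this frame
                    (hypothetical parameter for this sketch)
    """
    if not planes:
        return []
    values = [0] * len(planes[0])
    for plane in planes:
        # Shift in one more bit of precision for every coefficient.
        values = [(v << 1) | bit for v, bit in zip(values, plane)]
    # Planes that were never received count as zeros in the low-order bits.
    missing = total_planes - len(planes)
    return [v << missing for v in values]


all_planes = [[1, 0, 1], [0, 1, 1]]
print(assemble_bit_planes(all_planes, total_planes=2))
print(assemble_bit_planes(all_planes[:1], total_planes=2))
```

With both planes available the sketch returns the magnitudes 2, 1 and 3. With only the most significant plane it returns the coarser approximation 2, 0 and 2, which is the sense in which decoding quality scales with the number of planes received.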
For example, if the enhancement layer frames are transmitted at the same time instance as the corresponding base-layer frames, the frame memory 22 is required to store each enhancement-layer frame until its display time, which coincides with the display time of the corresponding base-layer frame. Referring back to the transmission order of FIG. 1, the enhancement frame E3, after being decoded, is stored in the frame memory 22 until after the enhancement frame E2 is decoded and displayed. Thereafter, the enhancement frame E3 is retrieved from the frame memory 22 and then displayed. In this manner, the transmission order of the frames is converted into the display order, as shown in FIG. 1.
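The role of the frame memory 22 in this conversion can be sketched as follows. The integer display indices, the "E" labels generated from them and the function name are hypothetical; the sketch assumes each decoded enhancement frame carries its display index and is held until every earlier frame has been displayed.

```python
def reorder_for_display(transmitted_indices):
    """Convert enhancement frames from transmission order to display order.

    transmitted_indices -- display indices (1-based) in the order the
                           frames arrive, e.g. [1, 3, 2] for E1, E3, E2
    """
    frame_memory = {}   # stands in for the frame memory 22
    next_to_display = 1
    displayed = []
    for index in transmitted_indices:
        # Store the newly decoded frame under its display index.
        frame_memory[index] = f"E{index}"
        # Release every stored frame whose display time has arrived.
        while next_to_display in frame_memory:
            displayed.append(frame_memory.pop(next_to_display))
            next_to_display += 1
    return displayed


print(reorder_for_display([1, 3, 2, 5, 4]))
```

In this sketch E3 remains in the memory until E2 has been decoded and displayed, after which E3 is retrieved and displayed in turn.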
The decoder 2 also includes another adder 20 to combine the picture frames from each of the paths in order to produce enhanced video 24. The enhanced video 24 can be either displayed immediately in real time or stored in an output frame memory for display at a later time.
The present invention is directed to a method for encoding video data. The method includes coding a portion of the video data to produce base layer frames, coding another portion of the video data to produce enhancement layer frames, and rearranging the enhancement layer frames into a display order.
The present invention is also directed to a method for decoding a video signal including a base layer and an enhancement layer, where the enhancement layer includes enhancement frames arranged in a display order. The method includes decoding the base layer to produce decoded base layer frames, decoding the enhancement layer to produce decoded enhancement layer frames, and rearranging the decoded base layer frames into the display order. The method further includes combining the decoded base layer frames with the decoded enhancement layer frames to form video frames, without storing any of the decoded enhancement layer frames.
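A minimal sketch of such a decoding method is given below, under several assumptions that are not part of the description: frames are modelled as short lists of pixel values, base layer frames carry a hypothetical 1-based display index, the enhancement stream is modelled as an iterator that is read on demand, and the combination is a plain element-wise addition. Only the base layer frames are buffered for reordering.

```python
def decode_layers(base_transmitted, enhancement_in_display_order):
    """Combine base and enhancement frames into display-order video frames.

    base_transmitted             -- (display_index, pixels) pairs in
                                    transmission order
    enhancement_in_display_order -- enhancement residuals, already in
                                    display order
    """
    base_memory = {}                  # only base layer frames are stored
    enhancement = iter(enhancement_in_display_order)
    next_to_display = 1
    video = []
    for index, pixels in base_transmitted:
        base_memory[index] = pixels
        while next_to_display in base_memory:
            base = base_memory.pop(next_to_display)
            # The enhancement frame is used as soon as it is read and is
            # never placed in a frame memory.
            residual = next(enhancement)
            video.append([b + r for b, r in zip(base, residual)])
            next_to_display += 1
    return video


base = [(1, [10, 10]), (3, [30, 30]), (2, [20, 20])]
enhancement = [[1, 1], [2, 2], [3, 3]]
print(decode_layers(base, enhancement))
```

Because the enhancement frames already arrive in the display order, each one can be added to its reordered base layer frame directly, so no counterpart of the frame memory 22 appears in the sketch.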