Video streaming over Internet Protocol (IP) networks has enabled a wide range of multimedia applications. Internet video streaming provides real-time delivery and presentation of continuous media content while compensating for the lack of Quality-of-Service (QoS) guarantees over the Internet. Due to the variation and unpredictability of bandwidth and other performance parameters (e.g., packet loss rate) over IP networks, most of the proposed streaming solutions are based on some type of layered (or scalable) video coding scheme.
FIGS. 1A and 1B illustrate exemplary scalability structures 10A, 10B of one type of scalable video coding scheme known as hybrid temporal-SNR Fine Granular Scalability (FGS HS), as described in detail in the earlier-mentioned, commonly assigned, copending U.S. patent application Ser. No. 09/590,825. Each FGS HS structure 10A, 10B includes a Base Layer 11A, 11B (BL) and an Enhancement Layer 12A, 12B (EL). The BL part of a scalable video stream represents, in general, the minimum amount of data needed for decoding that stream. The EL part of the stream represents additional information, i.e., FGS SNR frames or pictures and FGS temporal frames or pictures (denoted FGST), that enhances the video signal representation when decoded by the receiver. In particular, the additional temporal frames are introduced to obtain a higher frame-rate. The MPEG-4 FGS standard supports both the bi-directional predicted FGST picture type of FIG. 1A and the forward-predicted FGST picture type of FIG. 1B.
FIG. 2 illustrates the functional architecture of an exemplary FGS HS video encoder 100 as described in U.S. patent application Ser. No. 09/590,825. The encoding operation is based on a DCT transform, although other transforms (e.g., wavelet) can also be used. This video encoder 100 is capable of generating the FGS HS structures 10A, 10B of FIGS. 1A and 1B. The video encoder 100 comprises a BL encoder 110 and an EL encoder 130. The video encoder 100 receives an original video signal, which is processed into a BL bit stream of I and P frames by the BL encoder 110 and into an EL bit stream of FGS SNR I and P frames and/or P and B FGST frames by the EL encoder 130.
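By way of illustration only, the division of a signal into a BL and a finely truncatable EL can be sketched as follows. This is a minimal toy sketch and not the encoder 100 of FIG. 2: it assumes integer transform coefficients are already available, uses a hypothetical base-layer quantization step `bl_step`, and represents the EL as the bit planes of the quantization residual. The function names `split_layers` and `decode` are likewise hypothetical.

```python
def split_layers(coeffs, bl_step):
    """Split integer coefficients into a coarse base layer and a residual."""
    # Base layer: coarse quantization, truncating toward zero.
    base = [int(c / bl_step) for c in coeffs]
    # Enhancement layer source: what the base layer failed to represent.
    residual = [c - b * bl_step for c, b in zip(coeffs, base)]
    return base, residual


def decode(base, residual, bl_step, planes_kept, total_planes):
    """Reconstruct from the base layer plus the top `planes_kept` bit planes."""
    drop = total_planes - planes_kept
    out = []
    for b, r in zip(base, residual):
        # Zero the least significant `drop` bit planes of the residual magnitude.
        mag = (abs(r) >> drop) << drop
        out.append(b * bl_step + (mag if r >= 0 else -mag))
    return out


coeffs = [37, -21, 8, -3, 0, 14]          # assumed example coefficients
base, residual = split_layers(coeffs, bl_step=16)
total_planes = max(1, max(abs(r) for r in residual).bit_length())

base_only = decode(base, residual, 16, 0, total_planes)
partial = decode(base, residual, 16, 2, total_planes)
full = decode(base, residual, 16, total_planes, total_planes)
```

In this sketch the EL can be cut after any number of bit planes: with none, only the coarse BL reconstruction is obtained; with all of them, the original coefficients are recovered exactly.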
In the FGS HS structures of FIGS. 1A and 1B, the FGST frames are predicted from low-quality base-layer reference frames stored in the frame memory block. Consequently, the resulting motion-compensated residual error is high, and a large number of bits is required for compressing these frames. As a result, the transition to a higher frame-rate is performed at either low bit-rates or very high bit-rates.
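The effect of reference quality on the motion-compensated residual can be illustrated with a small numerical sketch. It is a one-dimensional toy model under stated assumptions, not the prediction loop of any actual encoder: the "frame" is a short list of sample values, the motion is a known shift of one sample, and reference quality is modeled by a hypothetical quantization step. The finer step is included only as a comparison point.

```python
def quantize(frame, step):
    """Model a reconstructed reference frame as a floor-quantized copy."""
    return [(x // step) * step for x in frame]


def predict(reference, shift):
    """Motion-compensate by shifting the reference, repeating the edge sample."""
    return [reference[max(i - shift, 0)] for i in range(len(reference))]


def residual_energy(current, prediction):
    """Sum of squared prediction errors, a rough proxy for bits needed."""
    return sum((c - p) ** 2 for c, p in zip(current, prediction))


previous = [10, 23, 37, 52, 68, 85, 103, 122]   # assumed example samples
current = predict(previous, shift=1)             # same content, moved by one sample

coarse_ref = quantize(previous, step=16)         # stands in for a low-quality reference
fine_ref = quantize(previous, step=2)            # comparison point only

energy_coarse = residual_energy(current, predict(coarse_ref, shift=1))
energy_fine = residual_energy(current, predict(fine_ref, shift=1))
```

Even with the motion known exactly, the quantization error of the coarse reference passes directly into the residual, so `energy_coarse` is far larger than `energy_fine` in this example.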
Accordingly, a technique is needed that lowers the bandwidth required for introducing FGST frames in an FGS HS video coding scheme.