1. Field
The invention relates to scalable encoding and decoding of multimedia data that may comprise audio data, video data or both. More particularly, the invention relates to a system and method for scalable encoding and decoding of multimedia data using multiple layers.
2. Background
The International Telecommunication Union (ITU) has promulgated the H.261, H.262, H.263 and H.264 standards for digital video encoding. These standards specify the syntax of encoded digital video data and how this data is to be decoded for presentation or playback. However, these standards leave the encoder free to choose among a variety of techniques (e.g., algorithms or compression tools) for transforming the digital video data from an uncompressed format to a compressed or encoded format. Hence, many different digital video encoders are currently available. These encoders achieve varying degrees of compression at varying cost and quality levels.
Scalable video coding generates multiple layers, for example a base layer and an enhancement layer, for the encoding of video data. These two layers are generally transmitted on different channels with different transmission characteristics, resulting in different packet error rates. The base layer typically has a lower packet error rate than the enhancement layer. The base layer generally contains the most valuable information, and the enhancement layer generally offers refinements over the base layer. Most scalable video compression technologies exploit the fact that the human visual system is more forgiving of compression noise in high frequency regions of the image than in the flatter, low frequency regions. Hence, the base layer predominantly contains low frequency information and the enhancement layer predominantly contains high frequency information. When network bandwidth falls short, there is a higher probability of receiving only the base layer of the coded video, without the enhancement layer. In such situations, the reconstructed video is blurred, and deblocking filters may even accentuate this effect.
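The frequency split between the two layers can be illustrated with a minimal sketch. It assumes a one-dimensional block of samples and an orthonormal discrete cosine transform, neither of which is specified above; the function names split_layers and decode and the parameter base_count are hypothetical and chosen only for illustration. The base layer keeps the lowest-frequency coefficients and the enhancement layer keeps the remainder, so decoding the base layer alone yields a smoothed (blurred) reconstruction, while decoding both layers recovers the block.

```python
import math


def _scale(k, n):
    # Orthonormal DCT-II scaling factor for coefficient index k.
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(block):
    # Forward orthonormal DCT-II of a list of samples.
    n = len(block)
    return [
        _scale(k, n)
        * sum(x * math.cos(math.pi * (i + 0.5) * k / n) for i, x in enumerate(block))
        for k in range(n)
    ]


def idct(coeffs):
    # Inverse of dct() above (orthonormal DCT-III).
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (i + 0.5) * k / n)
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def split_layers(block, base_count):
    # Hypothetical two-layer split: the first base_count (low frequency)
    # coefficients form the base layer, the rest form the enhancement layer.
    coeffs = dct(block)
    n = len(coeffs)
    base = coeffs[:base_count] + [0.0] * (n - base_count)
    enhancement = [0.0] * base_count + coeffs[base_count:]
    return base, enhancement


def decode(base, enhancement=None):
    # Decode the base layer alone, or both layers when the enhancement
    # layer was received.
    if enhancement is None:
        return idct(base)
    return idct([b + e for b, e in zip(base, enhancement)])


block = [10.0, 12.0, 50.0, 52.0, 11.0, 9.0, 48.0, 51.0]
base, enhancement = split_layers(block, base_count=3)
full = decode(base, enhancement)   # both layers: matches the input block
blurred = decode(base)             # base layer only: fine detail is lost
```

In this sketch the base-only reconstruction preserves the block average, because the lowest-frequency coefficient is retained, but it loses the sharp transitions carried by the enhancement layer. A practical codec would additionally quantize and entropy code each layer, which is omitted here.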
Decoders generally decode either the base layer alone or both the base layer and the enhancement layer. When decoding both layers, multiple layer decoders generally require more computation and memory than single layer decoders. Many mobile devices do not utilize multiple layer decoders because of these increased computational complexity and memory requirements.