1. Field of the Invention
The present invention relates to Multiple Description Coding (MDC) techniques.
Multiple Description Coding pursues the main goal of creating several independent bitstreams using an existing coder/decoder (codec), e.g., an existing video codec. The bitstreams can be decoded independently or jointly: the more bitstreams are decoded, the higher the quality of the output signal. Multiple Description (MD) coding generally includes a pre-processing stage before the encoder, in order to split the input sequences (hereinafter, video sequences will primarily be referred to) and to control the redundancy among the subsequences. It also includes a post-processing stage after the decoder, in order to merge the received and successfully decoded substreams. Multiple Description Coding greatly improves error resilience, because each bitstream can be decoded independently. Moreover, variable bandwidth/throughput can be managed by transmitting a suitable number of descriptions. However, coding efficiency is somewhat reduced, depending on the amount of redundancy left among the subsequences.
2. Description of the Related Art
FIG. 1 of the annexed drawings is generally representative of MD coding and transmission as applied, e.g., to image/video signals.
An input image/video signal I is subject to pre-processing by a pre-processor 10 to generate therefrom e.g. four descriptions D1 to D4. These are then passed on to an encoder 20 (of any known type) to be then “transmitted” over a channel C. This may consist of a transmission channel proper (e.g. for broadcast transmission) or a recording medium (e.g. tape, disc, digital memory, and so on) onto which the encoded signals are written and from which they are subsequently read, possibly at different locations. The signals from the channel C are fed to a decoder 30 to recover multiple received descriptions D1′ to D4′ that are then merged in a post-processing stage 40 to recover an output image/video signal O.
Multiple Description Coding is essentially analogous to Scalable Coding (also known as Layered Coding). The main difference lies in the dependency among bitstreams. The simplest case is that of two bitstreams being created. In the case of Scalable Coding, they are referred to as the “base layer” and the “enhancement layer”: the latter depends on the former and cannot be decoded independently. In the case of Multiple Description Coding, on the other hand, each description can be decoded individually to obtain a base-quality video.
As is the case for Scalable Coding, there can be Spatial, Temporal or SNR Multiple Descriptions.
Replicated headers/syntax and replicated motion vectors among the bitstreams greatly impede coding efficiency in SNR MD. Replicated headers/syntax also hinder Temporal MD; in addition, motion compensation is less effective because of the increased temporal distance between frames. Spatial MD is likewise hindered by replicated headers/syntax. Unlike the case of Temporal MD, however, motion compensation is not affected, particularly when 8×8 blocks are split into smaller blocks, as in the latest H.264 codec. For these reasons, Spatial MD Coding is the best choice for video coding.
The underlying video codec can be either one of the traditional solutions based on the DCT transform and motion compensation (e.g. MPEG-x, H.26x), or one of the more recent codecs based on the 3D wavelet transform (e.g. SPIHT). The H.264 codec is promising because of its increased coding efficiency, which helps in compensating for the losses due to the replicated headers/syntax overhead. Its multimode prediction (up to 4 motion vectors per 8×8 block) is expected to assist with Spatial MD.
Multiple Description Coding has been the subject of extensive literature as witnessed, e.g., by the publications listed in the following:    P. C. Cosman, R. M. Gray, M. Vetterli, “Vector Quantization of Image Subbands: a Survey”, September 1995;    Robert Swann, “MPEG-2 Video Coding over Noisy Channels”, Signal Processing and Communication Lab, University of Cambridge, March 1998;    Robert M. Gray, “Quantization”, IEEE Transactions on Information Theory, vol. 44, no. 6, October 1998, pp. 2325-2383;    Vivek K. Goyal, “Beyond Traditional Transform Coding”, University of California, Berkeley, Fall 1998;    Jelena Kovačević, Vivek K. Goyal, “Multiple Descriptions—Source-Channel Coding Methods for Communications”, Bell Labs, Innovation for Lucent Technologies, 1998;    Jelena Kovačević, Vivek K. Goyal, Ramon Arean, Martin Vetterli, “Multiple Description Transform Coding of Images”, Proceedings of the IEEE Conference on Image Processing, Chicago, October 1998;    Sergio Daniel Servetto, “Compression and Reliable Transmission of Digital Image and Video Signals”, University of Illinois at Urbana-Champaign, 1999;    Benjamin W. Wah, Xiao Su, Dong Lin, “A Survey of Error-Concealment Schemes for Real-Time Audio and Video Transmission over the Internet”, Proceedings of the IEEE International Symposium on Multimedia Software Engineering, December 2000;    John Apostolopoulos, Susie Wee, “Unbalanced Multiple Description Video Communication Using Path Diversity”, IEEE International Conference on Image Processing (ICIP), Thessaloniki, Greece, October 2001;    John Apostolopoulos, Wai-Tian Tan, Susie Wee, Gregory W. Wornell, “Modeling Path Diversity for Multiple Description Video Communication”, ICASSP, May 2002;    John Apostolopoulos, Tina Wong, Wai-Tian Tan, Susie Wee, “On Multiple Description Streaming with Content Delivery Networks”, HP Labs, Palo Alto, February 2002, pp. 1-10;    John Apostolopoulos, Wai-Tian Tan, Susie J. Wee, “Video Streaming: Concepts, Algorithms and Systems”, HP Labs, Palo Alto, September 2002;    Rohit Puri, Kang-Won Lee, Kannan Ramchandran, Vaduvur Bharghavan, “Forward Error Correction (FEC) Codes Based Multiple Description Coding for Internet Video Streaming and Multicast”, Signal Processing: Image Communication, vol. 16, no. 8, pp. 745-762, May 2001;    Rohit Puri, Kannan Ramchandran, “Multiple Description Source Coding Through Forward Error Correction Codes”, Proceedings of the 33rd Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, Calif., October 1999; and    Rohit Puri, Kang-Won Lee, Kannan Ramchandran, Vaduvur Bharghavan, “Application of FEC Based Multiple Description Coding to Internet Video Streaming and Multicast”, Proceedings of the Packet Video 2000 Workshop, Forte Village Resort, Sardinia, Italy, May 2000.
Turning specifically to the patent literature, one may refer, by way of general background to documents WO-A-2004/057876, WO-A-2004/046879, WO-A-2004/047425, WO-A-2004/014083 and, as it more specifically regards the topics considered in the following, to documents WO-A-2003/005676, WO-A-2003/005677, WO-A-2003/0005761, WO-A-2004/032517, and WO-A-2004/056121.
To sum up, the literature referred to in the foregoing discloses a wide gamut of coding schemes: overlapping quantization (MDSQ or MDVQ); correlated predictors; overlapped orthogonal transforms; correlating linear transforms (MDTC, e.g. PCT, i.e. the pairwise correlating transform, for 2 MD); correlating filter banks; interleaved spatial-temporal sampling (e.g. video redundancy coding in H.263/H.263+); spatial-temporal polyphase downsampling (PDMD, see below); domain-based partitioning (in the signal domain or in a transform domain); and FEC-based MDC (e.g. using Reed-Solomon codes).
A simple scheme for SNR (Signal-to-Noise Ratio) MD is coding of independent video flows created by means of MD quantizers, either scalar or vector (MDSQ, MDVQ). The structure of the MD quantizer controls redundancy.
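By way of illustration, the sketch below shows one of the simplest possible two-description scalar quantizer arrangements: two uniform quantizers whose cells are staggered by half a step. Each side decoder reconstructs with the coarse step, while the central decoder intersects the two cells to halve the uncertainty. The step size `DELTA` and the midpoint reconstruction rule are arbitrary choices for the example; practical MDSQ designs control redundancy through more elaborate index assignments than this one.

```python
import numpy as np

DELTA = 1.0  # coarse quantizer step (illustrative value, not from the source)

def encode_two_descriptions(x):
    """Quantize x with two staggered uniform quantizers (a simple MD scalar quantizer)."""
    i = int(np.floor(x / DELTA))        # description 1: grid aligned at 0
    j = int(np.floor(x / DELTA + 0.5))  # description 2: grid offset by DELTA/2
    return i, j

def side_decode_1(i):
    # Cell of description 1 is [i*DELTA, (i+1)*DELTA): reconstruct at its midpoint.
    return (i + 0.5) * DELTA

def side_decode_2(j):
    # Cell of description 2 is [(j-0.5)*DELTA, (j+0.5)*DELTA): midpoint is j*DELTA.
    return j * DELTA

def central_decode(i, j):
    """Intersect the two cells: the joint cell is half as wide as either side cell."""
    lo = max(i * DELTA, (j - 0.5) * DELTA)
    hi = min((i + 1) * DELTA, (j + 0.5) * DELTA)
    return 0.5 * (lo + hi)
```

With this arrangement, either description alone yields a reconstruction error bounded by DELTA/2 (base quality), while receiving both descriptions bounds the error by DELTA/4 (enhanced quality), which is the redundancy/quality trade-off described above.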
A simple scheme for Spatial/Temporal MD is coding of independent video flows created by means of Spatial or Temporal Polyphase Downsampling (PDMD). A programmable Spatial or Temporal low-pass filter controls redundancy.
As an example, Temporal MD can be achieved by separating odd and even frames, thus creating two subsequences; alternatively, odd and even fields can be separated. Spatial MD is achieved by separating the pixels of 2×1 blocks, so that two subsequences are created; alternatively, four subsequences can be created by separating the pixels of 2×2 blocks. The two techniques can also be combined. It will be noted that, unlike Temporal MD, Spatial MD requires careful processing to avoid the color artifacts caused by downsampled chroma formats and field interlacing. Each subsequence is then fed into a standard video encoder.
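The polyphase splitting just described can be sketched as follows. This is a minimal NumPy illustration operating on single-component (e.g. luma-only) frames; the careful chroma and interlacing handling that the text notes is required for Spatial MD is deliberately omitted here.

```python
import numpy as np

def temporal_pdmd(frames):
    """Temporal MD: split a frame sequence into even- and odd-indexed subsequences."""
    return frames[0::2], frames[1::2]

def spatial_pdmd_2x2(frame):
    """Spatial MD: split one frame into four subsequences, one per 2x2 polyphase position."""
    return [frame[r::2, c::2] for r in (0, 1) for c in (0, 1)]

def merge_2x2(subs, shape):
    """Post-processing: re-interleave the four polyphase components into one frame."""
    out = np.empty(shape, dtype=subs[0].dtype)
    k = 0
    for r in (0, 1):
        for c in (0, 1):
            out[r::2, c::2] = subs[k]
            k += 1
    return out
```

When all four spatial descriptions are received, re-interleaving is lossless (up to coding distortion); when some are lost, the missing polyphase positions can be interpolated from the received ones, which is what gives PDMD its error resilience.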
Another area of interest in encoding/decoding digital signals that will be referred to in the following is represented by error concealment techniques. These are again the subject of extensive literature, related to audio and/or video signals, as witnessed e.g. by WO-A-97/015888, WO-A-2003/061284, WO-A-2003/019939, WO-A-2003/017255, WO-A-2003/017555, WO-A-2002/033694, WO-A-2001/095512, WO-A-2001/089228, WO-A-2000/027129.