1. Field of the Invention
The present application generally relates to an encoder rate-estimation method and system which does not require any decoder feedback and, more particularly, to Slepian-Wolf code rate estimation in a Wyner-Ziv encoder which allows for correct decoding of compressed data.
2. Background Description
H.264 is a standard for video compression. It is also known as MPEG-4 Part 10, or MPEG-4 AVC (for Advanced Video Coding). Conventional video compression systems, such as the H.26* and MPEG* standards, are based on the use of differential predictive encoding. This involves the encoder generating a good temporal predictor for the current video frame by the process of block-based motion estimation. The difference between the current video frame and the predictor frame is lossily compressed using transform coding, quantization and entropy coding. The compressed motion and difference information constitute the encoded bitstream. The decoder reconstructs the video frame by decoding the motion and difference information, and using a process of motion compensation.
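By way of illustration only, block-based motion estimation of the kind described above can be sketched as an exhaustive block-matching search. The function names, the sum-of-absolute-differences cost, the block size and the search range below are assumptions made for this sketch; they are not taken from any particular standard or from the present application.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the n x n block of `cur` at
    (bx, by) and the block of `ref` displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def best_motion_vector(cur, ref, bx, by, n, search):
    """Exhaustive search over displacements in [-search, search] that keep
    the candidate block inside the reference frame.
    Returns (cost, dx, dy) for the lowest-cost candidate."""
    height, width = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            if by + dy < 0 or by + dy + n > height:
                continue
            if bx + dx < 0 or bx + dx + n > width:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best


def residual_block(cur, ref, bx, by, dx, dy, n):
    """Difference between the current block and its motion-compensated
    predictor; this is what a conventional encoder would go on to
    transform, quantize and entropy code."""
    return [[cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x]
             for x in range(n)] for y in range(n)]


# Toy 8x8 frames: `cur` is `ref` shifted one pixel to the right.
ref = [[x * x + 10 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

cost, dx, dy = best_motion_vector(cur, ref, bx=2, by=2, n=4, search=1)
diff = residual_block(cur, ref, 2, 2, dx, dy, 4)
# The search recovers the displacement (-1, 0) with zero cost, so the
# residual block is all zeros.
```

Even in this toy form, every candidate displacement requires a full pass over the block, so the cost of the search grows with the square of the search range.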
While motion estimation provides efficient compression, it has a high computational cost; it typically accounts for half or more of the total encoding computation, depending on the motion estimation scheme used. Thus, in conventional video codecs, the encoder tends to be computationally heavy while the decoder tends to be computationally light. In many emerging applications, however, there is a need for low-complexity encoders, while computationally heavy decoders are allowable. Such applications include video surveillance, mobile multimedia, mobile video conferencing, and battlefield video communications.
Wyner-Ziv video coding is a new video compression paradigm which has the potential of reversing the traditional distribution of complexity in video codecs (coder/decoders); in other words, using Wyner-Ziv coding it is possible to design a video codec wherein the encoder is computationally inexpensive, while the decoder is computationally heavy. The Wyner-Ziv and Slepian-Wolf theorems in information theory address the problem of compressing a source with a correlated random variable (termed the “side-information”) available only at the decoder. The theorems show that it is possible to achieve efficient compression in such a case (though the proofs are non-constructive).
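As a worked illustration of the Slepian-Wolf result, lossless compression of a source X with side-information Y available only at the decoder is achievable at any rate above the conditional entropy H(X|Y), which can be far below H(X). The following minimal sketch computes both quantities for a toy joint distribution; the function names and the example distribution (a uniform bit observed through a channel with 10% crossover probability) are assumptions chosen for illustration.

```python
import math


def entropy(pmf):
    """H(X) in bits for a probability mass function given as a list."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)


def conditional_entropy(joint):
    """H(X|Y) in bits for a joint pmf given as joint[x][y]."""
    num_y = len(joint[0])
    p_y = [sum(row[y] for row in joint) for y in range(num_y)]
    h = 0.0
    for row in joint:
        for y, p_xy in enumerate(row):
            if p_xy > 0:
                h -= p_xy * math.log2(p_xy / p_y[y])
    return h


# X is a uniform bit; Y equals X except that it is flipped with
# probability 0.1 (an assumed correlation model for this example).
joint = [[0.45, 0.05],
         [0.05, 0.45]]

rate_without_side_info = entropy([sum(row) for row in joint])  # H(X)
rate_with_side_info = conditional_entropy(joint)               # H(X|Y)
```

For this distribution H(X) is 1 bit per symbol, while H(X|Y) is approximately 0.469 bits per symbol, even though the encoder never observes Y.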
The video compression problem can be formulated as a problem of source coding with side-information by treating the current video frame as the source, and predictor video frames as side-information. The encoder does not need to perform motion estimation, and is hence computationally light. Instead, to compress the current video frame, the encoder performs transform coding and quantization of the frame itself, and then passes the quantized coefficients as inputs to a Slepian-Wolf code. The output of the Slepian-Wolf code (as well as certain statistical information) serves as the compressed representation of the frame. The decoder generates side-information for the current frame from previously decoded frames, and uses Slepian-Wolf decoding to reconstruct it. In practice, hybrid Wyner-Ziv coding is often used, wherein every nth video frame is encoded using differential prediction, while all other frames are encoded using Wyner-Ziv coding.
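The Slepian-Wolf coding step described above can be illustrated with a toy syndrome code. In the sketch below, which is only an assumed example and not the code used in any particular Wyner-Ziv system, the encoder sends the 3-bit syndrome of a 7-bit word under the Hamming(7,4) parity-check matrix, and the decoder recovers the word from the syndrome together with side-information that differs from the source in at most one bit position. Practical systems typically use much longer codes.

```python
def syndrome(bits):
    """3-bit syndrome (as an integer 0..7) of a 7-bit word under the
    Hamming(7,4) parity-check matrix whose column i is the binary
    expansion of i + 1."""
    s = 0
    for i, b in enumerate(bits):
        if b:
            s ^= i + 1
    return s


def sw_encode(source_bits):
    """Encoder: transmit only the syndrome (3 bits instead of 7)."""
    return syndrome(source_bits)


def sw_decode(source_syndrome, side_info):
    """Decoder: the XOR of the two syndromes is the syndrome of the
    difference pattern between source and side-information. If that
    pattern has weight at most one, its syndrome identifies the
    differing position directly."""
    diff = source_syndrome ^ syndrome(side_info)
    decoded = list(side_info)
    if diff:
        decoded[diff - 1] ^= 1
    return decoded


x = [1, 0, 1, 1, 0, 0, 1]            # quantized source bits (toy data)
y_good = list(x)
y_good[4] ^= 1                       # side-information with one bit differing
y_bad = list(x)
y_bad[0] ^= 1
y_bad[1] ^= 1                        # side-information with two bits differing

recovered_good = sw_decode(sw_encode(x), y_good)  # equals x
recovered_bad = sw_decode(sw_encode(x), y_bad)    # does not equal x
```

The second case shows what happens when the correlation between source and side-information is weaker than the code was designed for: three syndrome bits are no longer enough, and the decoder output is wrong.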
A critical problem in designing an efficient Wyner-Ziv codec is the problem of rate estimation at the encoder. This refers to correctly estimating the rate to be used for Slepian-Wolf coding of the quantized transform coefficients of the Wyner-Ziv coded frames. Accurate rate estimation is critical to the performance of the overall Wyner-Ziv video compression system for the following reason. If the Slepian-Wolf coding rate is too low, the Slepian-Wolf decoding fails, and the decoder reconstruction will be erroneous (and will typically have very high distortion). On the other hand, if the Slepian-Wolf coding rate is too high, compression efficiency is sacrificed. Since the encoder is constrained to have low complexity, it is imperative that the computational cost of rate estimation be low. The key challenges in performing accurate, low-complexity rate estimation include forming good source-side-information channel model estimates with low computational cost, and estimating the Slepian-Wolf code rate correctly for non-ideal, finite-length Slepian-Wolf codes.
Rate estimation methods in prior hybrid Wyner-Ziv coding systems can be classified into the following categories. The first class of methods assumes that the encoder has knowledge of the joint statistics of the source and the side-information, because the source and the side-information derive from ideal random processes. Examples of this class of solutions include the methods described in U.S. Patent Application Publication US20060297690A1 of Liu et al. for “Data Encoding and Decoding Using Slepian-Wolf Coded Nested Quantization to Achieve Wyner-Ziv Coding” and Y. Yang et al., “Wyner-Ziv coding based on TCQ and LDPC codes”, Proc. 37th Asilomar Conference on Signals, Systems, and Computers, Pacific Grove, Calif., November 2003. The main shortcoming of these approaches in the context of the current problem is that the assumption of ideality of statistics does not hold when dealing with real-world video sources.
The second class of methods utilizes decoder feedback in order to determine the current Slepian-Wolf coding rate. Specifically, if the decoder fails to decode the current frame given the Slepian-Wolf code bits it has received, it requests more code bits from the encoder. Examples of this include the methods described in A. Aaron et al., “Transform domain Wyner-Ziv codec for video”, Proc. SPIE Visual Communications and Image Processing, San Jose, Calif. January 2004, and J. Ascenso et al., “Motion compensated refinement for low complexity pixel based distributed video coding”, Proc. Advanced Video and Signal Based Surveillance, 2005. The main shortcoming of these approaches is that the instantaneous decoder feedback they require is impractical for most video communications applications, due to practical network delay constraints and the absence of feedback links in applications like video surveillance.
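The feedback-driven approach can be sketched, purely for illustration, as a request loop in which each decoding failure costs one round trip between decoder and encoder. The function name, the `can_decode` predicate and the numeric values below are hypothetical and are not taken from the cited references.

```python
def feedback_rate_search(can_decode, initial_bits, increment, max_bits):
    """Simulate a decoder-driven request loop.

    `can_decode(bits)` is a hypothetical predicate that reports whether
    Slepian-Wolf decoding succeeds with `bits` code bits received.
    Returns (bits_sent, round_trips)."""
    bits_sent = initial_bits
    round_trips = 0
    while not can_decode(bits_sent):
        if bits_sent >= max_bits:
            raise RuntimeError("decoding failed even at the maximum rate")
        # Decoder asks for more bits; encoder responds with one increment.
        bits_sent = min(bits_sent + increment, max_bits)
        round_trips += 1
    return bits_sent, round_trips


# Assume, for this example only, that decoding succeeds once at least
# 470 code bits have been received.
bits_sent, round_trips = feedback_rate_search(
    lambda bits: bits >= 470, initial_bits=100, increment=100, max_bits=1000)
# bits_sent == 500 and round_trips == 4
```

In this example the frame cannot be reconstructed until four request and response exchanges have completed, so the added delay is four network round-trip times per frame, and the loop cannot run at all when no feedback link exists.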
The third class of methods utilizes block-based classifiers learned off-line in order to select the Slepian-Wolf coding rate for individual frame blocks. Examples of this class are the methods described in WO2005/043882A2 for “Video source coding with side information” and U.S. Patent Application Publication US2004/0194008A1 of Garudardi et al. for “Method, Apparatus, and System for Encoding and Decoding Side Information for Multimedia Transmission.” The main shortcoming of this approach is that block-based off-line classification is inaccurate and, by necessity, restricted to a small set of discrete rates from which the correct rate is to be selected. This leads to inefficient compression performance.