1. Field of the Invention
The present invention relates to the encoding and decoding of moving picture sequences and is applicable in, for example, a system that uses distributed video coding techniques to distribute moving picture data.
2. Description of the Related Art
Distributed video coding (DVC) is a new coding method, based on the Slepian-Wolf and Wyner-Ziv theorems, that has attracted much recent attention. A basic DVC coding method is described by Aaron et al. in ‘Transform-Domain Wyner-Ziv Codec for Video’, Proc. SPIE Visual Communications and Image Processing, 2004. The encoder treats some frames in a received video sequence as key frames and the rest as Wyner-Ziv frames. The key frames are coded as intraframes. Each Wyner-Ziv frame is transformed to the coefficient domain by a discrete cosine transform (DCT), and the coefficients are grouped into bands. The coefficients in the k-th band are quantized by a 2^(M_k)-level quantizer, and the quantized coefficients (q_k) are expressed in fixed numbers of bits. The bit planes are then extracted and supplied to a Slepian-Wolf encoder that uses a turbo code to produce data bits and error-correcting code bits, generally referred to as parity bits. The data bits are discarded.
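The quantization and bit-plane extraction steps described above can be illustrated by the following minimal Python sketch. It assumes a uniform quantizer over a symmetric range, which the cited method does not necessarily use; the function names and the max_abs parameter are hypothetical and introduced only for illustration.

```python
def quantize_band(coeffs, m_k, max_abs):
    """Uniformly quantize coefficients in [-max_abs, max_abs) to 2**m_k levels.

    Assumption: a uniform quantizer; out-of-range values are clipped.
    """
    levels = 1 << m_k
    step = 2.0 * max_abs / levels
    indices = []
    for c in coeffs:
        q = int((c + max_abs) // step)
        indices.append(min(max(q, 0), levels - 1))
    return indices


def extract_bit_planes(q_values, m_k):
    """Split m_k-bit quantizer indices into bit planes, most significant first."""
    return [[(q >> b) & 1 for q in q_values] for b in range(m_k - 1, -1, -1)]


# Four coefficients of one band, quantized to 2**2 = 4 levels.
indices = quantize_band([-8.0, -1.0, 0.0, 7.9], m_k=2, max_abs=8.0)
planes = extract_bit_planes(indices, m_k=2)
```

Each list in planes would then be passed, most significant plane first, to the Slepian-Wolf encoder.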
The decoder decodes the key frames, uses the decoded key frames to generate a predicted image for each Wyner-Ziv frame, applies a DCT to convert the predicted image to the coefficient domain, groups the coefficients into bands, and inputs the coefficients in each band as side information to a Slepian-Wolf decoder. The Slepian-Wolf decoder uses parity bits received from the encoder to correct prediction errors in the side information by an iterative process, in which the decoder initially receives a subset of the parity bits and may request further parity bits as required. When a satisfactory decoded result is obtained, an inverse discrete cosine transform (IDCT) is applied to reconstruct the image of the Wyner-Ziv frame.
A problem with this method is that feedback from the decoder to the encoder is necessary in order to request additional parity bits. As a result, the encoder and decoder cannot operate independently, and there are inevitable delays involved with requesting and obtaining additional parity bits.
In an alternative scheme, described by Morbee et al. in ‘Improved Pixel-Based Rate Allocation For Pixel-Domain Distributed Video Coders Without Feedback Channel’, ICIVS 2007, the encoder generates a predicted image of its own for each Wyner-Ziv frame, compares this predicted image with the original image in the Wyner-Ziv frame, thereby estimates the number of parity bits that will be required for accurate decoding of the Wyner-Ziv frame, and sends this number of parity bits without having to be asked for them by the decoder. This eliminates the need for a feedback channel and avoids the delays associated with repeated requests.
To estimate the required number of parity bits, the encoder operates on the assumption that the distribution of the differences between the DCT coefficients of the original image and the predicted image can be approximately modeled by a Laplacian distribution. This model is used to estimate the decoder's prediction error probability. A conditional entropy is then calculated from the estimated error probability, and the necessary encoding rate is estimated from the conditional entropy.
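The chain of estimates described above (Laplacian model, error probability, conditional entropy, encoding rate) can be sketched in Python as follows. This is a simplified illustration and not the procedure of Morbee et al.: it assumes that a bit is in error whenever the magnitude of the difference exceeds a given threshold, that the conditional entropy is the binary entropy of that error probability, and that the probability may be capped at 0.5. All function and parameter names are hypothetical.

```python
import math


def laplacian_scale(diffs):
    """Maximum-likelihood scale b of a zero-mean Laplacian: mean absolute value."""
    return sum(abs(d) for d in diffs) / len(diffs)


def error_probability(scale, threshold):
    """P(|d| > threshold) for a zero-mean Laplacian with the given scale."""
    if scale == 0.0:
        return 0.0
    return math.exp(-threshold / scale)


def binary_entropy(p):
    """Entropy in bits of a binary event with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def estimate_parity_bits(diffs, threshold, n_bits):
    """Estimate the parity bits needed for a bit plane of n_bits bits.

    Assumption: the rate equals the binary entropy of the estimated
    error probability (capped at 0.5) times the bit-plane length.
    """
    p = min(error_probability(laplacian_scale(diffs), threshold), 0.5)
    return math.ceil(binary_entropy(p) * n_bits)
```

Under these assumptions, larger differences between the original and predicted coefficients give a larger scale, a higher estimated error probability, and therefore a larger number of parity bits.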
Since the Laplacian model is only approximate, and since the encoder and decoder may generate somewhat different predicted images, the estimated necessary encoding rate will occasionally provide fewer parity bits than the decoder actually needs, causing the decoded image to be visibly distorted. To avoid this type of distortion, Morbee et al. have the decoder stop decoding when it reaches a bit plane that it cannot decode correctly, and use only the more significant bit planes to reconstruct the image.
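The stopping rule described above can be sketched in Python as follows. The sketch assumes that the decoder holds a success flag for each bit plane and that the bits of discarded planes are set to zero during reconstruction; the source does not state how the missing bits are filled, and the function names are hypothetical.

```python
def usable_bit_planes(planes, decoded_ok):
    """Keep bit planes (most significant first) up to the first decoding failure."""
    kept = []
    for plane, ok in zip(planes, decoded_ok):
        if not ok:
            break
        kept.append(plane)
    return kept


def reconstruct_indices(kept, total_planes):
    """Rebuild quantizer indices from the kept planes.

    Assumption: bits of the discarded, less significant planes are zero.
    """
    if not kept:
        return []
    indices = [0] * len(kept[0])
    for i, plane in enumerate(kept):
        shift = total_planes - 1 - i
        indices = [q | (bit << shift) for q, bit in zip(indices, plane)]
    return indices


# Two bit planes; the less significant one could not be decoded.
kept = usable_bit_planes([[0, 0, 1, 1], [0, 1, 0, 1]], [True, False])
coarse = reconstruct_indices(kept, total_planes=2)
```

In this example only the most significant plane survives, so the reconstructed indices are coarser than the original ones.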
A problem with this scheme is that when the decoder decides that it cannot decode a particular bit plane, the encoder continues to generate and transmit parity bits for the following less significant bit planes, even though the decoder makes no use of these parity bits. This wastes computational resources in the encoder and communication resources on the link between the encoder and decoder.
This waste is a result of the underestimation of the necessary number of parity bits by the encoder. Particularly in a video distribution system, there is a need for an encoder that can estimate the necessary number of parity bits more accurately and generate encoded data of higher quality.