Various products, such as digital cameras and digital video cameras, are used to capture images and video. These products contain an image sensing device, such as a charge coupled device (CCD), which captures the light energy focused on it. The captured light energy, which is indicative of a scene, is then processed to form a digital image. Various formats are used to represent such digital images and videos. Formats used to represent video include Motion JPEG, MPEG2, MPEG4 and H.264.
All the formats listed above are compression formats. While those formats offer high quality and increase the number of video frames that can be stored on a given medium, they typically suffer from a long encoding runtime.
A complex encoder requires complex hardware. Complex encoding hardware is in turn disadvantageous in terms of design cost, manufacturing cost and physical size. Furthermore, a long encoding runtime limits the rate at which video frames can be captured without overflowing a temporary buffer. Additionally, more complex encoding hardware consumes more battery power. As battery life is essential for a mobile device, it is desirable that power consumption be minimized in mobile devices.
To minimize the complexity of an encoder, Wyner-Ziv coding, or “distributed video coding”, may be used. In distributed video coding the complexity of the encoder is shifted to the decoder. The input video stream is also usually split into key frames and non-key frames. The key frames are compressed using a conventional coding scheme, such as Motion JPEG, MPEG2, MPEG4 or H.264, and the decoder conventionally decodes the key frames. The non-key frames are then predicted at the decoder with the help of the decoded key frames. The processing at the decoder is thus equivalent to carrying out motion estimation, which is usually performed at the encoder. The visual quality of the predicted non-key frames is then improved using the information that the encoder provides for the non-key frames.
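The split of the input stream into key frames and non-key frames can be sketched as follows. This is only an illustrative sketch: the function name `split_frames` and the parameter `key_interval` are hypothetical, and the regular alternation of key and non-key frames is an assumption rather than a requirement of distributed video coding.

```python
def split_frames(frames, key_interval=2):
    """Return (key, non_key) lists of (index, frame) pairs.

    key_interval is a hypothetical parameter: every key_interval-th frame
    is treated as a key frame, and all other frames are non-key frames.
    """
    if key_interval < 2:
        raise ValueError("key_interval must be at least 2")
    key, non_key = [], []
    for index, frame in enumerate(frames):
        # Key frames go to the conventional codec; non-key frames are
        # handled by the Wyner-Ziv (distributed) coding path.
        target = key if index % key_interval == 0 else non_key
        target.append((index, frame))
    return key, non_key
```

Keeping the original frame index with each frame lets a decoder place each predicted non-key frame back between the key frames from which it was predicted.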
The visual quality of the decoded video stream depends heavily on the quality of the prediction of the non-key frames and on the level of quantization applied to the image pixel values. The prediction is often a rough estimate of the original frame, generated from adjacent frames, e.g., through motion estimation and interpolation. Thus, when there is a mismatch between the prediction and the decoded values, some form of compromise is required to resolve the differences.
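As a rough illustration of how a prediction and the quantized values can disagree, the sketch below predicts a non-key frame by averaging the two adjacent frames and quantizes pixel values uniformly. Both choices are simplifying assumptions made for this example: a practical decoder would typically use motion-compensated interpolation, and the names `interpolate_frame`, `quantize` and `step` are hypothetical.

```python
def interpolate_frame(prev_frame, next_frame):
    """Predict a frame as the rounded per-pixel average of its neighbours.

    Frames are lists of rows of integer pixel values. No motion
    compensation is performed; this is the simplest possible interpolation.
    """
    return [
        [(a + b + 1) // 2 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(prev_frame, next_frame)
    ]


def quantize(pixel, step):
    """Map an integer pixel value to the index of its uniform quantization bin."""
    return pixel // step


def has_mismatch(predicted_pixel, original_pixel, step):
    """True when the prediction falls in a different bin from the original."""
    return quantize(predicted_pixel, step) != quantize(original_pixel, step)
```

Under these assumptions, a larger `step` makes mismatches rarer but leaves more uncertainty within each bin, which is why both the prediction quality and the quantization level influence the final visual quality.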
Distributed video coding may be used to correct both prediction errors and error-correction mistakes. Conventionally, a frame reconstruction function applied after Wyner-Ziv decoding has been used to correct such errors. If the predicted value is within the range of the decoded quantized symbol, the reconstructed pixel value is made equal to the predicted value. Otherwise, the reconstructed value is set equal to the upper bound or the lower bound of the quantized symbol, depending on whether the predicted value lies above or below that range. Such a method minimizes decoding errors and eliminates large positive or negative errors, which are highly perceptible to the human eye. However, the minimization is considered sub-optimal.
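The conventional reconstruction rule described above amounts to clamping the predicted value to the interval of the decoded quantized symbol. A minimal sketch follows, assuming integer pixel values and a uniform quantizer in which symbol `q` covers the inclusive range `q * step` to `(q + 1) * step - 1`. The function names and the `step` parameter are hypothetical and chosen only for illustration.

```python
def reconstruct_pixel(predicted, symbol, step):
    """Clamp a predicted pixel value to the bin of its decoded symbol."""
    lower = symbol * step        # inclusive lower bound of the bin (assumed)
    upper = lower + step - 1     # inclusive upper bound of the bin (assumed)
    if predicted < lower:
        return lower             # prediction lies below the bin
    if predicted > upper:
        return upper             # prediction lies above the bin
    return predicted             # prediction agrees with the decoded symbol


def reconstruct_frame(predicted_frame, symbols, step):
    """Apply reconstruct_pixel to every pixel of a frame (lists of rows)."""
    return [
        [reconstruct_pixel(p, q, step) for p, q in zip(p_row, q_row)]
        for p_row, q_row in zip(predicted_frame, symbols)
    ]
```

For example, with `step = 16` and decoded symbol `6` (range 96 to 111), a predicted value of 100 is kept, a predicted value of 90 is raised to 96, and a predicted value of 130 is lowered to 111. The reconstruction error is therefore bounded by the bin width whenever the decoded symbol is correct, but the rule always picks a bin edge for out-of-range predictions, which is one way to see why it may be sub-optimal.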