A video codec comprises an encoder that transforms an input video sequence into a compressed representation suited for storage or transmission, and a decoder that can decompress the compressed video representation back into a viewable form. Typically, the encoder discards some information in the original input video sequence in order to represent the video in a more compact form (for example, at a lower bit rate).
A hybrid video codec, for example one conforming to ITU-T H.263 or H.264, encodes the video information in two phases. First, pixel values in a certain picture area (e.g. a block) are predicted, either by motion compensation means (finding and indicating an area in one of the previously coded video frames that corresponds closely to the block being coded) or by spatial means (using the already coded pixel values around the block to be coded in a specified manner). Second, the prediction error, i.e. the difference between the predicted block of pixels and the original block of pixels, is coded. This is typically done by transforming the difference in pixel values using a specified transform (e.g. the Discrete Cosine Transform (DCT) or a variant of it), quantizing the resulting transform coefficients, and entropy coding the quantized coefficients. By varying the fidelity of the quantization process, the encoder can control the balance between the accuracy of the pixel representation (picture quality) and the size of the resulting coded video representation (file size or transmission bit rate).
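The second phase described above can be illustrated with a minimal sketch. It is not the procedure of any particular standard: the 4x4 block size, the floating-point orthonormal DCT, the uniform quantizer with step size `qstep`, and all function names are assumptions made for illustration, and entropy coding is omitted.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def encode_block(original, predicted, qstep):
    """Return quantized transform levels of the prediction error.

    original, predicted: square blocks of pixel values (lists of lists).
    qstep: hypothetical uniform quantizer step; larger means coarser.
    """
    # Prediction error: original block minus predicted block.
    residual = [[o - p for o, p in zip(orow, prow)]
                for orow, prow in zip(original, predicted)]
    # Separable 2-D transform: C * residual * C^T.
    c = dct_matrix(len(residual))
    coeffs = matmul(matmul(c, residual), transpose(c))
    # Uniform quantization to integer levels (entropy coding not shown).
    return [[round(v / qstep) for v in row] for row in coeffs]

# Usage: a prediction that is uniformly 8 too low leaves only a DC level.
original = [[108] * 4 for _ in range(4)]
predicted = [[100] * 4 for _ in range(4)]
levels = encode_block(original, predicted, qstep=8)
```

In this sketch a larger `qstep` drives more levels to zero, which shrinks the coded representation at the cost of a less accurate reconstruction.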
The decoder reconstructs an output video sequence by applying prediction means similar to those used by the encoder. A predicted representation of a given pixel block, in a current frame to be reconstructed, is formed by the decoder using the motion or spatial information coded in the compressed representation and pixel values from image blocks that were decoded prior to the given pixel block. The decoder also recovers the prediction error by applying entropy decoding, dequantization and an inverse transform to the quantized transform coefficients coded in the compressed representation. After applying prediction and prediction error decoding, the decoder sums the prediction and prediction error signals (pixel values) to form the output video frame. The decoder (and the encoder) can also apply additional filtering to improve the quality of the output video before passing it on for display and/or storing it as a prediction reference for the subsequent frames in the video sequence.
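The reconstruction of a single block described above can be sketched as follows. This is a simplified illustration under stated assumptions rather than the decoding process of any standard: entropy decoding is assumed to have already produced integer levels, the quantizer is assumed uniform with step `qstep`, the transform is a floating-point orthonormal DCT, pixel values are assumed to be 8-bit, and all names are hypothetical.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    rows = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        rows.append([scale * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                     for i in range(n)])
    return rows

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def decode_block(levels, predicted, qstep):
    """Reconstruct a pixel block from quantized levels and a prediction.

    levels: integer levels obtained after entropy decoding (not shown).
    predicted: block formed from previously decoded pixel values.
    qstep: hypothetical uniform quantizer step used by the encoder.
    """
    # Dequantization: scale each level back by the step size.
    coeffs = [[lv * qstep for lv in row] for row in levels]
    # Inverse separable 2-D transform: C^T * coeffs * C.
    c = dct_matrix(len(coeffs))
    residual = matmul(matmul(transpose(c), coeffs), c)
    # Sum prediction and decoded prediction error, then clip to 8 bits.
    return [[min(255, max(0, round(p + r))) for p, r in zip(prow, rrow)]
            for prow, rrow in zip(predicted, residual)]

# Usage: a single DC level of 4 with qstep 8 adds 8 to every predicted pixel.
levels = [[4, 0, 0, 0]] + [[0, 0, 0, 0] for _ in range(3)]
predicted = [[100] * 4 for _ in range(4)]
reconstructed = decode_block(levels, predicted, qstep=8)
```

Any additional filtering would operate on `reconstructed` before the block is displayed or stored as a prediction reference; that step is left out of the sketch.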