Engineers use compression (also called source coding or source encoding) to reduce the bitrate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bitrate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form. A “codec” is an encoder/decoder system.
Compression can be lossless, in which the quality of the video does not suffer, but decreases in bit rate are limited by the inherent amount of variability (sometimes called source entropy) of the input video data. Or, compression can be lossy, in which the quality of the video suffers, and the lost quality cannot be completely recovered, but achievable decreases in bit rate are more dramatic. Lossy compression is often used in conjunction with lossless compression—lossy compression establishes an approximation of information, and the lossless compression is applied to represent the approximation.
In general, video compression techniques include “intra-picture” compression and “inter-picture” compression. Intra-picture compression techniques compress a picture with reference to information within the picture, and inter-picture compression techniques compress a picture with reference to a preceding and/or following picture (often called a reference or anchor picture) or pictures.
For intra-picture compression, for example, an encoder splits a picture into 8×8 blocks of samples, where a sample is a number that represents the intensity of brightness (or the intensity of a color component) for a small, elementary region of the picture, and the samples of the picture are organized as arrays or planes. The encoder applies a frequency transform to individual blocks. The frequency transform converts a block of samples into a block of transform coefficients. The encoder quantizes the transform coefficients, which may result in lossy compression. For lossless compression, the encoder entropy codes the quantized transform coefficients.
Inter-picture compression techniques often use motion estimation and motion compensation to reduce bit rate by exploiting temporal redundancy in a video sequence. Motion estimation is a process for estimating motion between pictures. For example, for a block of samples or other unit of the current picture, the encoder attempts to find a match of the same size in a search area in another picture, the reference picture. Within the search area, the encoder compares the current unit to various candidates in order to find a candidate that is a good match. When the encoder finds an exact or “close enough” match, the encoder parameterizes the change in position between the current and candidate units as motion data (such as a motion vector). In general, motion compensation is a process of reconstructing pictures from reference pictures using motion data. Typically, the encoder also computes the sample-by-sample difference between the original current unit and its motion-compensated prediction to determine a residual (also called a prediction residual or error signal). The encoder then applies a frequency transform to the residual, resulting in transform coefficients. The encoder quantizes the transform coefficients and entropy codes the quantized transform coefficients.
If an intra-compressed picture or motion-predicted picture is used as a reference picture for subsequent motion compensation, the encoder reconstructs the picture. A decoder also reconstructs pictures during decoding, and it uses some of the reconstructed pictures as reference pictures in motion compensation. For example, for a block of samples of an intra-compressed picture, a decoder reconstructs a block of quantized transform coefficients. The decoder and encoder perform inverse quantization and an inverse frequency transform to produce a reconstructed version of the original block of samples.
Or, for a block encoded using inter-picture prediction, the decoder or encoder reconstructs a block from a prediction residual for the block. The decoder decodes entropy-coded information representing the prediction residual. The decoder/encoder inverse quantizes and inverse frequency transforms the data, resulting in a reconstructed residual. In a separate motion compensation path, the decoder/encoder computes a predicted block using motion vector information for displacement from a reference picture. The decoder/encoder then combines the predicted block with the reconstructed residual to form the reconstructed block.
Over the last two decades, various video codec standards have been adopted, including the H.261, H.262 (MPEG-2) and H.263 standards and the MPEG-1 and MPEG-4 standards. More recently, the H.264 standard (sometimes referred to as 14496-10 or the AVC standard) and VC-1 standard have been adopted. For additional details, see representative versions of the respective standards. A video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters in the bitstream for a video sequence when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations that a decoder should perform to achieve correct results in decoding. Often, however, the low-level implementation details of the operations are not specified, or the decoder is able to vary certain implementation details to improve performance, so long as the correct decoding results are still achieved.
Image and video decoding are computationally intensive. Tasks such as entropy decoding and inverse frequency transforms can require extensive computation, considering how many times the operations are performed during decoding. This computational cost can be problematic in various scenarios, such as decoding of high-quality, high-bit rate video (e.g., compressed high-definition video). In particular, decoding tasks according to more recent standards such as H.264 and VC-1 can be computationally intensive and consume significant memory resources.