Data compression occurs in a number of contexts. It is very commonly used in communications and computer networking to store, transmit, and reproduce information efficiently. It finds particular application in the encoding of images, audio and video. Video presents a significant challenge to data compression because of the large amount of data required for each video frame and the speed with which encoding and decoding often needs to occur. The current state-of-the-art for video encoding is the ITU-T H.264/AVC video coding standard. It defines a number of different profiles for different applications, including the Main profile, Baseline profile and others. A next-generation video encoding standard is currently under development through a joint initiative of MPEG-ITU: High Efficiency Video Coding (HEVC).
There are a number of standards for encoding/decoding images and videos, including H.264, that use block-based coding processes. In these processes, the image or frame is divided into blocks, typically 4×4 or 8×8, and the blocks are spectrally transformed into coefficients, quantized, and entropy encoded. In many cases, the data being transformed is not the actual pixel data, but is residual data following a prediction operation. Predictions can be intra-frame, i.e. block-to-block within the frame/image, or inter-frame, i.e. between frames (also called motion prediction). It is expected that HEVC will also have these features.
When spectrally transforming residual data, many of these standards prescribe the use of a discrete cosine transform (DCT) or some variant thereon. The resulting DCT coefficients are then quantized using a quantizer that employs a uniform quantization step size.
Quantization is lossy. In other words, it introduces distortion that shows up as noise in the reconstructed images or videos. Accordingly, many existing compression schemes utilize some form of post-processing, i.e. filtering, to try to remove quantization noise from reconstructed pixels. Examples include deblocking filters, de-noising filters, or other pixel-domain filters.
Work in lossy compression, e.g., audio/voice coding, video coding, image coding, etc., tends to focus on improving rate-distortion performance. That is, the objective of most encoding and decoding schemes is to find an optimal balance between distortion and coding rate. A rate-distortion optimization expression of the type J=D+λR is typically used, wherein the Lagrangian multiplier λ represents the desired trade-off between coding rate and distortion.
Similar reference numerals may have been used in different figures to denote similar components.