Engineers use compression (also called source coding or source encoding) to reduce the bit rate of digital video. Compression decreases the cost of storing and transmitting video information by converting the information into a lower bit rate form. Decompression (also called decoding) reconstructs a version of the original information from the compressed form. A “codec” is an encoder/decoder system.
Over the last two decades, various video codec standards have been adopted, including the ITU-T H.261, H.262 (MPEG-2 or ISO/IEC 13818-2), H.263 and H.264 (MPEG-4 AVC or ISO/IEC 14496-10) standards, the MPEG-1 (ISO/IEC 11172-2) and MPEG-4 Visual (ISO/IEC 14496-2) standards, and the SMPTE 421M (VC-1) standard. More recently, the H.265/HEVC standard (ITU-T H.265 or ISO/IEC 23008-2) has been approved. Extensions to the H.265/HEVC standard (e.g., for scalable video coding/decoding, for coding/decoding of video with higher fidelity in terms of sample bit depth or chroma sampling rate, for screen capture content, or for multi-view coding/decoding) are currently under development. A video codec standard typically defines options for the syntax of an encoded video bitstream, detailing parameters in the bitstream when particular features are used in encoding and decoding. In many cases, a video codec standard also provides details about the decoding operations a decoder should perform to achieve conforming results in decoding. Aside from codec standards, various proprietary codec formats define other options for the syntax of an encoded video bitstream and corresponding decoding operations.
A video source (e.g., a camera, animation output, or screen capture module) typically provides video in a particular color space, with color components of the video sub-sampled according to a particular color sampling rate, and with sample values having a particular bit depth. In general, a color space (sometimes called a color model) is a model for representing colors as n values per physical position, for n≥1, where each of the n values provides a color component value for that position. For example, in a YUV color space, a luma (or Y) component value represents an approximate brightness at a position and multiple chroma (or U and V) component values represent color differences at the position. Or, in an RGB color space, a red (R) component value represents a red intensity, a green (G) component value represents a green intensity, and a blue (B) component value represents a blue intensity at a position. Different color spaces have historically had advantages for different applications, such as display, printing, broadcasting and encoding/decoding. Sample values can be converted between color spaces using color space transformation operations.
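As an illustration of a color space transformation operation, the following minimal sketch converts one sample position between RGB and a YUV-type color space. It assumes the full-range BT.601 coefficients used by JFIF, with chroma values offset by 128 for 8-bit samples. That is only one of several conventions in use, and the function names are hypothetical.

```python
# Sketch only: full-range BT.601 (JFIF-style) conversion for 8-bit samples.
# The coefficients and the 128 chroma offset are assumptions of this example;
# other standards (e.g., BT.709, limited-range variants) use different values.

def rgb_to_yuv(r, g, b):
    """Convert one RGB position to (Y, U, V) as floating-point values."""
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted brightness
    u = (b - y) / 1.772 + 128.0             # chroma: scaled blue difference
    v = (r - y) / 1.402 + 128.0             # chroma: scaled red difference
    return y, u, v


def yuv_to_rgb(y, u, v):
    """Invert rgb_to_yuv (no rounding or clamping is applied here)."""
    r = y + 1.402 * (v - 128.0)
    b = y + 1.772 * (u - 128.0)
    g = (y - 0.299 * r - 0.114 * b) / 0.587
    return r, g, b


if __name__ == "__main__":
    y, u, v = rgb_to_yuv(200, 120, 40)
    print("YUV:", y, u, v)
    print("RGB again:", yuv_to_rgb(y, u, v))
```

A neutral gray or white input yields chroma values at the 128 midpoint, which reflects the description of U and V as color differences. A practical implementation would also round and clamp results to the valid sample range.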
Color sampling rate (sometimes called chroma sampling rate) refers to the relative spatial resolution between color components. For example, for a color sampling rate of 4:4:4, information for secondary components (e.g., U and V components for YUV) has the same spatial resolution as information for a primary component (e.g., Y component for YUV). For a color sampling rate of 4:2:2 or 4:2:0, information for secondary components is downsampled relative to information for the primary component: 4:2:2 halves the horizontal resolution of the secondary components, and 4:2:0 halves both their horizontal and vertical resolution. YUV 4:2:0 format is commonly used for encoding/decoding. As a design principle, the decision to use a YUV 4:2:0 format for encoding/decoding is premised on the understanding that, for most use cases, viewers do not notice many visual differences between video encoded/decoded in a YUV 4:2:0 format and video encoded/decoded in a YUV 4:4:4 format. The compression advantages for the YUV 4:2:0 format, which has half as many samples per frame as the YUV 4:4:4 format, are therefore compelling.
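The effect of the color sampling rate on the number of samples per frame can be sketched as follows. The example assumes the usual interpretation of 4:2:2 (secondary components halved horizontally) and 4:2:0 (halved horizontally and vertically). The function names are hypothetical, and the 2x2 averaging filter is just one simple choice. Real encoders may use other filters and chroma sample positions.

```python
# Sketch only: sample counts and a naive 4:2:0 chroma downsampler.

def chroma_plane_size(width, height, sampling):
    """Return (width, height) of each secondary (chroma) plane."""
    if sampling == "4:4:4":
        return width, height
    if sampling == "4:2:2":
        return (width + 1) // 2, height                # half horizontal
    if sampling == "4:2:0":
        return (width + 1) // 2, (height + 1) // 2     # half in both directions
    raise ValueError("unsupported sampling: " + sampling)


def samples_per_frame(width, height, sampling):
    """One primary plane plus two secondary planes."""
    cw, ch = chroma_plane_size(width, height, sampling)
    return width * height + 2 * cw * ch


def downsample_420(plane):
    """Average each 2x2 block of a chroma plane given as a list of rows."""
    height, width = len(plane), len(plane[0])
    out = []
    for y in range(0, height, 2):
        row = []
        for x in range(0, width, 2):
            block = [plane[yy][xx]
                     for yy in range(y, min(y + 2, height))
                     for xx in range(x, min(x + 2, width))]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


if __name__ == "__main__":
    for fmt in ("4:4:4", "4:2:2", "4:2:0"):
        print(fmt, samples_per_frame(1920, 1080, fmt))
```

For a 1920x1080 frame, this gives 6,220,800 samples for 4:4:4, 4,147,200 for 4:2:2, and 3,110,400 for 4:2:0.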
Bit depth refers to the number of bits per sample value. Common bit depths are 8 bits per sample, 10 bits per sample and 12 bits per sample. In general, having more bits per sample allows for more precise gradations of colors for video, but uses more storage for the video. Having fewer bits per sample typically reduces bit rate at the cost of reduced quality.
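The trade-off between precision and storage can be made concrete with a small sketch. The function names are hypothetical. The bit rate computed is for raw (uncompressed) video, and the bit depth reduction shown is plain truncation of the least significant bits, which is only one possible approach.

```python
# Sketch only: how bit depth affects gradations and raw (uncompressed) bit rate.

def levels(bit_depth):
    """Number of distinct values a sample can take."""
    return 1 << bit_depth


def raw_bit_rate(samples_per_frame, bit_depth, frames_per_second):
    """Bits per second for uncompressed video."""
    return samples_per_frame * bit_depth * frames_per_second


def reduce_bit_depth(value, from_bits, to_bits):
    """Drop least significant bits (simple truncation, no rounding or dither)."""
    return value >> (from_bits - to_bits)


if __name__ == "__main__":
    samples_420_1080p = 1920 * 1080 * 3 // 2   # YUV 4:2:0, 1920x1080
    for bits in (8, 10, 12):
        print(bits, "bits:", levels(bits), "levels,",
              raw_bit_rate(samples_420_1080p, bits, 30), "bits/s at 30 fps")
```

Under these assumptions, moving from 8 to 10 bits per sample quadruples the number of gradations (256 to 1024) while increasing the raw bit rate by 25%.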
Many commercially available video encoders and decoders support only a YUV 4:2:0 format. Other commercially available encoders and decoders (e.g., for the H.264/AVC standard or H.265/HEVC standard) allow an encoder to specify a color space, color sampling rate and bit depth for a given sequence. The specified color space, color sampling rate and bit depth are used for the entire video sequence. These approaches do not provide sufficient flexibility for a general-purpose codec system that may process very different kinds of video content within a single video sequence.