The process of video compression typically begins with the acquisition of a raw video signal, say when light strikes electronic components of a charge-coupled device (CCD) in a video camera. Conceptually, the camera is obtaining colour-component data for each pixel-position in each picture in a sequence of pictures that makes up the video; the colour components will be values of red, green, and blue if the CCD is based on the classic RGB colour space, or possibly with the addition of a fourth colour component that represents yellow or white light. In practice, various shortcuts may be taken. The CCD may detect only one colour component at each pixel location and extrapolate the missing components based on values from neighbouring pixels. (For example, green values—the most important for human visual perception—may be obtained at 50% of the pixel locations, while red and blue values are each obtained at 25% of the pixel locations.)
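The single-sensor arrangement described above is commonly realized with a Bayer-style colour filter mosaic. The following sketch illustrates one simple way missing colour components might be estimated from neighbouring pixel values; the 4x4 layout, the bilinear averaging, and the function name are illustrative assumptions rather than any particular camera's method.

```python
# Sketch: estimating a missing green sample in an RGGB Bayer mosaic
# by averaging the four green neighbours (illustrative only; real
# cameras use more elaborate demosaicing filters).

# 4x4 mosaic of raw sensor values; each position sampled only one
# colour, following the RGGB pattern:
#   R G R G
#   G B G B
#   R G R G
#   G B G B
raw = [
    [200,  90, 210,  95],
    [ 88,  40,  92,  42],
    [198,  91, 205,  94],
    [ 86,  41,  90,  43],
]

def green_at(raw, r, c):
    """Estimate the green value at a red/blue position by averaging
    the in-bounds green neighbours (up, down, left, right)."""
    neighbours = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < len(raw) and 0 <= cc < len(raw[0]):
            neighbours.append(raw[rr][cc])
    return sum(neighbours) // len(neighbours)

# Position (1, 1) sampled blue; its green value is interpolated:
g = green_at(raw, 1, 1)
```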
Based on the raw video signal, a video encoder makes further changes to the data to create a source video. RGB values are converted to co-ordinates in a colour space that allows the nature of human visual perception to be exploited to achieve greater compression efficiency. The colour components may be luma (an approximation of luminance) samples or chroma (short for “chrominance”) samples. In modern video standards, including High Efficiency Video Coding (HEVC), the luma component is denoted Y, while the chroma components are denoted Cb and Cr. Beyond this basic conversion, common to all profiles (i.e., sets of available features) of the standard, many different options can be invoked (even within one profile) to select alternative ways to balance two competing goals of video compression: fidelity of the video reconstructed by a video decoder on the one hand and compression efficiency on the other hand. The design decision to choose certain options will be influenced by usage considerations, such as storage size, transmission bandwidth and the computational resources needed to effectively exploit a particular option.
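One widely used RGB-to-Y'CbCr conversion is the ITU-R BT.601 matrix; the full-range variant is sketched below. The standards themselves support several such matrices, so the coefficients and the simple clipping here are one illustrative choice, not a requirement of HEVC.

```python
# Sketch: converting 8-bit RGB to Y'CbCr with the BT.601 full-range
# matrix (one common choice among several supported by the standards).

def rgb_to_ycbcr(r, g, b):
    """Map 8-bit R, G, B values to 8-bit luma (Y) and chroma (Cb, Cr)."""
    y  =         0.299    * r + 0.587    * g + 0.114    * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5      * b
    cr = 128.0 + 0.5      * r - 0.418688 * g - 0.081312 * b
    clip = lambda v: max(0, min(255, round(v)))
    return clip(y), clip(cb), clip(cr)

# Pure white maps to maximum luma and neutral (mid-scale) chroma:
print(rgb_to_ycbcr(255, 255, 255))  # (255, 128, 128)
```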
When invoking various options, the luma data is treated differently from the chroma data, but Cr and Cb data are treated equally. For example, luma data is not down-sampled, but chroma data—of both types—may optionally be down-sampled; in other words, luma samples correspond to pixels on a one-to-one basis, but a chroma (Cr or Cb) sample might correspond to more than one pixel. Luma samples in a source video might be represented at one bit-depth while both Cr samples and Cb samples might be represented at another bit-depth; thus the HEVC standard provides two parameters, BitDepthY for luma (Y) samples and BitDepthC for chroma samples of both types (Cr and Cb). It should be noted that the treatment of bit-depth can apply to other colour spaces, including those with additional colour components, such as those based upon a supplementary yellow stimulus, or those that incorporate alpha channels. The bit-depth of any such supplementary components may be based on a pre-existing parameter, or be provided in a new parameter.
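The chroma down-sampling described above corresponds to the familiar 4:2:0 format, in which each chroma sample covers a 2x2 block of pixels while luma remains at full resolution. The sketch below uses simple 2x2 averaging; actual encoders may apply other down-sampling filters.

```python
# Sketch: 4:2:0-style chroma down-sampling. Each chroma (Cb or Cr)
# sample ends up covering a 2x2 block of pixels; 2x2 averaging is an
# illustrative filter choice.

def downsample_420(plane):
    """Average each 2x2 block of a full-resolution chroma plane."""
    h, w = len(plane), len(plane[0])
    return [
        [(plane[r][c] + plane[r][c + 1] +
          plane[r + 1][c] + plane[r + 1][c + 1]) // 4
         for c in range(0, w, 2)]
        for r in range(0, h, 2)
    ]

cb_full = [
    [100, 102, 110, 112],
    [101, 101, 111, 113],
    [ 90,  92,  80,  82],
    [ 91,  93,  81,  83],
]
cb_sub = downsample_420(cb_full)  # 2x2 plane: one sample per 4 pixels
```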
An encoder will compress a source video comprising samples (said to be in the pixel domain) by, amongst other things, (a) forming a prediction of a set of samples and computing the difference between the prediction and the source video samples, (b) applying a transform (such as an integer approximation of a discrete cosine transform (DCT)) to generate transformed coefficients (said to be in the transform domain), and (c) quantizing those coefficients to generate quantized, transformed coefficients. The coefficients will typically have more bits than the samples from which they were derived.
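The three steps above can be sketched on a one-dimensional block of samples. A real encoder operates on 2-D blocks with an integer transform; the floating-point DCT-II, the prediction values, and the quantization step size here are illustrative assumptions.

```python
# Sketch of the three encoding steps: (a) subtract a prediction,
# (b) apply a DCT, (c) quantize the resulting coefficients.
import math

def dct(block):
    """Orthonormal 1-D DCT-II."""
    n = len(block)
    out = []
    for k in range(n):
        s = sum(x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                for i, x in enumerate(block))
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        out.append(scale * s)
    return out

source     = [52, 55, 61, 66]         # pixel-domain samples
prediction = [50, 50, 60, 60]         # e.g., derived from nearby blocks
residual   = [s - p for s, p in zip(source, prediction)]  # step (a)
coeffs     = dct(residual)                                # step (b)
qstep      = 4                                            # step size
quantized  = [round(c / qstep) for c in coeffs]           # step (c)
```

The quantized coefficients are what the entropy-coding stage ultimately writes into the bitstream; the decoder inverts steps (c), (b), and (a) in reverse order.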
Older standards specify, and many current devices implement, codecs based solely on bit-depths of 8 for both luma and chroma samples, for both encoding and decoding. Increased display resolutions, processor speeds, and transmission speeds, together with consumers' expectations of ever-higher-quality viewing experiences, even on small screens, have spurred the standardization of profiles, for example in HEVC, that support encoding/decoding of samples having 10-bit or even higher precision. However, devices with limited resources, such as mobile devices, may still have decoders designed to handle only coefficients encoded based on samples having bit-depth 8.
In general, a problem arises when coefficients encoded based on samples of bit-depth D (e.g., 10) are encountered by a decoder designed to handle only samples of bit-depth d, with d<D (e.g., d=8).
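One simple way a limited decoder might cope with such a mismatch is to reduce each reconstructed sample from D bits to d bits with a rounded right shift. The function below is a minimal sketch of that idea; the name, the round-to-nearest rule, and the final clipping are illustrative assumptions, not part of any standard's normative process.

```python
# Sketch: mapping a D-bit sample to d bits (d < D) with a rounded
# right shift, then clipping to the d-bit range.

def reduce_bit_depth(sample, D, d):
    """Map a D-bit sample to d bits with round-to-nearest and clipping.
    Assumes D > d."""
    shift = D - d
    rounded = (sample + (1 << (shift - 1))) >> shift
    return min(rounded, (1 << d) - 1)

# A full-scale 10-bit sample (1023) mapped to 8 bits:
reduce_bit_depth(1023, 10, 8)  # 255 (clipped after rounding)
```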
Similar reference numerals may have been used in different figures to denote similar components.