High Dynamic Range (HDR) video and Wide Color Gamut (WCG) video offer greater ranges of luminance and color values than traditional video. For example, traditional video can have a limited luminance and color range, such that details in shadows or highlights can be lost when images are captured, encoded, and/or displayed. In contrast, HDR and/or WCG video can capture a broader range of luminance and color information, allowing the video to appear more natural and closer to real life to the human eye.
However, many common video encoding and decoding schemes, such as MPEG-4 Advanced Video Coding (AVC) and High Efficiency Video Coding (HEVC), are not designed to directly handle HDR or WCG video. As such, HDR and WCG video information normally must be converted into other formats before it can be encoded using a video compression algorithm.
For example, HDR video formats such as the EXR file format describe colors in the RGB color space with 16-bit values to cover a broad range of potential HDR values, while 8-bit or 10-bit values are often used to express the colors of non-HDR video. Since many video compression algorithms expect 8-bit or 10-bit values, 16-bit HDR color values can be quantized into 10-bit values that the compression algorithms can work with.
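By way of a non-limiting illustration, the uniform quantization described above can be sketched as follows. The function name `quantize_16_to_10` is hypothetical, and the sketch assumes unsigned 16-bit integer code values for simplicity; actual HDR formats may store floating-point values instead.

```python
def quantize_16_to_10(value16: int) -> int:
    """Uniformly map a 16-bit code value (0..65535) to a 10-bit code value (0..1023).

    Assumption: inputs are unsigned integers; this is an illustrative sketch only.
    """
    if not 0 <= value16 <= 0xFFFF:
        raise ValueError("expected a 16-bit unsigned value")
    # Scale the 16-bit range onto the 10-bit range and round to the nearest level.
    return round(value16 * 1023 / 65535)


if __name__ == "__main__":
    # Roughly 64 distinct 16-bit inputs collapse onto each 10-bit output level.
    print(quantize_16_to_10(1000), quantize_16_to_10(1020))
```

In this sketch, nearby 16-bit inputs such as 1000 and 1020 land on the same 10-bit level, which is the loss of precision that uniform quantization introduces.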
Some encoders use a coding transfer function to convert linear values from the input video into non-linear values prior to uniform quantization. By way of a non-limiting example, coding transfer functions are often gamma correction functions. However, even when an encoder uses a coding transfer function to convert linear input values into non-linear values, the coding transfer function is generally fixed, such that it does not change depending on the content of the input video. For example, an encoder's coding transfer function can be defined to statically map every possible input value in an HDR range, such as from 0 to 10,000 nits, to specific non-linear values. However, when the input video contains input values in only a portion of that range, fixed mapping can lead to poor allocation of quantization levels. For example, a picture primarily showing a blue sky can have many similar shades of blue, but those blues can occupy a small section of the overall range for which the coding transfer function is defined. As such, similar blues can be quantized into the same value. This quantization can often be perceived by viewers as contouring or banding, in which quantized shades of blue extend in bands across the sky displayed on their screen instead of more natural transitions between the colors.
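By way of a non-limiting illustration, the effect of a fixed coding transfer function on a narrow range of input values can be sketched as follows. The function name `fixed_encode`, the gamma exponent of 2.4, the 10-bit output depth, and the sample luminance range of 4,000 to 4,100 nits are all assumptions chosen for illustration, not values taken from any particular encoder.

```python
def fixed_encode(nits: float, peak: float = 10_000.0,
                 gamma: float = 2.4, bits: int = 10) -> int:
    """Apply a fixed gamma transfer function over 0..peak nits, then quantize uniformly.

    Assumption: a plain power-law curve stands in for the coding transfer function.
    """
    levels = (1 << bits) - 1
    normalized = min(max(nits / peak, 0.0), 1.0)  # clamp to the defined range
    return round(normalized ** (1.0 / gamma) * levels)


# Hypothetical "sky" content: 1001 distinct luminance values in a narrow band.
sky = [4000.0 + 0.1 * i for i in range(1001)]
sky_codes = {fixed_encode(v) for v in sky}

if __name__ == "__main__":
    print(f"{len(sky)} input shades -> {len(sky_codes)} quantized codes")
```

Under these assumptions, about a thousand distinct input shades collapse onto only a handful of quantized codes, because most of the 1,024 available levels are reserved for luminance values that the picture does not contain.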
Additionally, psychophysical studies of the human visual system have shown that a viewer's sensitivity to contrast levels at a particular location can be more dependent on the average brightness of surrounding locations than the actual levels at the location itself. However, most coding transfer functions instead rely on fixed conversion functions or tables that do not take characteristics of the actual content, such as its average brightness, into account.
What is needed is a method of adapting the coding transfer function, or otherwise converting and/or redistributing input values, based on the actual content of the input video. This can generate a curve of non-linear values that represents the color and/or intensity information actually present in the input video instead of across a full range of potential values. As such, when the non-linear values are uniformly quantized, the noise and/or distortion introduced by uniform quantization can be minimized such that it is unlikely to be perceived by a human viewer. Additionally, what is needed is a method of transmitting information about the perceptual mapping operations used by the encoder to decoders, such that the decoders can perform corresponding reverse perceptual mapping operations when decoding the video.
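By way of a non-limiting illustration only, one simple form of content-dependent redistribution can be sketched as follows. This is not presented as the method itself; it merely rescales the transfer function to the luminance range actually present in the content and returns the parameters a decoder would need to reverse the operation. The function names `adaptive_encode` and `adaptive_decode`, the metadata field names, and the gamma exponent are hypothetical.

```python
def adaptive_encode(nits, bits=10, gamma=2.4):
    """Quantize luminance values using a curve fitted to the content's own range.

    Returns the quantized codes plus the metadata needed to reverse the mapping.
    """
    lo, hi = min(nits), max(nits)
    span = (hi - lo) or 1.0  # avoid division by zero for flat content
    levels = (1 << bits) - 1
    codes = [round(((v - lo) / span) ** (1.0 / gamma) * levels) for v in nits]
    metadata = {"min_nits": lo, "max_nits": hi, "gamma": gamma, "bits": bits}
    return codes, metadata


def adaptive_decode(codes, metadata):
    """Reverse the mapping using the parameters signaled by the encoder."""
    levels = (1 << metadata["bits"]) - 1
    span = (metadata["max_nits"] - metadata["min_nits"]) or 1.0
    return [metadata["min_nits"] + (c / levels) ** metadata["gamma"] * span
            for c in codes]


# Hypothetical content occupying a narrow band of the full HDR range.
sky = [4000.0 + 0.1 * i for i in range(1001)]
codes, meta = adaptive_encode(sky)
decoded = adaptive_decode(codes, meta)

if __name__ == "__main__":
    worst = max(abs(a - b) for a, b in zip(sky, decoded))
    print(f"{len(set(codes))} distinct codes, worst error {worst:.3f} nits")
```

In this sketch the same narrow band of input values is spread across hundreds of quantization levels rather than a few, and the decoder recovers the original values to within a fraction of a nit because it receives the range and curve parameters alongside the codes.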