A combination of a highly non-linear transfer function, 4:2:0 or 4:2:2 subsampling and non-constant luminance ordering gives rise to severe artifacts in saturated colors, i.e. color values close to the edge of the color gamut. An example is described in Annex B, where changes between two colors of similar luminance can result in a reconstructed image or picture with very different luminances.
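The effect can be reproduced numerically with a small sketch. The code below is illustrative only: it assumes the ST 2084 (PQ) transfer function, BT.2020 luma weights, and a simplified subsampling in which two neighboring pixels share the average of their chroma samples. The pixel values and all function names are hypothetical and are not taken from Annex B.

```python
# ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32

# Assumption: BT.2020 luma/luminance weights.
KR, KG, KB = 0.2627, 0.6780, 0.0593


def pq_encode(nits):
    """Linear light in cd/m^2 -> non-linear PQ value in [0, 1]."""
    y = (max(nits, 0.0) / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def pq_decode(code):
    """Non-linear PQ value in [0, 1] -> linear light in cd/m^2."""
    v = min(max(code, 0.0), 1.0) ** (1.0 / M2)
    return 10000.0 * (max(v - C1, 0.0) / (C2 - C3 * v)) ** (1.0 / M1)


def luminance(rgb):
    """Linear luminance Y of a linear RGB triple."""
    r, g, b = rgb
    return KR * r + KG * g + KB * b


def to_ycbcr(rgb):
    """Non-constant luminance ordering: transfer function first, then luma."""
    rp, gp, bp = (pq_encode(c) for c in rgb)
    yp = KR * rp + KG * gp + KB * bp
    return yp, (bp - yp) / (2 * (1 - KB)), (rp - yp) / (2 * (1 - KR))


def to_rgb(yp, cb, cr):
    """Inverse of to_ycbcr; out-of-range values are clamped in pq_decode."""
    rp = yp + 2 * (1 - KR) * cr
    bp = yp + 2 * (1 - KB) * cb
    gp = (yp - KR * rp - KB * bp) / KG
    return tuple(pq_decode(c) for c in (rp, gp, bp))


def subsample_pair(rgb_a, rgb_b):
    """Keep each pixel's own luma but share the averaged chroma."""
    ya, cba, cra = to_ycbcr(rgb_a)
    yb, cbb, crb = to_ycbcr(rgb_b)
    cb, cr = (cba + cbb) / 2, (cra + crb) / 2
    return to_rgb(ya, cb, cr), to_rgb(yb, cb, cr)


# Hypothetical saturated neighbors (linear RGB, cd/m^2) of similar luminance.
pixel_a = (1000.0, 0.0, 100.0)
pixel_b = (1000.0, 4.0, 100.0)
recon_a, recon_b = subsample_pair(pixel_a, pixel_b)

for name, orig, recon in (("a", pixel_a, recon_a), ("b", pixel_b, recon_b)):
    print(f"pixel {name}: original Y = {luminance(orig):.1f}, "
          f"reconstructed Y = {luminance(recon):.1f}")
```

With these inputs the two original luminances differ by about 1%, whereas the reconstructed luminances differ by more than a factor of two, even though only the chroma was averaged and each luma sample was left untouched.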
One way to get around the problem is to not use the luma value Y′ and the chroma values Cb′ and Cr′ for encoding, but instead some other color representation. However, there are indications that color representations other than Y′Cb′Cr′ do not compress well. As an example, MPEG (Moving Picture Experts Group) tried YdZdX, but its compression efficiency was not competitive with that of Y′Cb′Cr′.
Furthermore, many systems already use Y′Cb′Cr′ or R′G′B′ for the last step of the signal to the display. As an example, the HDMI (High Definition Multimedia Interface) standard has recently adopted the use of Y′Cb′Cr′ 4:2:0 using ST 2084 for transmission of images from the set-top box to the TV, as specified in CEA-861.3 [4]. This means that even if the encoding is done in some other color representation, the decoded data still needs to be converted to Y′Cb′Cr′ 4:2:0, which will give rise to artifacts. Doing this conversion correctly can be quite complex compared to the rest of the decoding chain, whereas doing the same thing in the encoder is relatively inexpensive, since encoding is already much more complex than decoding. It is therefore better to do a high-quality conversion to Y′Cb′Cr′ already in the encoder. For these reasons it is advantageous to be able to use the Y′Cb′Cr′ representation for encoding of HDR (High Dynamic Range) data.
Yet another way to cope with the problem is to simply make sure that the edges of the color gamut are never used. This will however severely limit the kind of colors that can be reproduced, so this is not a good solution.
Another solution to the problem is to use a less steep, i.e. less non-linear, transfer function, such as BT.1886. However, the problem with this approach is that many more bits would be required to represent each color component of a pixel in order to avoid banding artifacts. Alternatively, one could use the same number of bits but limit the peak brightness.
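The bit-depth trade-off can be illustrated with a rough calculation. The sketch below makes several assumptions: full-range quantization (code values 0 to 2^bits − 1), ST 2084 (PQ) as the steep reference curve, and BT.1886 with zero black level, which reduces to a pure 2.4 power law. The function names are hypothetical. It reports the relative luminance jump between two adjacent code values near a given luminance; a larger jump suggests a higher risk of visible banding.

```python
# ST 2084 (PQ) constants.
M1 = 2610 / 16384
M2 = 2523 / 4096 * 128
C1 = 3424 / 4096
C2 = 2413 / 4096 * 32
C3 = 2392 / 4096 * 32


def pq_inverse(nits):
    """Linear light in cd/m^2 -> PQ value in [0, 1]."""
    y = (nits / 10000.0) ** M1
    return ((C1 + C2 * y) / (1.0 + C3 * y)) ** M2


def pq_eotf(v):
    """PQ value in [0, 1] -> linear light in cd/m^2."""
    p = v ** (1.0 / M2)
    return 10000.0 * (max(p - C1, 0.0) / (C2 - C3 * p)) ** (1.0 / M1)


def make_gamma(peak_nits, exponent=2.4):
    """Assumed BT.1886-like power law with zero black level."""
    def inverse(nits):
        return (nits / peak_nits) ** (1.0 / exponent)

    def eotf(v):
        return peak_nits * v ** exponent

    return eotf, inverse


def relative_step(eotf, inverse, nits, bits=10):
    """Relative luminance jump between the code nearest `nits` and the next."""
    levels = 2 ** bits - 1
    code = round(inverse(nits) * levels)
    low = eotf(code / levels)
    high = eotf((code + 1) / levels)
    return (high - low) / low


gamma_hdr = make_gamma(10000.0)  # power law stretched to an HDR peak
gamma_sdr = make_gamma(100.0)    # same curve with limited peak brightness

step_pq = relative_step(pq_eotf, pq_inverse, 1.0)
step_gamma_hdr = relative_step(*gamma_hdr, 1.0)
step_gamma_sdr = relative_step(*gamma_sdr, 1.0)

print(f"PQ, 10 bits:                     {step_pq:.2%}")
print(f"power law, 10000 cd/m^2 peak:    {step_gamma_hdr:.2%}")
print(f"power law, 100 cd/m^2 peak:      {step_gamma_sdr:.2%}")
```

Under these assumptions, the power law scaled to a 10,000 cd/m² peak takes a much larger relative step near 1 cd/m² than PQ at the same bit depth, so it would need more bits to match it. Lowering the peak to 100 cd/m² brings the step back down, at the cost of dynamic range.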
There is therefore a need for efficient processing of pixels in a picture of a video sequence that overcomes at least some of the problems mentioned above and that does not have the shortcomings of the above-mentioned solutions.