As depicted in FIG. 1, scalable video decoding (respectively encoding) consists of decoding (respectively encoding) a Base Layer (BL) bitstream and at least one Enhancement Layer (EL) bitstream. Usually, EL pictures are predicted from (possibly upsampled) decoded BL pictures. However, when the EL pictures and the BL pictures are represented in different color spaces and/or have been color graded differently, the prediction is less efficient. To improve the prediction, it is known to apply a color transform to the decoded BL pictures. More precisely, the color transform maps the colors of the BL color space (first color space) onto the colors of the EL color space (second color space) using color information.
As depicted in FIG. 2, in video content distribution, a color transform is usually applied to the decoded pictures so that the transformed decoded pictures are adapted to the rendering capability of the end device.
This color transform is also known as a Color Mapping Function (CMF). The CMF is, for example, approximated by a 3×3 gain matrix plus an offset (Gain-Offset model). In this case, the CMF is defined by 12 parameters (9 gains and 3 offsets). However, such an approximation of the CMF is not very precise because it assumes a linear transform model. Consequently, a 3D Look-Up Table (also known as a 3D LUT) is used to describe such a CMF without any a priori assumption about the CMF model. The 3D LUT is much more precise because its size can be increased depending on the required accuracy. However, the 3D LUT may then represent a huge data set. Transmitting a 3D LUT to a receiver therefore requires encoding the LUT.
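The Gain-Offset model can be illustrated with a minimal sketch, given here only as an assumption-laden example and not as the method of any particular codec. The function name `apply_gain_offset` and the numeric values of `GAIN` and `OFFSET` are hypothetical and chosen purely for illustration.

```python
def apply_gain_offset(color, gain, offset):
    """Map one color triplet with a 3x3 gain matrix plus a 3-element offset."""
    return tuple(
        sum(gain[row][col] * color[col] for col in range(3)) + offset[row]
        for row in range(3)
    )


# Hypothetical parameters: 9 gains + 3 offsets = 12 parameters in total.
GAIN = [
    [1.0, 0.0, 0.0],
    [0.0, 0.5, 0.0],
    [0.0, 0.0, 2.0],
]
OFFSET = [16.0, 0.0, -4.0]

# Transform one color triplet of the first color space.
mapped = apply_gain_offset((100.0, 50.0, 10.0), GAIN, OFFSET)
print(mapped)
```

Because the same 12 parameters are applied to every input color, this model cannot follow a mapping that varies non-linearly across the color space, which is the limitation noted above.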
A LUT approximating a CMF associates at least one color value in the first color space with another color value in the second color space. A LUT partitions the first color space into a set of regions delimited by the vertices of the LUT. For example, a 3D LUT associates a triplet of color values in the first color space with a set of color values. The set of color values can be a triplet of color values in the second color space or a set of color values representative of the color transform (e.g. locally defined CMF parameters) used to transform color values in the first color space into color values in the second color space.
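The association between vertices and color values, and the partitioning into regions, can be sketched as follows. This is only an illustrative sketch under several assumptions that the text does not state: the vertices are uniformly spaced, each component ranges from 0 to `max_value` (255 here), and the stored value at each vertex is an output triplet. The names `build_lut`, `region_of` and the halving transform are hypothetical.

```python
from itertools import product


def build_lut(n, transform, max_value=255):
    """Store, for each of the n*n*n vertices, the triplet produced by `transform`."""
    step = max_value / (n - 1)  # assumed uniform spacing between vertices
    return {
        (i, j, k): transform((i * step, j * step, k * step))
        for i, j, k in product(range(n), repeat=3)
    }


def region_of(color, n, max_value=255):
    """Return the index of the region (lattice cell) that contains `color`."""
    step = max_value / (n - 1)
    # Clamp so that the maximum value falls in the last region.
    return tuple(min(int(c / step), n - 2) for c in color)


# Hypothetical transform used only to fill the table: halve every component.
lut = build_lut(3, lambda c: tuple(v / 2 for v in c))

print(len(lut))                          # number of stored vertices
print(lut[(2, 0, 1)])                    # triplet stored at one vertex
print(region_of((200, 10, 127.5), 3))    # region containing an input color
```

With n vertices per dimension, the first color space is split into (n-1)×(n-1)×(n-1) regions, each delimited by eight neighboring vertices.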
In FIG. 3, a square 3D LUT is represented as a lattice of N×N×N vertices. For each vertex V(c1,c2,c3) of the 3D LUT, a corresponding triplet of color values (Vc1, Vc2, Vc3) needs to be stored. The amount of data associated with the 3D LUT is N×N×N×K bits, where K is the number of bits used to store one LUT triplet value. The triplet value is, for example, an (R, G, B) triplet, a (Y, U, V) triplet or a (Y, Cb, Cr) triplet. Encoding all the vertex values is not efficient since it represents a huge amount of data.
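To give an order of magnitude for N×N×N×K, the following sketch evaluates the formula for two lattice sizes. The values N = 17, N = 33 and K = 30 (three 10-bit components per triplet) are assumptions chosen for illustration and are not taken from the text; the function name `lut_size_bits` is likewise hypothetical.

```python
def lut_size_bits(n, k):
    """Raw size in bits of an n x n x n LUT storing k bits per vertex triplet."""
    return n * n * n * k


# Assumed example: each triplet stored as three 10-bit components.
K_BITS = 30

small = lut_size_bits(17, K_BITS)
large = lut_size_bits(33, K_BITS)

print(small, "bits for N = 17")
print(large, "bits for N = 33")
```

Roughly doubling the number of vertices per dimension multiplies the raw size by about eight, which illustrates why storing or transmitting every vertex value without compression becomes costly.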