High Dynamic Range (HDR) with Wide Color Gamut (WCG) has become an increasingly hot topic within the TV and multimedia industry in the last couple of years. While screens capable of displaying the HDR video signal are emerging on the consumer market, over-the-top (OTT) players, such as Netflix, have announced that HDR content will be delivered to the end-user. Standardization bodies are working on specifying the requirements for HDR. For instance, in the roadmap for DVB, UHDTV1 phase 2 will include HDR support. MPEG is currently working on exploring how HDR video could be compressed.
HDR imaging is a set of techniques within photography that allows for a greater dynamic range of luminosity compared to standard digital imaging. Dynamic range in digital cameras is typically measured in f-stops, where 1 f-stop is a doubling of the amount of light. A standard LCD HDTV using Standard Dynamic Range (SDR) can display at most 10 f-stops. HDR is defined by MPEG to have a dynamic range of over 16 f-stops. WCG aims to increase the color fidelity from ITU-R 709 towards ITU-R 2020 such that more of the visible colors can be captured and displayed.
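Since each f-stop is a doubling of light, the f-stop figures above translate directly into luminance ratios. The following is a minimal sketch of that arithmetic; the function names are hypothetical and chosen only for illustration.

```python
import math

def contrast_ratio(stops):
    """Luminance ratio covered by a given number of f-stops (one stop = one doubling)."""
    return 2.0 ** stops

def dynamic_range_stops(max_luminance, min_luminance):
    """Dynamic range in f-stops between the brightest and darkest luminance."""
    return math.log2(max_luminance / min_luminance)

# 10 f-stops (SDR upper bound named in the text) versus 16 f-stops (HDR threshold)
print(contrast_ratio(10))
print(contrast_ratio(16))
```

Under this reading, the step from 10 to 16 f-stops corresponds to a 64 times larger ratio between the brightest and darkest representable luminance.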
The range of colors covered by different color spaces can be compared in the CIE 1931 XYZ color space as illustrated in FIG. 1. It can be seen that ITU-R 2020 (BT.2020) covers more of the visible color space than ITU-R 709 (BT.709). P3D65 (DCI P3) covers more than ITU-R 709 but less than ITU-R 2020.
Matrices have been defined that can convert red, green, blue (RGB) values for one color space into XYZ coordinates.
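As an illustration of such a matrix, the sketch below converts linear BT.709 RGB values to XYZ using the commonly published BT.709 (D65 white point) coefficients, rounded to four decimals. The function names are hypothetical, and the matrix is specific to BT.709; other color spaces such as BT.2020 use different coefficients.

```python
# Linear-light BT.709 RGB (D65) to CIE 1931 XYZ, coefficients rounded to four decimals.
BT709_TO_XYZ = (
    (0.4124, 0.3576, 0.1805),
    (0.2126, 0.7152, 0.0722),
    (0.0193, 0.1192, 0.9505),
)

def rgb_to_xyz(rgb, matrix=BT709_TO_XYZ):
    """Multiply a linear (R, G, B) triple by a 3x3 RGB-to-XYZ matrix."""
    return tuple(sum(m * c for m, c in zip(row, rgb)) for row in matrix)

def xy_chromaticity(xyz):
    """Project XYZ onto the xy chromaticity plane used in CIE 1931 diagrams."""
    x, y, z = xyz
    total = x + y + z
    return (x / total, y / total)

white = rgb_to_xyz((1.0, 1.0, 1.0))
red = rgb_to_xyz((1.0, 0.0, 0.0))
print(xy_chromaticity(white))  # close to the D65 white point
print(xy_chromaticity(red))    # close to the BT.709 red primary
```

The middle row of the matrix gives the luminance Y, which is why green, with the largest weight, contributes most to perceived brightness.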
Since human vision is more sensitive to luminance than to chrominance, the chrominance is typically represented in lower resolution than the luminance. One format commonly used within video coding is Y′CbCr 4:2:0, also known as Y′UV 4:2:0. Here the Y′ component contains mostly luminance (but also some chrominance) and is therefore denoted luma to set it apart from true luminance. Likewise, the Cb and Cr components contain mostly chrominance but also some luminance, and are therefore called chroma to set them apart from true chrominance. The 4:2:0 notation means that the chroma components (Cb, Cr) have a quarter of the resolution of the luma component (Y′), i.e. half the resolution both horizontally and vertically, since the eye is more sensitive to luma. The Y′CbCr representation is obtained by transferring the original linear RGB values into a non-linear domain R′G′B′ using a non-linear transfer function. Finally, Y′, Cb and Cr are obtained using linear combinations of R′, G′ and B′.
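The conversion chain described above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the processing of any particular encoder: the BT.709 luma coefficients are used, the transfer function is a plain power law standing in for a real SDR or HDR transfer function, and chroma is subsampled with a simple 2x2 average that ignores chroma sample siting. All function names are hypothetical.

```python
KR, KB = 0.2126, 0.0722   # BT.709 luma coefficients for R' and B'
KG = 1.0 - KR - KB        # remaining weight goes to G'

def transfer(v, gamma=2.4):
    # ASSUMPTION: plain power law used as a stand-in for a real transfer function.
    return v ** (1.0 / gamma)

def rgb_to_ycbcr(r, g, b):
    """Linear RGB in [0, 1] -> (Y', Cb, Cr) with Y' in [0, 1] and Cb, Cr in [-0.5, 0.5]."""
    rp, gp, bp = transfer(r), transfer(g), transfer(b)
    y = KR * rp + KG * gp + KB * bp          # luma: weighted sum of R'G'B'
    cb = (bp - y) / (2.0 * (1.0 - KB))       # scaled blue-difference chroma
    cr = (rp - y) / (2.0 * (1.0 - KR))       # scaled red-difference chroma
    return y, cb, cr

def subsample_420(plane):
    """Average each 2x2 block of a chroma plane (list of rows, even dimensions)."""
    return [
        [(plane[i][j] + plane[i][j + 1] + plane[i + 1][j] + plane[i + 1][j + 1]) / 4.0
         for j in range(0, len(plane[0]), 2)]
        for i in range(0, len(plane), 2)
    ]

print(rgb_to_ycbcr(1.0, 0.0, 0.0))  # saturated red: low luma, Cr at its maximum
```

After `subsample_420`, a chroma plane holds one sample per 2x2 block of luma samples, which is the quarter resolution that the 4:2:0 notation refers to.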
Before the samples are displayed, the chroma components are first upsampled to 4:4:4, i.e. the same resolution as the luma component. The luma and chroma components in Y′CbCr are then converted to R′G′B′, which in turn is converted to the linear domain (RGB) for display.
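The display-side steps can be sketched in the same spirit. The sketch assumes BT.709 luma coefficients, nearest-neighbor chroma upsampling (real systems typically use interpolation filters) and a plain power law as the inverse transfer function; these choices and all function names are illustrative assumptions only.

```python
KR, KB = 0.2126, 0.0722   # BT.709 luma coefficients for R' and B'
KG = 1.0 - KR - KB

def upsample_nearest(plane):
    """Upsample a chroma plane by 2 in each direction by repeating samples."""
    out = []
    for row in plane:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out

def ycbcr_to_rgb_prime(y, cb, cr):
    """(Y', Cb, Cr) -> non-linear (R', G', B')."""
    rp = y + 2.0 * (1.0 - KR) * cr
    bp = y + 2.0 * (1.0 - KB) * cb
    gp = (y - KR * rp - KB * bp) / KG
    return rp, gp, bp

def inverse_transfer(v, gamma=2.4):
    # ASSUMPTION: plain power law; values are clamped to [0, 1] first so that
    # small negative rounding errors cannot reach the power function.
    v = min(1.0, max(0.0, v))
    return v ** gamma

def reconstruct(y_plane, cb_plane, cr_plane):
    """4:2:0 planes -> rows of linear (R, G, B) tuples at luma resolution."""
    cb_full = upsample_nearest(cb_plane)   # chroma back to 4:4:4
    cr_full = upsample_nearest(cr_plane)
    return [
        [tuple(inverse_transfer(c) for c in ycbcr_to_rgb_prime(y, cb, cr))
         for y, cb, cr in zip(y_row, cb_row, cr_row)]
        for y_row, cb_row, cr_row in zip(y_plane, cb_full, cr_full)
    ]
```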
Another approach is to encode RGB without any transformation to another color space but to arrange the color components to mimic their respective relation to luminance and chrominance. Since green contributes more to luminance than red and blue do, one such arrangement is to encode the green component (G) in place of luminance and the R and B components in place of chrominance.
High Efficiency Video Coding (HEVC), also referred to as H.265, is a block-based video codec standardized by ITU-T and MPEG that utilizes both temporal and spatial prediction. Spatial prediction is achieved using intra (I) prediction from within the current frame. Temporal prediction is achieved using inter (P) or bi-directional inter (B) prediction on block level from previously decoded reference pictures. The difference between the original pixel data and the predicted pixel data, referred to as the residual, is transformed into the frequency domain and quantized before being transmitted together with necessary prediction parameters, such as mode selections and motion vectors.
By quantizing the transformed residuals, the tradeoff between bitrate and quality of the video may be controlled. The level of quantization is determined by the quantization parameter (QP). Adjusting the QP is a key technique to control the quality/bitrate of the residual in video coding. The QP controls the fidelity of the residual, typically in the form of transform coefficients, and thus also the amount of coding artifacts. When the QP is high, the transform coefficients are quantized coarsely, resulting in fewer bits but also typically more coding artifacts than when the QP is low, where the transform coefficients are quantized finely. A low QP thus generally results in high quality and a high QP results in low quality.
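The effect of the QP can be illustrated with a simplified scalar quantizer. The sketch relies on the commonly cited approximation for H.264 and HEVC that the quantization step size doubles for every increase of 6 in QP, with a step size of 1 at QP 4. Actual codecs use integer scaling tables and rounding offsets, so the functions below (all hypothetical names) only illustrate the trend.

```python
def qstep(qp):
    """Approximate quantization step size: doubles for every 6 QP steps."""
    return 2.0 ** ((qp - 4) / 6.0)

def quantize(coeffs, qp):
    """Map transform coefficients to integer levels (simplified, no rounding offset)."""
    step = qstep(qp)
    return [int(round(c / step)) for c in coeffs]

def dequantize(levels, qp):
    """Reconstruct approximate coefficients from integer levels."""
    step = qstep(qp)
    return [level * step for level in levels]

# Hypothetical transform coefficients of one residual block.
coeffs = [220.0, -41.0, 12.5, -3.0, 1.2]
for qp in (22, 37):
    levels = quantize(coeffs, qp)
    recon = dequantize(levels, qp)
    max_error = max(abs(c - r) for c, r in zip(coeffs, recon))
    nonzero = sum(1 for level in levels if level != 0)
    print(qp, levels, nonzero, max_error)
```

With the higher QP, fewer levels are non-zero, which costs fewer bits, while the reconstruction error grows, which matches the quality/bitrate tradeoff described above.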
In HEVC version 1 (v1), as in H.264, the QP can be controlled at the picture level, at the slice level or at the block level. At picture and slice level it can be controlled individually for each color component, i.e. luma (Y′) and chroma (Cb, Cr). In HEVC v2, the QP for chroma can be individually controlled for the chroma components at the block level. In the equations below it is shown how the chroma QP can be modified at the picture level by pps_cb_qp_offset and pps_cr_qp_offset and at the slice level by slice_cb_qp_offset and slice_cr_qp_offset. For HEVC v2 it can also be modified at the block level by CuQpOffsetCb and CuQpOffsetCr.
The variables qPCb and qPCr are derived as follows according to the HEVC specification [1], see equations 8-257 and 8-258 in [1]:

qPiCb = Clip3(−QpBdOffsetC, 57, QpY + pps_cb_qp_offset + slice_cb_qp_offset + CuQpOffsetCb)   (8-257)
qPiCr = Clip3(−QpBdOffsetC, 57, QpY + pps_cr_qp_offset + slice_cr_qp_offset + CuQpOffsetCr)   (8-258)
The function Clip3(a, b, c) is defined as max(a, min(b, c)), where max(a, b) is equal to a if a>b and otherwise b, and min(a, b) is equal to a if a<b and otherwise b.
If ChromaArrayType is equal to 1, which corresponds to 4:2:0, the variables qPCb and qPCr are set equal to the value of QpC as specified in Table 1 below (corresponds to Table 8-10 in [1]) based on the index qPi equal to qPiCb and qPiCr, respectively. Otherwise, the variables qPCb and qPCr are set equal to Min(qPi, 51), based on the index qPi equal to qPiCb and qPiCr, respectively.
TABLE 1
Specification of QpC as a function of qPi for ChromaArrayType equal to 1

qPiCb/Cr   <30         30  31  32  33  34  35  36  37  38  39  40  41  42  43  >43
qPCb/Cr    =qPiCb/Cr   29  30  31  32  33  33  34  34  35  35  36  36  37  37  =qPiCb/Cr − 6
The chroma quantization parameters for the Cb and Cr components, Qp′Cb and Qp′Cr, are derived as follows (see equations 8-259 and 8-260 in [1]):

Qp′Cb = qPCb + QpBdOffsetC   (8-259)
Qp′Cr = qPCr + QpBdOffsetC   (8-260)
This is described further in section 8.6.1 in [1].
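The derivation above, from the luma QP through the offsets, the clipping and Table 1 to the final chroma QP, can be sketched as a single function. The sketch is illustrative only: the function and parameter names are hypothetical, and QpBdOffsetC is computed as 6 times (chroma bit depth minus 8), which is taken from the HEVC specification rather than from the text above. The same function serves Cb and Cr, given the respective offsets.

```python
def clip3(a, b, c):
    """Clip c to the range [a, b]."""
    return max(a, min(b, c))

# Table 1 entries for 30 <= qPi <= 43 (ChromaArrayType equal to 1).
QPC_TABLE = {30: 29, 31: 30, 32: 31, 33: 32, 34: 33, 35: 33, 36: 34,
             37: 34, 38: 35, 39: 35, 40: 36, 41: 36, 42: 37, 43: 37}

def qpi_to_qpc(qpi, chroma_array_type=1):
    """Map the index qPi to qPCb or qPCr."""
    if chroma_array_type != 1:
        return min(qpi, 51)          # formats other than 4:2:0
    if qpi < 30:
        return qpi
    if qpi > 43:
        return qpi - 6
    return QPC_TABLE[qpi]

def chroma_qp(qp_y, pps_offset=0, slice_offset=0, cu_offset=0,
              bit_depth_chroma=8, chroma_array_type=1):
    """Return Qp'Cb or Qp'Cr for one chroma component."""
    # ASSUMPTION (from the HEVC specification): QpBdOffsetC = 6 * (bit depth - 8).
    qp_bd_offset_c = 6 * (bit_depth_chroma - 8)
    qpi = clip3(-qp_bd_offset_c, 57,
                qp_y + pps_offset + slice_offset + cu_offset)
    return qpi_to_qpc(qpi, chroma_array_type) + qp_bd_offset_c

# With zero offsets and 8-bit 4:2:0 video, the chroma QP follows the luma QP
# up to 29 and then grows more slowly.
for qp_y in (25, 35, 51):
    print(qp_y, chroma_qp(qp_y))
```

A negative picture-level or slice-level offset, for example `chroma_qp(35, pps_offset=-3)`, lowers the index before the table lookup and thereby yields a finer chroma quantization.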
Typically more bits are spent on the luma component than on chroma components since the human visual system is more sensitive to luminance. The chroma components are typically represented in lower resolution than the luma component since the human perception is not as sensitive to chroma. Having the same QP for the chroma and luma components can, however, lead to visual color artifacts. These color artifacts tend to be more visible for HDR video than for SDR video. However, it is not a solution to always use a lower QP value for the chroma components than for the luma component. This can result in encoding the chroma components unnecessarily well, especially at high bit rates (low QPs), thereby spending bits on chroma without visually improving the color.
There is, thus, a need for an efficient calculation of QP values for chroma components that solves at least some of the above-mentioned problems.