The present section is intended to introduce the reader to various aspects of art that may be related to various aspects of the present principles described and/or claimed below. This discussion is believed to be helpful in providing the reader with background information to facilitate a better understanding of the various aspects of the present principles. Accordingly, it should be understood that these statements are to be read in this light, and not as admissions of prior art.
In the following, image data refer to one or several arrays of samples (pixel values) in a specific image/video format which specifies all information relative to the pixel values of an image (or a video) and all information which may be used by a display and/or any other device, for example to visualize and/or decode an image (or video). An image comprises a first component, in the shape of a first array of samples, usually representative of the luminance (or luma) of the image, and second and third components, in the shape of other arrays of samples, usually representative of the color (or chroma) of the image. Equivalently, the same information may also be represented by a set of arrays of color samples, such as the traditional tri-chromatic RGB representation.
A pixel value is represented by a vector of C values, where C is the number of components. Each value of a vector is represented with a number of bits which defines a maximal dynamic range of the pixel values.
Standard-Dynamic-Range images (SDR images) are images whose luminance values are represented with a limited number of bits (typically 8). This limited representation does not allow correct rendering of small signal variations, in particular in dark and bright luminance ranges. In high-dynamic-range images (HDR images), the signal representation is extended to maintain a high accuracy of the signal over its entire range. In HDR images, pixel values representing luminance levels are usually represented in floating-point format (typically float or half-float, with at least 10 bits per component), the most popular format being the OpenEXR half-float format (16 bits per RGB component, i.e. 48 bits per pixel), or in integers with a long representation, typically at least 16 bits.
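As a rough illustration of the preceding two paragraphs (and not part of any cited standard), the bit depth of a component bounds the number of distinguishable code levels, and the per-pixel storage cost follows directly from the per-component bit depth:

```python
# Illustration only: how component bit depth bounds the number of
# distinguishable code levels for SDR (8-bit) versus HDR (10/16-bit) signals.

def code_levels(bits: int) -> int:
    """Number of distinct integer code values for a given bit depth."""
    return 2 ** bits

def bits_per_pixel(bits_per_component: int, components: int = 3) -> int:
    """Total storage per pixel, e.g. 16-bit RGB half-float -> 48 bits."""
    return bits_per_component * components

print(code_levels(8))      # 256 levels for an 8-bit SDR component
print(code_levels(10))     # 1024 levels for a 10-bit HDR component
print(bits_per_pixel(16))  # 48 bits per pixel for OpenEXR half-float RGB
```

With only 256 levels per component, an 8-bit SDR signal cannot represent small variations in very dark or very bright regions without visible banding, which is what motivates the extended HDR representations above.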
The arrival of the High Efficiency Video Coding (HEVC) standard (ITU-T H.265, Telecommunication Standardization Sector of ITU (10/2014), Series H: Audiovisual and Multimedia Systems, Infrastructure of Audiovisual Services: Coding of Moving Video, High Efficiency Video Coding, Recommendation ITU-T H.265) enables the deployment of new video services with enhanced viewing experience, such as Ultra HD broadcast services. In addition to an increased spatial resolution, Ultra HD can bring a wider color gamut (WCG) and a higher dynamic range (HDR) than the standard dynamic range (SDR) HD-TV currently deployed. Different solutions for the representation and coding of HDR/WCG video have been proposed (SMPTE ST 2084:2014, "High Dynamic Range Electro-Optical Transfer Function of Mastering Reference Displays", or Diaz, R., Blinstein, S. and Qu, S., "Integrating HEVC Video Compression with a High Dynamic Range Video Pipeline", SMPTE Motion Imaging Journal, Vol. 125, Issue 1, February 2016, pp. 14-21).
SDR backward compatibility with decoding and rendering devices is an important feature in some video distribution systems, such as broadcasting or multicasting systems.
A solution based on a single layer coding/decoding process may be backward compatible, e.g. SDR compatible, and may leverage legacy distribution networks and services already in place.
Such a single layer based distribution solution enables both high quality HDR rendering on HDR-enabled Consumer Electronics (CE) devices and high quality SDR rendering on SDR-enabled CE devices.
Such a single layer based distribution solution generates an encoded signal, e.g. SDR signal, and associated metadata (of a few bytes per video frame or scene) that can be used to reconstruct another signal, e.g. HDR signal, from a decoded signal, e.g. SDR signal.
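The single layer principle described above can be sketched as follows. This is a purely hypothetical illustration, assuming a single per-frame scalar gain as the metadata; real systems such as ETSI TS 103 433 carry richer tone-mapping parameters:

```python
# Hypothetical sketch of single layer distribution: an SDR frame plus a few
# bytes of per-frame metadata are used to reconstruct an HDR frame.
# The scalar "gain" metadata here is an assumption for illustration only.

def encode_single_layer(hdr_frame, gain):
    """Derive an SDR frame (tone-mapped) and the metadata needed to invert it."""
    sdr_frame = [min(1.0, v / gain) for v in hdr_frame]  # crude tone mapping
    metadata = {"gain": gain}                            # a few bytes per frame
    return sdr_frame, metadata

def reconstruct_hdr(sdr_frame, metadata):
    """Rebuild an HDR frame from the decoded SDR frame and its metadata."""
    return [v * metadata["gain"] for v in sdr_frame]

# The SDR frame alone is directly renderable on legacy devices; an
# HDR-enabled device additionally applies the metadata to recover HDR.
sdr, meta = encode_single_layer([2.0, 5.0, 10.0], gain=10.0)
hdr = reconstruct_hdr(sdr, meta)
```

The key property is that the distributed signal (here `sdr`) is usable as-is by legacy decoders, while the small metadata payload (here `meta`) lets HDR-capable receivers invert the mapping.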
Metadata store parameter values used for the reconstruction of the signal and may be static or dynamic. Static metadata means metadata that remains the same for a video (set of images) and/or a program.
Static metadata are valid for the whole video content (scene, movie, clip . . . ) and may not depend on the image content. They may define, for example, the image format, color space or color gamut. For instance, SMPTE ST 2086:2014, "Mastering Display Color Volume Metadata Supporting High Luminance and Wide Color Gamut Images", is an example of static metadata for use in a production environment. The Mastering Display Colour Volume (MDCV) SEI (Supplemental Enhancement Information) message is the distribution flavor of ST 2086 for both the H.264/AVC ("Advanced Video Coding for Generic Audiovisual Services", Series H: Audiovisual and Multimedia Systems, Recommendation ITU-T H.264, Telecommunication Standardization Sector of ITU, January 2012) and HEVC video codecs.
Dynamic metadata are content-dependent, that is, the metadata can change with the image/video content, e.g. for each image or for each group of images. As an example, the SMPTE ST 2094:2016 family of standards, "Dynamic Metadata for Color Volume Transform", defines dynamic metadata for use in a production environment. SMPTE ST 2094-30 can be distributed along an HEVC coded video stream thanks to the Colour Remapping Information (CRI) SEI message.
Other single layer based distribution solutions exist on distribution networks for which display adaptation dynamic metadata are delivered along with a legacy video signal. These single layer based distribution solutions may produce HDR 10-bit image data (e.g. image data whose signal is represented as an HLG10 or PQ10 signal as specified in Rec. ITU-R BT.2100-0, "Image parameter values for high dynamic range television for use in production and international programme exchange") and associated metadata from an input signal (typically 12 or 16 bits), encode said HDR 10-bit image data using, for example, an HEVC Main 10 profile encoding scheme, and reconstruct a video signal from a decoded video signal and said associated metadata. The dynamic range of the reconstructed signal is adapted according to the associated metadata, which may depend on characteristics of a target display.
ETSI TS 103 433 V1.1.1, published in August 2016, proposes a single layer distribution solution that addresses direct backward compatibility, i.e. it leverages SDR distribution networks and services already in place, and that enables high quality HDR rendering on HDR-enabled CE devices as well as high quality SDR rendering on SDR CE devices. Some elements of this specification are detailed below in the description of FIG. 2.
A display adaptation method has been proposed in this standard. This method aims at adapting a reconstructed HDR signal to a luminance level that corresponds to the display luminance capability. For example, a reconstructed HDR signal can have a peak luminance of 1000 cd/m2 (nits) while the display used to render the image can only render up to 500 nits. One objective of this display adaptation is to maintain, as well as possible, the creative intent captured in the mapping between SDR and HDR. It uses recomputed metadata values in the tone mapping operations, based on the ratios between the original HDR peak luminance, the targeted SDR peak luminance (fixed to 100 nits) and the presentation display maximum luminance. The display adaptation can run down to 100 nits, thus providing an SDR signal compatible with SDR displays. However, this prior art display adaptation method showed unsatisfying results when used on displays having low peak luminance: firstly, some color shifting appears on displays with low peak luminance; secondly, for displays having a peak luminance of 100 nits, the generated SDR images at the post-processor output differ from the SDR images at the post-processor input, which is not acceptable since the post-processor should act as a pass-through.
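The luminance-ratio based recomputation described above can be sketched as a blending of metadata values between their SDR and HDR endpoints, driven by where the display peak sits on a logarithmic scale between 100 nits and the mastering peak. The log-ratio weight below is an illustrative assumption, not the exact ETSI TS 103 433 computation; it does, however, exhibit the pass-through expectation at exactly 100 nits:

```python
import math

# Illustrative display adaptation weight: blend a tone-mapping metadata value
# between its SDR endpoint (100 nits) and its HDR endpoint (mastering peak).
# The log-ratio interpolation is an assumption for illustration only; the
# actual ETSI TS 103 433 recomputation of metadata is more elaborate.

SDR_PEAK = 100.0  # targeted SDR peak luminance, fixed to 100 nits

def adaptation_weight(display_peak: float, hdr_peak: float) -> float:
    """0.0 at a 100-nit display (pure SDR), 1.0 at the HDR mastering peak."""
    w = math.log(display_peak / SDR_PEAK) / math.log(hdr_peak / SDR_PEAK)
    return min(1.0, max(0.0, w))  # clamp for displays outside [100, hdr_peak]

def adapt_parameter(p_sdr: float, p_hdr: float, display_peak: float,
                    hdr_peak: float) -> float:
    """Linearly blend a metadata value between its SDR and HDR endpoints."""
    w = adaptation_weight(display_peak, hdr_peak)
    return (1.0 - w) * p_sdr + w * p_hdr

w_100 = adaptation_weight(100.0, 1000.0)    # 0.0: SDR pass-through expected
w_1000 = adaptation_weight(1000.0, 1000.0)  # 1.0: full HDR reconstruction
```

Under this sketch, a 100-nit display receives exactly the SDR metadata value, so the post-processor acts as a pass-through; the prior art behavior criticized above corresponds to that property failing at and near 100 nits.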
It can therefore be appreciated that there is a need for a solution for reconstructing an HDR video adapted to the display on which it will be rendered, one that addresses at least some of the problems of the prior art, particularly on displays with low peak luminance. The present disclosure provides such a solution.