As used herein, the term ‘dynamic range’ (DR) may relate to a capability of the human visual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest grays (blacks) to brightest whites (highlights). In this sense, DR relates to a ‘scene-referred’ intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a ‘display-referred’ intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g. interchangeably.
As used herein, the term high dynamic range (HDR) relates to a DR breadth that spans some 14-15 orders of magnitude of the human visual system (HVS). In practice, the DR over which a human may simultaneously perceive an extensive breadth in intensity range may be somewhat truncated, in relation to HDR. As used herein, the terms enhanced dynamic range (EDR) or visual dynamic range (VDR) may individually or interchangeably relate to the DR that is perceivable within a scene or image by a human visual system (HVS) that includes eye movements, allowing for some light adaptation changes across the scene or image. As used herein, EDR may relate to a DR that spans 5 to 6 orders of magnitude. Thus while perhaps somewhat narrower in relation to true scene referred HDR, EDR nonetheless represents a wide DR breadth and may also be referred to as HDR.
In practice, images comprise one or more color components (e.g., luma Y and chroma Cb and Cr) wherein each color component is represented by a precision of n-bits per pixel (e.g., n=8). Using linear luminance coding, images where n≤8 (e.g., color 24-bit JPEG images) are considered images of standard dynamic range, while images where n>8 may be considered images of enhanced dynamic range. EDR and HDR images may also be stored and distributed using high-precision (e.g., 16-bit) floating-point formats, such as the OpenEXR file format developed by Industrial Light and Magic.
A reference electro-optical transfer function (EOTF) for a given display characterizes the relationship between color values (e.g., luminance) of an input video signal to output screen color values (e.g., screen luminance) produced by the display. For example, in Ref.[1], ITU Rec. BT. 1886 defines the reference EOTF for flat panel displays based on measured characteristics of the Cathode Ray Tube (CRT). Given a video stream, information about its EOTF is typically embedded in the bit stream as metadata. As used herein, the term “metadata” relates to any auxiliary information that is transmitted as part of the coded bitstream and assists a decoder to render a decoded image. Such metadata may include, but are not limited to, color space or gamut information, reference display parameters, and auxiliary signal parameters, as those described herein.
Most consumer desktop displays currently support luminance of 200 to 300 cd/m2 or nits. Most consumer HDTVs range from 300 to 500 nits with new models reaching 1000 nits (cd/m2). Such conventional displays thus typify a lower dynamic range (LDR), also referred to as a standard dynamic range (SDR), in relation to HDR or EDR. As the availability of HDR content grows due to advances in both capture equipment (e.g., cameras) and HDR displays (e.g., the PRM-4200 professional reference monitor from Dolby Laboratories), HDR content may be color graded and displayed on HDR displays that support higher dynamic ranges (e.g., from 1,000 nits to 5,000 nits or more). Such displays may be defined using alternative EOTFs that support high luminance capability (e.g., 0 to 10,000 nits). An example of such an EOTF is defined in SMPTE ST 2084:2014 (Ref.[2]). In general, without limitation, the methods of the present disclosure relate to any dynamic range higher than SDR.
As used herein, the term “reshaping” refers to a pre-processing operation on an HDR image, such as scaling, quantization, and the like, to map it from its original bit depth to an image of the same or lower bit depth, to allow for more efficient coding using existing coding standards and devices. ‘Forward reshaping’ parameters used by an encoder may be communicated to a receiver as part of the coded bitstream using metadata so that a compliant decoder may apply an ‘inverse’ or ‘backward reshaping’ operation to reconstruct the original signal at its full dynamic range. Reshaping may be applied to any one or all of the color components of an HDR signal. In some embodiments, reshaping may also be constrained by the requirement to preserve on the decoded image the artistic intent of the original, for example, in terms of the accuracy of colors or “look,” as specified by a colorist under the supervision of the director.
Existing reshaping techniques are typically scene-based. As used herein, the term “scene” for a video sequence (a sequence of frames/images) may relate to a series of consecutive frames in the video sequence sharing similar luminance, color and dynamic range characteristics. Scene-based methods work well in video-workflow pipelines which have access to the full scene; however, it is not unusual for content providers to use cloud-based multiprocessing, where, after dividing a video stream into segments, each segment is processed independently by a single computing node in the cloud. As used herein, the term “segment” denotes a series of consecutive frames in a video sequence. A segment may be part of a scene or it may include one or more scenes. Thus, processing of a scene may be split across multiple processors. To improve existing coding schemes, as appreciated by the inventors here, improved techniques for segment-based reshaping of HDR video are needed.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, issues identified with respect to one or more approaches should not assume to have been recognized in any prior art on the basis of this section, unless otherwise indicated.