Display technologies being developed by Dolby Laboratories, Inc., and others are able to reproduce images having high dynamic range (HDR). Such displays can reproduce images that more faithfully represent real-world scenes than conventional displays, which are characterized by approximately three orders of magnitude of dynamic range (e.g., standard dynamic range, SDR).
As used herein, the term ‘dynamic range’ (DR) may relate to a capability of the human psychovisual system (HVS) to perceive a range of intensity (e.g., luminance, luma) in an image, e.g., from darkest darks to brightest brights. In this sense, DR relates to a ‘scene-referred’ intensity. DR may also relate to the ability of a display device to adequately or approximately render an intensity range of a particular breadth. In this sense, DR relates to a ‘display-referred’ intensity. Unless a particular sense is explicitly specified to have particular significance at any point in the description herein, it should be inferred that the term may be used in either sense, e.g., interchangeably.
As used herein, the term high dynamic range (HDR) relates to a DR breadth that spans some 14-15 orders of magnitude of the human visual system (HVS). For example, well adapted humans with essentially normal vision (e.g., in one or more of a statistical, biometric or ophthalmological sense) have an intensity range that spans about 15 orders of magnitude. Adapted humans may perceive dim light sources of as few as a mere handful of photons. Yet, these same humans may perceive the near painfully brilliant intensity of the noonday sun in desert, sea or snow (or even glance into the sun, however briefly, to prevent damage). This span, though, is available only to ‘adapted’ humans, e.g., those whose HVS has a time period in which to reset and adjust.
In contrast, the DR over which a human may simultaneously perceive an extensive breadth in intensity range may be somewhat truncated in relation to HDR. As used herein, the terms ‘enhanced dynamic range’ (EDR), ‘visual dynamic range,’ or ‘variable dynamic range’ (VDR) may individually or interchangeably relate to the DR that is simultaneously perceivable by the HVS. As used herein, EDR may relate to a DR that spans 5-6 orders of magnitude. Thus, while perhaps somewhat narrower in relation to true scene-referred HDR, EDR nonetheless represents a wide DR breadth. As used herein, the term ‘simultaneous dynamic range’ may relate to EDR.
To support backwards compatibility with existing 8-bit video codecs, such as those described in the ISO/IEC MPEG-2 and MPEG-4 specifications, as well as new HDR display technologies, multiple layers may be used to deliver HDR video data from an upstream device to downstream devices. In one approach, generating an 8-bit base layer (BL) version from the captured HDR version may involve applying a global tone mapping operator (TMO) to intensity (e.g., luminance, luma) related pixel values in the HDR content with higher bit depth (e.g., 12 or more bits per color component). In another approach, the 8-bit base layer may be created using an adaptive linear or non-linear quantizer. Given a BL stream, a decoder may apply an inverse TMO or a base layer-to-EDR predictor to derive an approximated EDR stream. To enhance the quality of this approximated EDR stream, one or more enhancement layers (EL) may carry residuals representing the difference between the original HDR content and its EDR approximation, as it will be recreated by a decoder using only the base layer.
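The dual-layer encode/decode path described above can be illustrated with a minimal sketch. The power-law tone mapping operator, bit depths, and function names below are assumptions chosen for illustration only; they are not the TMO of any particular codec, but they show how a BL sample and an EL residual together allow exact reconstruction of the HDR sample.

```python
# Hypothetical sketch of dual-layer HDR delivery: a global TMO maps
# high-bit-depth HDR luma to an 8-bit base layer (BL), and the
# enhancement layer (EL) carries the residual between the original HDR
# sample and its inverse-TMO approximation. The power-law curve is an
# illustrative stand-in for a real tone mapping operator.

def tmo(hdr_code, hdr_bits=12, bl_bits=8, gamma=2.4):
    """Global TMO: normalize, apply a power curve, requantize to 8 bits."""
    x = hdr_code / ((1 << hdr_bits) - 1)       # normalize to [0, 1]
    y = x ** (1.0 / gamma)                     # compress highlights
    return round(y * ((1 << bl_bits) - 1))     # 8-bit BL code word

def inverse_tmo(bl_code, hdr_bits=12, bl_bits=8, gamma=2.4):
    """Decoder-side inverse TMO: approximate the HDR code word from BL."""
    y = bl_code / ((1 << bl_bits) - 1)
    x = y ** gamma
    return round(x * ((1 << hdr_bits) - 1))

def encode_sample(hdr_code):
    """Return (BL sample, EL residual) for one luma sample."""
    bl = tmo(hdr_code)
    residual = hdr_code - inverse_tmo(bl)      # carried in the EL
    return bl, residual

def decode_sample(bl, residual):
    """Reconstruct the HDR sample from BL plus EL residual."""
    return inverse_tmo(bl) + residual

bl, res = encode_sample(3000)                  # a 12-bit HDR sample
assert decode_sample(bl, res) == 3000          # exact with a full residual
```

A legacy decoder would use only the `tmo` output (the BL); an advanced decoder additionally applies the residual to approach the original HDR signal.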
Some decoders, for example those referred to as legacy decoders, may use the base layer to reconstruct an SDR version of the content to be displayed on standard dynamic range displays. Advanced decoders may use both the base layer and the enhancement layers to reconstruct an EDR version of the content to render it on more capable displays. Improved techniques for layered coding of EDR video are used for efficient video coding and a superior viewing experience. Such techniques use advanced encoders which encode image information in a non-backward-compatible format that legacy decoders cannot process. More information on advanced encoders and associated decoders (e.g., codecs) can be found, for example, in the '932 application and the '926 application, which describe backward and non-backward compatible codecs developed by Dolby. Such advanced codecs which encode the image information in a non-backward-compatible format can be referred to as “layer decomposed” codecs.
A visual dynamic range (VDR) codec, such as a layer-decomposed codec, can comprise three basic streams in a corresponding VDR combination (combo) stream, namely a base layer (BL) stream, an enhancement layer (EL) stream, and a reference picture unit (RPU) stream. Bit errors (e.g., due to packet loss) can occur during transmission of the combo stream, such that some bits in some streams, for example in a portion of a stream (e.g., a packet), are corrupted. The BL and EL are compressed video streams encoded using any legacy video codec (such as MPEG-2/AVC/HEVC), and thus exhibit decoding dependency characteristics. In other words, a bit error could cause a decoding failure not only in a current block and frame, but could also propagate the decoding error to the following dependent frames. The RPU stream contains the composing metadata which can be used to transform BL and EL decoded pictures to the VDR domain and combine the transformed data so as to provide the final VDR signal for viewing on a compatible display. The composing parameters can comprise mapping/prediction parameters for the BL, non-linear de-quantizer parameters for the EL, and color space mapping parameters. The RPU can be encoded as frame based (e.g., one specific RPU per frame), which avoids the decoding dependency, but a bit error in any frame could result in loss of the composing parameters (e.g., content) of that frame's RPU and lead to erroneous reconstruction of content frames (e.g., based on the EL and BL bitstreams) in the decoder. To provide an enjoyable viewing experience, an error control mechanism is needed to conceal the errors caused during transmission of the bitstream (e.g., the combo stream). More information regarding the RPU stream content and a corresponding syntax can be found, for example, in the '397 application.
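The role of the per-frame RPU can be sketched as follows. The parameter names (`gain`, `offset`, `nlq_scale`, `nlq_pivot`) and the linear prediction and de-quantization forms are hypothetical simplifications, not the actual RPU syntax; the point is that each frame's composing step depends only on that frame's RPU, so a corrupted RPU damages one frame's composition rather than propagating like a BL/EL bit error.

```python
# Illustrative (hypothetical) per-frame RPU composing: the RPU supplies
# a BL-to-VDR prediction and an EL de-quantization, and the composer
# sums the two. There is no inter-frame dependency in this step.

def compose_vdr_frame(bl_pixels, el_pixels, rpu):
    """Combine decoded BL and EL samples into a VDR frame using one RPU."""
    # BL-to-VDR prediction (simplified here to a linear mapping)
    predicted = [rpu["gain"] * p + rpu["offset"] for p in bl_pixels]
    # EL non-linear de-quantization (simplified to scale about a pivot)
    residual = [rpu["nlq_scale"] * (e - rpu["nlq_pivot"]) for e in el_pixels]
    return [p + r for p, r in zip(predicted, residual)]

# Each frame carries its own RPU, so losing this frame's RPU affects
# only this frame's composing parameters, not the following frames.
rpu_frame = {"gain": 16.0, "offset": 0.0, "nlq_scale": 0.5, "nlq_pivot": 128}
vdr = compose_vdr_frame([100, 200], [130, 126], rpu_frame)  # [1601.0, 3199.0]
```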
As compared to codecs used in a traditional single-layer video transmission and/or in a traditional SDR/spatial-scalable/temporal-scalable dual-layer video transmission, a layer-decomposed codec, such as described in the '932 application, has some unique features. For example, a highlight part of a scene (e.g., or a corresponding content frame) is encoded in the EL stream and a dark/mid-tone part of the scene (e.g., or a corresponding content frame) is encoded in the BL stream, using, for example, a clipping method as described in the '932 application. At the decoder, the BL and EL streams are decoded independently, and information pertinent to an original image (e.g., a content frame) co-exists in both layers (e.g., non-redundant information). Such non-redundant information and independent decoding can be used in developing methods and systems to conceal damaged pictures; such methods, when embedded within, for example, an error control module, can result in a more enjoyable viewing experience of the decoded video stream.
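The clipping-based decomposition described above can be sketched minimally. The clipping point and function names are assumptions for illustration, not the specific method of the '932 application; the sketch shows why the two layers carry non-redundant parts of the signal and can be decoded independently.

```python
# A minimal sketch of layer decomposition by clipping: samples at or
# below a clipping point go to the BL (darks/mid-tones); the excess
# above the point goes to the EL (highlights). The threshold is a
# hypothetical 8-bit clipping point chosen for illustration.

CLIP = 255

def decompose(sample):
    """Split one high-bit-depth sample into (BL part, EL part)."""
    bl = min(sample, CLIP)        # darks and mid-tones; highlights clip
    el = max(sample - CLIP, 0)    # highlight excess carried in the EL
    return bl, el

def recombine(bl, el):
    """Decoder side: the layers are non-redundant, so they simply add."""
    return bl + el

assert recombine(*decompose(300)) == 300   # highlight sample uses both layers
assert recombine(*decompose(120)) == 120   # mid-tone sample; EL part is 0
```

Because the highlight information lives only in the EL and the dark/mid-tone information only in the BL, a loss in one layer leaves the other layer's portion intact, which is what an error concealment module can exploit.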