Conventionally, the dynamic range of reproduced images has tended to be substantially reduced in relation to normal vision. Indeed, luminance levels encountered in the real world span a dynamic range as large as 14 orders of magnitude, varying from a moonless night to staring directly into the sun. Instantaneous luminance dynamic range and the corresponding human visual system response can fall between 10,000:1 and 100,000:1 on sunny days or at night (bright reflections versus dark shadow regions). Traditionally, the dynamic range of displays has been confined to about 2-3 orders of magnitude, and sensors have also had a limited range, e.g. <10,000:1 depending on noise acceptability. Consequently, it has traditionally been possible to store and transmit images in 8-bit gamma-encoded formats without introducing perceptually noticeable artifacts on traditional rendering devices. However, in an effort to record more precise and livelier imagery, novel High Dynamic Range (HDR) image sensors that are capable of recording dynamic ranges of more than 6 orders of magnitude have been developed. Moreover, most special effects, computer graphics enhancement and other post-production work are already routinely conducted at higher bit depths and with higher dynamic ranges.
Furthermore, the contrast and peak luminance of state-of-the-art display systems continue to increase. Recently, new prototype displays have been presented with a peak luminance as high as 3000 cd/m2 and contrast ratios of 5-6 orders of magnitude (display native; the viewing environment will also affect the finally rendered contrast ratio, which for daytime television viewing may even drop below 50:1). It is expected that future displays will be able to provide even higher dynamic ranges and specifically higher peak luminances and contrast ratios.
HDR images may for example be generated by combining a plurality of low-dynamic range (LDR) images. For example, three LDR images may be captured with different ranges and the three LDR images may be combined to generate a single image with a dynamic range equal to the combination of the dynamic ranges of the individual LDR images.
In order to successfully introduce HDR imaging and to fully exploit the promise of HDR, it is important that systems and approaches are developed which can handle the increased dynamic range. Furthermore, it is desirable if functions are introduced which allow various elements and functions of LDR image processing to be re-used with HDR. E.g. it would be desirable if some of the interfaces, communication means or distribution media defined for LDR could be reused for HDR images.
One important feature associated with HDR imaging is that of how to efficiently encode HDR image data.
Recently, several HDR encoding technologies have been proposed, e.g. the dual layer method of Dolby as disclosed in WO2005/1040035.
In order to e.g. process HDR images efficiently, it is in many scenarios important that the larger dynamic range of HDR, typically represented by a relatively large number of bits, is converted into a representation using a substantially reduced number of bits.
For example, in some scenarios, it may be advantageous to view HDR images on a display having an input interface developed for LDR. Thus, it may be desirable to generate values that can be treated as LDR values e.g. by the display interface. In other scenarios, it may be desirable to encode the HDR values with lower bit rates and with some backwards compatibility to LDR.
In order to represent LDR images in a suitable format, a code allocation function is often employed which maps from HDR linear luminance values to suitable quantized luma codes. The HDR linear luminance values are often represented as e.g. floating point values with a relatively high number of bits per value (e.g. 16 bits). In contrast, the quantized luma codes typically represent luma values by a relatively low number of bits (e.g. 8 bits), and often as integer values.
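A code allocation function of this kind can be sketched as follows. The sketch assumes a simple power-law (gamma 2.2) curve, a 100 cd/m2 peak and 8-bit integer codes; the function names and default parameters are illustrative assumptions, not a definition from the text.

```python
def luminance_to_luma(luminance, peak_luminance=100.0, gamma=2.2, bits=8):
    """Map a linear luminance (floating point) to an integer luma code."""
    max_code = (1 << bits) - 1
    # Normalize to [0, 1] and clip values outside the representable range.
    normalized = min(max(luminance / peak_luminance, 0.0), 1.0)
    # Non-linear mapping followed by linear quantization to an integer code.
    return round(max_code * normalized ** (1.0 / gamma))


def luma_to_luminance(code, peak_luminance=100.0, gamma=2.2, bits=8):
    """Approximate inverse: map an integer luma code back to linear luminance."""
    max_code = (1 << bits) - 1
    return peak_luminance * (code / max_code) ** gamma


black_code = luminance_to_luma(0.0)
peak_code = luminance_to_luma(100.0)
clipped_code = luminance_to_luma(250.0)  # above the assumed peak, so it clips
```

The inverse function recovers the luminance only up to the quantization error introduced by rounding to an integer code.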
The difference between LDR and HDR is not just the size of the dynamic range. Rather, the relative distribution of intensities in most scenes is also substantially different for LDR and HDR representations.
Indeed, HDR images/video typically have a different intensity distribution than conventional (LDR) images/video. In particular, the peak-to-average luminance ratio of high-dynamic-range image data is much higher. Therefore, the currently applied code allocation curves or electro-optical transfer functions (EOTFs) tend to be sub-optimal for HDR data. Thus, if a conventional LDR mapping from HDR luminance values to encoded luma values is used, a significant image degradation typically occurs. For example, most of the image content can only be represented by a few code values, as a large number of codes are reserved for the increased brightness range, which is however typically only used for a few very bright image objects.
As an example of a practical scenario, color grading (see reference [1]) or color correction is an integral part of commercial film or photography production. In particular, it is part of the post-production stage (reference [2]). The color grading artist, or colorist, operates in a color grading suite, which provides color grading/correction tools as well as a real-time preview of the effects of color grading tool operations on the image or video being graded/corrected.
With the introduction of HDR cameras and displays for shooting and displaying HDR images and video, the color grading suite also has to be made suitable for grading this high-dynamic-range content. To facilitate the introduction of HDR, it is beneficial to enable the color grading/correction with minimal changes to existing tools.
Current standard dynamic range video, intended to be displayed on a reference monitor of e.g. 100 cd/m2 peak brightness, is usually encoded in current standard luma/luminance domains, which are specified using their log curves or EOTFs (electro-optical transfer functions). Examples of this are the curves used for sRGB (reference [4]) or ITU Rec. 709 (reference [5]) logarithmic data. The video data is sent in this logarithmic domain from the color grading tool (e.g. software on a PC) over a hardware interface (typically HD-SDI) to the preview display. The bit depth of the hardware interface is usually limited to e.g. 8 or 10 bits.
HDR images/video typically have a different brightness (e.g. when defined as display rendered luminance) distribution than current standard dynamic range images. For example, while the current video content distribution typically peaks around 20% of peak brightness (which means that the luma codes are nicely spread around the half of e.g. 255 values), HDR content may often peak around a much lower percentage, e.g. 1%, of peak brightness (data of at least the darker regions of the HDR images spread around the code 1/100th of code maximum). Thus, most of the relevant HDR content will be contained in only a few of the 8-bit or 10-bit video levels when it is encoded using current standard log curves. This will lead to severe and unacceptable quantization artifacts in the preview image, thus preventing the colorist from color grading/correcting HDR images.
Accordingly, if conventional code allocation functions are used for HDR images in order to generate suitable codes for existing displays with such 8-bit or 10-bit input formats, a substantially reduced quality of the displayed image will result, with e.g. most of the intensities present in the image being distributed over only a few input levels.
The code allocation functions mapping linear light luminances, as they are to be seen upon display rendering, to actual technical codes, or vice versa, have however largely been based upon LDR models (like gamma 2.2), and were optimal only for LDR displays with a peak brightness of around 100 nit or cd/m2 (henceforth both the terms nit and cd/m2 will be used). If one codes HDR video so coarsely for transmission over a connection cable to an HDR display (e.g. with a peak brightness of 5000 nit), one risks seeing artifacts, such as banding in the darker parts of the video (e.g. banding in a dark blue sky, especially for fades).
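The effect can be made concrete with a small calculation. The sketch below counts how many 8-bit codes a conventional curve assigns to the range below a given fraction of peak brightness. The gamma 2.2 curve is an assumption standing in for the "current standard" curves, and codes_below is a hypothetical helper name.

```python
def codes_below(fraction_of_peak, gamma=2.2, bits=8):
    """Count the luma codes allocated to luminances up to fraction_of_peak."""
    max_code = (1 << bits) - 1
    top_code = round(max_code * fraction_of_peak ** (1.0 / gamma))
    return top_code + 1  # codes 0..top_code inclusive


ldr_like = codes_below(0.20)  # content concentrated below 20% of peak
hdr_like = codes_below(0.01)  # content concentrated below 1% of peak
```

Under these assumptions, content concentrated below 20% of peak has roughly half of the 256 codes available, whereas content concentrated below 1% of peak has only about 32 codes, which is consistent with the quantization problem described above.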
Accordingly, in order to e.g. enable color grading of HDR images using the current color grading tools and interfaces, a different code allocation curve should be used for encoding the video data, such that a sufficient number of quantization levels is assigned to the most important video data.
However, finding a suitable code allocation function is not only critical but also difficult. A challenge when determining code allocation functions is how best to map between the input luminance values and the luma codes. This is a critical issue, as the selected mapping has a strong impact on the resulting quality (e.g. due to quantization error). Furthermore, the impact on image quality may be dependent on the characteristics and properties of the images being encoded/decoded as well as the equipment used for rendering the images.
Of course, the simplest approach would be to use a uniform quantization. However, such an approach tends to result in suboptimal performance in many scenarios. Accordingly, code allocation functions have been developed wherein a non-uniform quantization has been applied. This may specifically be performed by applying a non-linear function (luma code mapping/tone mapping function) to the input luminance values followed by a linear quantization. However, as mentioned, it has been found that the defined functions in many scenarios provide a suboptimal result. For example, applying a code allocation function to HDR images in order to e.g. allow these to be processed by LDR circuits with a relatively low number of bits per value (typically 8 bits) tends to result in suboptimal conversion of the HDR image and specifically in the image values being concentrated around a few quantization levels/codes.
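The difference between uniform quantization and a non-linear function followed by linear quantization can be sketched by comparing the round-trip error for a dark value. The power-law curve, the 8-bit depth and the helper names below are assumptions chosen for illustration; any monotonic non-linear function could take the place of the power law.

```python
def quantize_roundtrip(value, bits=8, gamma=1.0):
    """Apply value**(1/gamma), quantize linearly, then invert the mapping.

    value is a normalized luminance in [0, 1]; gamma=1.0 corresponds to
    plain uniform quantization.
    """
    max_code = (1 << bits) - 1
    code = round(max_code * value ** (1.0 / gamma))
    return (code / max_code) ** gamma


def relative_error(value, bits=8, gamma=1.0):
    """Relative reconstruction error after the quantization round trip."""
    return abs(quantize_roundtrip(value, bits, gamma) - value) / value


dark_value = 0.001  # 0.1% of peak luminance
uniform_error = relative_error(dark_value, gamma=1.0)
nonuniform_error = relative_error(dark_value, gamma=2.2)
```

With uniform 8-bit quantization the dark value falls below half a code step and is reconstructed as zero, while the non-linear mapping keeps the relative error small for the same number of bits. The same comparison does not show that any particular curve is well suited to HDR content, which is the issue discussed in the text.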
Although it may be possible to develop and define explicit functions that are specifically optimized for e.g. HDR images, this may be impractical in many scenarios. Indeed, such an approach requires individual and specialized functions to be developed for each scenario. Furthermore, it typically requires a large number of possible functions to be available for selection in order to compensate for differences in the images and/or equipment.
This further complicates operation and introduces additional resource requirements.
E.g. using dedicated functions not only requires the encoder to communicate which specific functions are used but in addition both the encoder and decoder need to store local representations of all possible functions. This will substantially increase the memory storage requirements. Another option would be for the encoder to encode data which fully defines the code allocation function used but such an approach will substantially increase the data rate.
Furthermore, using dedicated and explicit HDR code allocation functions will in many scenarios require substantial work in standardizing and specifying suitable functions. In addition, backwards compatibility will be a problem, as existing equipment, e.g. existing circuitry, will not be able to support new functions defined specifically to support HDR images.
Accordingly, an improved approach for providing and/or generating code allocation functions would be advantageous.