In color processing technology, a useful category of color mappings is one in which the luminance of a pixel color changes, yet the intrinsic color itself, which can be characterized e.g. with a chromaticity like CIE 1976 (u′,v′), is the same for the resultant output color and the to-be-processed input color. This is in fact the color mapping corresponding to lighting changes in nature: illuminating an object (having some reflection spectrum) with more light of a given light spectrum produces colors with an increasing luminance (or another luminance correlate like e.g. a luma after conversion to this correlate) yet with the same chromaticity. This kind of processing is useful for a technology which has recently become important, namely dynamic range mapping for image(s) and video. Currently displays are getting higher peak brightness (PB) values than the legacy so-called low dynamic range (LDR) displays, which have a PB of around 100 nit. One can speak of a high dynamic range (HDR) display if it has a PB of at least 1000 nit, and typically displays of e.g. 2000, 5000 or 10000 nit are envisaged for the near future.
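The chromaticity-preserving property mentioned above can be checked numerically. The following is only a minimal sketch using the standard CIE 1976 definitions u′ = 4X/(X+15Y+3Z) and v′ = 9Y/(X+15Y+3Z); the function names and the XYZ sample values are illustrative assumptions and do not come from the described system.

```python
def uv_prime(X, Y, Z):
    """CIE 1976 (u', v') chromaticity from CIE XYZ tristimulus values."""
    d = X + 15.0 * Y + 3.0 * Z
    return 4.0 * X / d, 9.0 * Y / d


def scale_xyz(xyz, k):
    """Model 'more light of the same spectrum': scale all tristimulus values by k."""
    return tuple(k * c for c in xyz)


# Arbitrary illustrative color (assumed values, not taken from the text).
color = (0.30, 0.40, 0.20)
brighter = scale_xyz(color, 3.0)

u1, v1 = uv_prime(*color)
u2, v2 = uv_prime(*brighter)

# Luminance Y triples, while (u', v') stays the same up to rounding.
print(color[1], brighter[1])
print((u1, v1), (u2, v2))
```

Because the scale factor appears in both numerator and denominator of u′ and v′, it cancels, which is why a pure luminance scaling leaves the chromaticity unchanged.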
The same image cannot be rendered on both a LDR display and a HDR display and have a good look on both. E.g., if a HDR image (which is an image which may typically have object pixel luminances which have a luminance ratio of at least 1000:1, and typically a luminance distribution with both a significant amount of bright pixels, and a significant amount of dark pixels, i.e. approximately a thousand times darker than the bright ones) is rendered unmapped on a LDR display, part of the scene will yield indiscriminable black objects. And vice versa, an LDR image may have objects which look undesirably bright when directly rendered on a display of say 3000 nit PB. So one must color map a HDR grading image of a captured (or computer generated) scene, which corresponds to a HDR reference monitor, and is suitable for being rendered on displays of higher PB (than e.g. 1000 nit, or an associated minimum usable PB), to an LDR image, associated with an LDR reference monitor (and the dynamic ranges of the images, or more precisely their associated reference monitors, are different by a factor of at least 1.5, in other words approximately at least one stop or factor 2, but it may also be e.g. a factor 4 or 10 or larger). It may be noted that with the LDR and HDR images (or their reference monitor), there may also be different standard electro-optical transfer functions (EOTF) defined, which fix the relationship between technical luma codes for image transmission, and the actual luminances which correspond to those lumas when rendering on a reference monitor. LDR may use e.g. the legacy Rec. 709 gamma-type EOTF, and HDR image(s)/video may typically be encoded according to an EOTF which has at least partially an exponential (inverse logarithmic) character in its functional shape definition, e.g. following human vision characteristics like a Barten contrast sensitivity function. 
Incidentally, we would like to emphasize that this also means that a HDR image does not necessarily have a larger number of bits per color component than an LDR image. They may both be defined in e.g. a 3×10 bits RGB format (which can be interpreted as a [0-1.0]-scaled component definition, which we assume will be the codification of images before color mapping, whatever the original format and the final format of the result to be transmitted are), the difference only residing in how the encoded colors have to be interpreted, i.e. according to the luminance or luma distribution defined with the given respective EOTF (and optimally color graded to yield the good corresponding look, i.e. with different object luminances for the same objects in both gradings).
A prior art which is best for elucidating the present invention and its technical contribution is previous research from applicant on a HDR-capable video codec, which we herewith further improve. Applicant has created a coding system, which allows one to encode at least two (or more) gradings of a scene, one typically being an LDR grading, and a second one being of higher dynamic range which is typically an HDR grading. The encoding works by encoding one of the two gradings as an actual image, which typically can be done by using classical video encoding containers, from a video coding standard like e.g. MPEG-HEVC. In some variants the data is reformatted in a non-compliant manner, e.g. the YCrCb color planes are filled with Yu′v′ data, but as long as the data fits the amount of available memory space, and the original images can be decoded, such principles can be used, and are compatible with legacy technology at least for those technical components which don't need to do the final decoding, like e.g. a satellite transmission system, etc.
FIG. 1 shows an example of such an encoding apparatus 100. An image source 120, say a hard disk, delivers an input image Im_R1 of a first luminance dynamic range, say a HDR grading (a grading being a determination of the object luminances in the image so that to the creating artists they look correct when rendered on the associated reference display, the PB of which is typically co-encoded with the image as image-kind describing metadata). By means of a color mapper 121, the grader can choose, via user data UI from e.g. a color grading keyboard and in a grading software, one or more functions and their parameters to derive a second graded image for each HDR image (i.e. Im_R1), say an LDR image. Say he creates e.g. the LDR image by applying some S-curve, which function we call F, and which transforms all input colors Ci of Im_R1 into resultant output colors Co of a resultant image I_mo (of course various color mappings can be done, e.g. a saturation correction, etc.). As output to our encoder 101, one of the two images Im_R1 and I_mo is selected as the basic image encoding of the scene, i.e. encoding the geometric look of all objects, and in addition having a correct colorimetric look for one corresponding reference monitor. Also, all data required for uniquely specifying the color transformation is transmitted to the encoder 101 as function metadata F, which for the S-curve example may be e.g. the endpoints of a middle slope part, i.e. two times 2 coordinates. The encoder 101 represents this data as numbers according to a prescribed format, e.g. it makes DCT-based image components of Im_R2 and stores the functional metadata in SEI messages or similar, and via a data output 130 sends it to a data communication technology 131 (which in the Figure is e.g. a BD disk, but this can also be an internet server for VOD, etc.).
Such a dynamic range transformation is far from obvious, however. In the real physical world it would only involve scaling the luminance as said above, but actual technologies have to deal with technical limitations. Instead of allowing pure scaling, the color transformations for RGB displays can scale almost nothing in a simple manner. Not only does an actual display have a limited color gamut because of its fixed maximal amount of backlight (or driving for non-backlit displays), a gamut which is tent-shaped, and in fact a highly skewed tent which is much lower at the blue primary than near the yellows, but all mathematical RGB spaces have the same property. So mere scaling risks arriving at non-reproducible colors, which without careful handling typically results in clipping, and hence in color errors (at least saturation changes, but likely also hue changes). Artistically this may not be what a grader desires, e.g. that his nicely chosen orange-ish color for say a starfish suddenly becomes predominantly yellowish in the other grading. Ideally, the grader would not mind, and would even expect, a luminance change, but he would like that particular carefully chosen orange to stay the same in all gradings calculated from his master grading (say a HDR master grading).
In WO2014/056679 applicant describes a luminance changing transformation which allows one to specify a luminance mapping strategy for various possible object luminances in an input image (Im_R1), which yields different output luminances but the same color chromaticities. I.e. this is a framework to specify specific variants of the functions F. FIG. 2 of this patent resummarizes key aspects of the principle. Any input pixel color is transformed to a linear RGB representation. We only show the core components for elucidating the principle, and of course there may have been various color transformations prior to determining the red (R), green (G) and blue (B) color components, e.g. those of the basic image (Im_R2) may have been saturation processed etc. The color mapper 200 may be a component of a decoder which calculates the original image(s) based on the received basic image and functional color mapping metadata of at least a luminance mapping function (F), but it may similarly be incorporated in an encoder, when the grader is still trying various possible functions F for the best look, and ultimately outputting the data for that best look over the data output 130. Maximum calculation unit 201 calculates which of the three components is for this pixel the highest, which may e.g. be the blue component if the object wherein the pixel resides is blue. This yields the maximum M. Then a predetermined function F (which realizes the dynamic range changing required color mapping, as far as the brightness part is concerned) is applied to M, yielding F(M). In the example of FIG. 2 this function boosts the darkest colors, then keeps the middle colors approximately equal to their original values, and somewhat boosts the bright values again. This could be a useful function for e.g. mapping a HDR to a LDR grading. 
Instead of applying this function to the luminances themselves, brightness mapper 202 applies this function (which may be composed of partial functions, or realized as a LUT) to M. The scaling parameter calculator 203 calculates a scaling parameter a by dividing F(M) by M. Finally a multiplier 204 uses this scaling parameter a to multiply it with each color component (R, G and B), yielding appropriately scaled color components (Rs, Gs, Bs). By this action the correct output luminance is obtained for the resultant pixel color, but with preservation of the original chromaticity. The brightness mapper kind of de-skews the tent shape of the RGB gamut by using this principle. As long as one ensures that the mapping function 205 is so determined that for the maximum input (M=1.0) the output F(M) is not higher than 1, then for all chromaticities the processing is correctly scaled to the upper gamut boundary, so that no colorimetric errors due to clipping can occur.
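The processing chain of units 201 to 204 can be sketched in a few lines. This is a minimal illustration only, assuming linear RGB components normalized to [0, 1]; the names map_pixel and example_F, the square-root curve, and the handling of black pixels (M = 0) are assumptions made for the example and are not specified in the text.

```python
def map_pixel(rgb, F):
    """Scale a linear RGB pixel by a = F(M) / M, where M = max(R, G, B)."""
    M = max(rgb)                       # maximum calculation unit 201
    if M <= 0.0:
        return (0.0, 0.0, 0.0)         # assumed convention for black pixels
    a = F(M) / M                       # brightness mapper 202 and calculator 203
    return tuple(a * c for c in rgb)   # multiplier 204


def example_F(m):
    """Hypothetical brightening curve with F(1.0) = 1.0, so F(M) <= 1 on [0, 1]."""
    return m ** 0.5


pixel = (0.10, 0.20, 0.40)             # a bluish color, so M is the blue component
scaled = map_pixel(pixel, example_F)

# The largest component lands on F(M); the ratios R:G:B are unchanged.
print(scaled)
print(scaled[0] / scaled[2], pixel[0] / pixel[2])
```

Since all three components are multiplied by the same factor, the component ratios (and hence the chromaticity) are preserved, and because the largest output component equals F(M), keeping F(M) at or below 1 keeps every output color inside the normalized gamut.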
This is hence a powerful system which gives the grader great colorimetric control over the look of his images. It does come with a minor problem however. Although for many images the results are good, for some images a noise problem can be seen. Movies which are scanned from celluloid have film grain. This occurs e.g. in the darks in the negatives, and so in the brights in the positive (i.e. even if a master negative is scanned, after negative-to-positive calculation). Also the inherent and expected luminance-dependent photon noise, to which the human visual system has co-evolved to be less sensitive, gets redistributed to other grey values because of the inversion and any luminance mapping. In short, there may be noise in various objects where it is not desired. This was not annoyingly visible on legacy LDR displays, but since the high PB of HDR displays makes everything beautiful but also more visible, the grain will sometimes become objectionably visible. But also on LDR displays, the noise of a color-mapped HDR grading yielding the LDR grading can sometimes become highly visible.
The problem occurs because the scaling parameter a picks up the noise in the dominant color component, and so becomes noisy itself. Say we have a relatively uniform area (which is always a bad region as regards conspicuousness of noise), like the sky, or a near-neutral which is slightly blue. The maximum component for all or most of those pixels will hence be the blue component, which may typically be noisy for several types of image. The M value will then jitter around a value of say 0.7. The functional mapping may correspond to a multiplication by say 3. But not only that: if there is a sharp discontinuity in the function F around 0.7, the M values above 0.7 may be boosted by 3 and the values below 0.7 may e.g. be multiplied by 1. This produces a noisy effect on the scaled colors (Rs,Gs,Bs), which sometimes get considerably brightness boosted and sometimes not, on an interspersed pixel-by-pixel basis. I.e., sometimes the noise can become boosted to an unacceptable level, although in all other aspects the colorimetric look of the images is perfect. And of course, ideally the content creator does not want the receiving display to perform spatial blurring, since that may also decrease the sharpness of the desired parts of the image. The solution to this problem is not so straightforward. One might think one could smoothen the scaling parameters themselves e.g., but that doesn't seem to give good results. Herebelow we present a solution for this noise problem.
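The noise amplification described above can be reproduced with a toy example. This is a sketch under stated assumptions: the pixel values are invented, and the gains on either side of the discontinuity at 0.7 are set to 1.0 and 1.3 (rather than the factors 1 and 3 mentioned in the text) purely so that the outputs of this illustration stay within [0, 1]. The names step_gain and scale_pixel are hypothetical.

```python
def step_gain(m, threshold=0.7, low_gain=1.0, high_gain=1.3):
    """Scaling parameter a = F(M)/M for a function F with a jump at the threshold."""
    return high_gain if m >= threshold else low_gain


def scale_pixel(rgb):
    """Apply the max-component scaling with the discontinuous gain above."""
    a = step_gain(max(rgb))
    return tuple(a * c for c in rgb)


# Slightly bluish near-neutral area: red and green are constant, and the
# dominant blue component jitters around 0.7 (assumed noise samples).
blue_values = [0.69, 0.71, 0.695, 0.705, 0.70, 0.698]
pixels = [(0.60, 0.65, b) for b in blue_values]
scaled = [scale_pixel(p) for p in pixels]

in_spread = max(p[2] for p in pixels) - min(p[2] for p in pixels)
out_spread = max(s[2] for s in scaled) - min(s[2] for s in scaled)

# A jitter of 0.02 in the input becomes a much larger jitter in the output,
# and the noise-free red and green components now jump between pixels too.
print(in_spread, out_spread)
print(sorted({round(s[0], 3) for s in scaled}))
```

In this sketch the small blue jitter toggles the scaling parameter between its two values, so neighbouring pixels of a nominally uniform area end up with visibly different brightness in all three components.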