Recently a number of very different displays have appeared on the market, in particular television signal receiving displays (televisions) with very different peak brightness. Whereas in the past the peak brightness (PB) of so-called legacy low dynamic range (LDR) displays differed by at most a factor of about 2 (somewhere between 80 and 150 nit), the recent trend towards ever higher peak brightness has resulted in so-called high dynamic range (HDR) televisions of 1000 nit and above, and displays of 5000 nit PB, and it is assumed that various displays of such higher PBs will soon be on the market. Even in movie theaters, ways are being investigated to increase the ultimate brightness dynamic range perceived by the viewer. Compared to a standard legacy LDR TV of 100 nit, a display of e.g. 2000 nit has a factor 20 more peak brightness, which amounts to more than 4 additional stops being available, i.e. more ways to render brighter objects in various images. On the one hand, provided one also uses a new generation HDR image generation or capturing system, this allows for much better rendering of HDR scenes or effects. E.g., instead of (soft) clipping the sunny world outside a building or vehicle (as would happen in a legacy LDR grading), one can use the additional brightnesses available on the luminance axis of the HDR TV gamut to display bright and colorful outside areas. This means that the content creator, whom we will non-limitingly call the color grader (but who may be embodied in various manners, e.g. in a live television production), has room to make very beautiful dedicated HDR image or video content (typically brighter, maybe more contrasty, and more colorful).
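The stop count quoted above follows from a base-2 logarithm of the peak brightness ratio. The following is only an illustrative sketch of that arithmetic; the function name extra_stops and the default 100 nit reference are assumptions made here for illustration and are not part of the described system.

```python
import math


def extra_stops(pb_display_nit: float, pb_reference_nit: float = 100.0) -> float:
    """Number of stops (doublings of luminance) between two peak brightnesses.

    Hypothetical helper: one stop corresponds to a factor 2 in luminance,
    so the stop difference is log2 of the peak brightness ratio.
    """
    return math.log2(pb_display_nit / pb_reference_nit)


# A 2000 nit display versus a 100 nit LDR display: factor 20, i.e. a bit
# more than 4 stops.
stops_2000 = extra_stops(2000.0)
# A 5000 nit display versus the same 100 nit reference: factor 50.
stops_5000 = extra_stops(5000.0)
```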
On the other hand, however, this creates a problem: LDR image coding was designed relatively, starting from white, and with the scene well-illuminated according to a middle gray of 18% reflection, which means that display-rendered luminances below 5% of a relatively low PB of say 100 nit will typically be seen by the viewer as difficult to discriminate dark greys, or even, depending on surround illumination, as indiscriminable blacks. On a 5000 nit display there will be no problem with this optimally graded HDR image: 5% of 5000 nit is still 250 nit, so this will look like e.g. a normal interior, and the highest 95% of the luminance range could be used purely for HDR effects, like e.g. lamps, or regions close to such lamps, i.e. brightly lit regions. But on an LDR display the rendering of this HDR grading will go totally wrong (as it was not created for such a display), and the viewer may e.g. only see hot spots corresponding to the brightest regions on a near-black region. In general, re-gradings are needed for creating optimal images for displays which are sufficiently different (at least a factor 2 difference in PB). That applies both when re-grading an image made for a lower dynamic range display to make it suitable for rendering on a higher dynamic range display (which would be upgrading, e.g. input image(s) graded for a 1000 nit reference display, i.e. which would look optimal on an actual display of 1000 nit PB, and which are then color processed for rendering on an actual display of 5000 nit PB), and the other way around, i.e. downgrading an image so that it is suitable for display on an actual display of lower PB than the reference display associated with the grading which is coded as video images (and which images are typically transmitted in some manner to a receiving side). For conciseness we will only describe the scenario where an HDR image or images is to be downgraded to LDR.
HDR technology (by which we mean a technology which should be able to handle at least some HDR images, but which may work with LDR images, or medium dynamic range images, etc. as well) will percolate into various areas of both consumer and professional use (e.g. cameras, data handling devices like blu-ray players, televisions, computer software, projection systems, security or video conferencing systems, etc.), and these will need technology capable of handling the various aspects in different ways.
In WO2013/144809 (and WO2014/056679) applicant generically formulated a technique to perform color processing for yielding an image (Im_res) which is suitable for another display dynamic range than the reference display dynamic range associated with the input image (Im_in), i.e. the dynamic range, basically characterized by the PB, of a display for which the image was created as looking optimal. (Typically the PB suffices to characterize the different display dynamic ranges and hence the optimally graded images, since for several scenarios one may neglect the black point and assume it is pragmatically 0.) This technique forms good prior art upon which the invention elucidated below improves. We concisely reformulate the principles in FIG. 1, in a manner closer to current actual embodiments of the same principle. The various pixels of an input image Im_in are consecutively color processed by a color transformer 100, by multiplying their linear RGB values by a multiplication factor (a) in a multiplier 104, to get output colors RsGsBs of pixels in an output image Im_res. The multiplication factor is established from some tone mapping specification, which may typically be created by a human color grader, but could also come from an auto-conversion algorithm which analyses the characteristics of the image(s) (e.g. the histogram, or the color properties of special objects like faces, etc.). The mapping function may coarsely be e.g. gamma-like, so that the darker colors are boosted (which is needed to make them brighter and more contrasty for rendering on the LDR display), at the cost of a contrast reduction for the bright areas, which will become pastelized on LDR displays. The grader may further have identified some special object like a face, for whose luminances he has created an increased contrast part in the curve.
What is special now is that this curve is applied to the maximum of the R, G, and B color component of each pixel, named M (determined by maximum evaluation unit 101), by curve application unit 102 (which may cheaply be e.g. a LUT, which may be calculated e.g. per shot of images at a receiving side which does the color processing, after typically having received parameters encoding the functional shape of the mapping, e.g. a gamma factor). Then a multiplication factor calculation unit 103 calculates a suitable multiplication factor (a) for each currently processed pixel. This may e.g. be the output of the tone mapping function F applied to M, i.e. F(M), divided by M, if the image is to be rendered on a first target display, say e.g. a 100 nit LDR display. If an image is needed for e.g. an intermediate display, e.g. 800 nit PB (or another value, maybe higher than the reference display PB of the HDR input image Im_in), then a further function G may be applied to F(M)/M rescaling the amount of multiplicative mapping of the input color to the value appropriate for the display dynamic range for which the image is suited (whether it is directly rendered on the display, or communicated, or stored in some memory for later use).
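The per-pixel processing just described (maximum evaluation unit 101, curve application unit 102, multiplication factor calculation unit 103 and multiplier 104) may be sketched as follows. This is merely a minimal illustration under stated assumptions, not an actual embodiment: linear RGB is assumed to be normalized to [0, 1], the gamma-like curve and its exponent are hypothetical stand-ins for a grader-specified function F, and the optional callable G stands for the further rescaling function, whose shape is not specified here.

```python
from typing import Callable, Optional, Tuple

RGB = Tuple[float, float, float]


def gamma_curve(m: float, gamma: float = 1.0 / 2.4) -> float:
    """Hypothetical tone mapping F: a gamma-like curve that boosts darker
    values. Assumes m is normalized to [0, 1]; the exponent is illustrative."""
    return m ** gamma


def color_transform(
    rgb: RGB,
    F: Callable[[float], float] = gamma_curve,
    G: Optional[Callable[[float], float]] = None,
) -> RGB:
    """Scale a linear RGB pixel by a = F(M)/M, where M = max(R, G, B)."""
    r, g, b = rgb
    m = max(r, g, b)            # maximum evaluation (unit 101)
    if m <= 0.0:
        return (0.0, 0.0, 0.0)  # black stays black; avoids division by zero
    a = F(m) / m                # curve application and factor (units 102, 103)
    if G is not None:
        a = G(a)                # optional rescaling for an intermediate display
    return (a * r, a * g, a * b)  # common multiplication (multiplier 104)


# Example with a square-root curve as F: M = 0.25, F(M) = 0.5, so a = 2.
out = color_transform((0.25, 0.1, 0.05), F=lambda m: m ** 0.5)
```

Because all three components are multiplied by the same factor, the ratios between R, G and B are preserved in this sketch, and the largest output component equals F(M).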
The part we described so far constitutes a global color processing. This means that the processing can be done based solely on the particular values of the colors (and we will only focus on the luminances of those colors) of a consecutive set of pixels. So, if one just gets pixels from e.g. a set of pixels within a circular sub-selection of an image, the color processing can be done according to the above formulated principle. However, since human vision is very relative, also spatially relative, whereby the colors and brightnesses of objects are judged in relation to colorimetric properties of other objects in the image (and also in view of various technical limitations), more advanced HDR coding systems have an option to do local processing. In some image(s) one would like to isolate one or more object(s), like a lamp or a face, and do a dedicated processing on that object. However, in our technology, this forms part of an encoding of at least one further grading derivable from an image of pixels of a master grading (here LDR derived from HDR). Either the master grading or the derived grading may actually be communicated to a receiving side, as the images encoding the spatial structure, i.e. the objects, of the imaged scene, and if the color transformation functions encoding the relationship between the two looks are also communicated in metadata, then the other gradings can be re-calculated at a receiving side. I.e., the color processing is e.g. needed to construct an LDR image by decoding, if needed, in case HDR images have been received, or vice versa to reconstruct HDR images in case the LDR images of the pair of looks have been communicated or stored.
The fact that the local processing principle is used in an encoding technology has technical implications, inter alia that one needs a simple set of basic mathematical processing methods, since all decoding ICs or software out in the field need to implement this, and at an affordable price, to be able to understand the encoding and create the decoded LDR image(s). The simple principle that applicant introduced in WO2013/144809, which is not too expensive in number of calculations yet sufficiently versatile, does a grader-specified dual testing by a region evaluation unit 108. This unit evaluates both a geometric and a colorimetric condition. Geometrically, based on the coordinates of the current pixel (x,y), it checks e.g. whether the pixel is within a rectangle (x_s, y_s) to (x_e, y_e). Colorimetrically, it can e.g. check whether the luminance or max(R,G,B) is above a threshold (in which case the pixel is evaluated to belong to the region to be specially processed) or below (in which case it is not), or a more advanced evaluation of the color properties of the current pixel to be processed is performed. The color transformer 100 may then e.g. load another tone mapping LUT depending on whether the pixel is outside the special region and to be globally processed, or inside it and to be locally processed, or two parallel processing branches may be used, etc.
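The dual test of region evaluation unit 108 and the resulting choice between a global and a local mapping may be sketched as below. This is only an illustration under assumptions: the Region class, the function names, the inclusive rectangle bounds and the use of max(R,G,B) with a simple threshold are choices made here, and the two mapping functions stand in for whatever tone mapping LUTs a grader would specify.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

RGB = Tuple[float, float, float]


@dataclass
class Region:
    """Hypothetical container for the grader-specified test parameters."""
    x_s: int
    y_s: int
    x_e: int
    y_e: int
    threshold: float  # compared against max(R, G, B), assumed normalized


def in_special_region(x: int, y: int, rgb: RGB, region: Region) -> bool:
    """Dual test: the pixel must lie inside the rectangle (geometric
    condition) and have max(R, G, B) above the threshold (colorimetric
    condition)."""
    inside = region.x_s <= x <= region.x_e and region.y_s <= y <= region.y_e
    bright = max(rgb) > region.threshold
    return inside and bright


def process_pixel(
    x: int,
    y: int,
    rgb: RGB,
    region: Region,
    F_global: Callable[[float], float],
    F_local: Callable[[float], float],
) -> RGB:
    """Select the local or the global mapping, then apply a = F(M)/M."""
    F = F_local if in_special_region(x, y, rgb, region) else F_global
    m = max(rgb)
    if m <= 0.0:
        return (0.0, 0.0, 0.0)  # avoid division by zero for black pixels
    a = F(m) / m
    return (a * rgb[0], a * rgb[1], a * rgb[2])


# Example: bright pixels inside the rectangle (10,10)-(20,20) get a local curve.
lamp_region = Region(x_s=10, y_s=10, x_e=20, y_e=20, threshold=0.5)
local_out = process_pixel(15, 15, (0.8, 0.2, 0.1), lamp_region,
                          F_global=lambda m: m, F_local=lambda m: 1.0)
global_out = process_pixel(5, 15, (0.8, 0.2, 0.1), lamp_region,
                           F_global=lambda m: m, F_local=lambda m: 1.0)
```

Selecting a function per pixel corresponds to the LUT-switching variant mentioned above; the alternative with two parallel processing branches would compute both results and select between them afterwards.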
So, a technical limitation is that from an IC point of view (since also cheap apparatuses may need simple ICs or area parts of an IC, or software), the coding function tools should be few, and smartly chosen, to do what is most needed for the creation and encoding of various dynamic range look images of a scene. On the other hand, a problem is that with our above explained philosophy, where a human color grader specifies the re-grading (as encoded by e.g. an HDR image and functions to re-grade to a suitable LDR image) in a set of optimal parameters for the specific look of a given scene, the grader must also have the right grading/coding tools, and in the right order, so that he can conveniently work with them (not only does he need to obtain good precision of the desired color look, but he needs to do that with as few operations as possible, since time is also of the essence). This dual set of opposing constraints needs to be provided for in an elegant manner.
There are some teachings which prima facie may look similar, but which, being designed according to different rationales, are actually technically different. Nevertheless, for completeness we will briefly discuss and differentiate them.
E.g. U.S. Pat. No. 5,446,504 teaches a system being a camera with a better capturing dynamic range.
Similar to our system is that, even when a camera is able to capture a very large dynamic range (rather than clipping at full pixel well, and hence at code 255 being untextured/object-detail-lacking white in the MPEG), the result still needs to be shown on an LDR display (because nothing else existed in the 1990s).
But that doesn't mean one has a system which is so-designed to be also capable of recovering an original HDR look image, even if an LDR look image of the same scene was communicated, in case one has e.g. a 4000 nit peak brightness display available for rendering it in all its most beautiful and brightest colors.
The beam splitter of FIG. 1A allows one sensor to be sensitive and capture the darker part of a scene, and the other sensor (4b) to be less sensitive (i.e. the pixel wells fill up with photo-electrons much slower), and then the full range of all luminances present in the scene is added together in adder 6 (i.e. instead of having nicely captured colors up to luminance_threshold, and clipping above, now all the brighter pixel colors are still captured in a larger dynamic range).
This system however has only a logarithmic compression function (which arguably could be seen as similar to our log function, but, especially on its own, technically has a very different meaning in U.S. Pat. No. 5,446,504), and clearly there is no second customizable re-grading function whose non-linear shape one can bend however it is needed for a given HDR scene, and which can be read as having been specified previously according to the needs of the particular image(s). Also, no maximum calculation is involved prior to doing the color transformation.
D3=WO2014/025588 (Dolby Laboratories) uses a different philosophy for HDR coding than applicant. Applicant sends just one image of the (HDR, LDR) pair, and functions to calculate the other image at the receiving side. Dolby uses a 2-layer approach (i.e. two images are sent, although one may be a simpler image than the one which contains the object textures). They first calculate a “local lighting image”, and then an “object properties image”, which contains the object textures, as if better lit. At the reconstruction side they can then obtain the original HDR image by multiplying the object image (the reflectivities) with the local lighting image. Because multiplications and divisions become simpler additions and subtractions, they may like to work in some embodiments in a logarithmic domain (see FIG. 5, subtracting the tone mapped layer TMO RGB, which is their LDR image, which should look reasonable on an LDR 100 nit display). They can then embody the HDR-to-LDR_layer mapping as a curve in the logarithmic axes domain, which can be automatically determined based on the histogram. The philosophy of the components is very different, and in particular it is a rigid autoconversion framework which cannot be supplemented with a precisely optimized luminance mapping curve from an external place at a creation side, let alone one specified by a human color grader according to his specific artistic desires for changing the brightness of some specific object or its luminance range in a specific manner. Also, their local illumination-based philosophy would make it very weird to use Max(R,G,B) instead of Ytm (FIG. 2), which is why they do not teach it.