Saturation is a more difficult technical quantity than it would seem to laymen, who get only a small part of the story. Psychovisually, it is one of the meaningful properties of a color (of an object) that can be derived by a brain having an eye with three different cone types. Photometrically/colorimetrically, it is a measure of how close a color is to a monochromatic spectrum of the same hue, or in other words how far it is from white, and then it is typically called purity. There it relates to physical properties such as e.g. the amount of colorant in an outer layer of an object. Mathematically, for image color processing technology, it relates to some radial axis processing in some color space, by moving closer to or further away from some white. And this processing has to model how nature generates purities, and more specifically, how humans would perceive them. Furthermore, since colorfulness is dependent on luminance, the brain can derive three kinds of “saturation”, which are called colorfulness, chroma, and saturation respectively, but in this text for simplicity we will use the word saturation, as it will be clear to the skilled person what we mean.
Another source of “complexity” is that saturation processing (and definitions) can be done in all kinds of spaces (with cylindrical, conical, or biconical shape), some of which are inherently scaled to the gamut extent but have a deformed relationship compared to the linear nature of light, and others which may be simple but can lead to clipping, especially because of the non-linear tent shape of the gamut of 3 (or multi) primary additive displays.
Prior art has yielded a body of general saturation teachings, which were logically mostly concerned with image processing, usually the beautification of images. If one models human preferences for images, then one thing which stands out generally is that humans oftentimes prefer highly saturated images. Of course, desaturated pastel images may also be beautiful, and hence color technology needs to be able to make those colors too, and to do so given the particulars of a technology, which we will assume here (non-limiting, for simplicity of elucidation) to be that of additive displays.
There was also specific prior art in the area of camera capturing, and formatting the captured images in the limited gamut of a standard image format, like e.g. PAL. Note that all saturation-related prior art is going to have at least some similarities. One should note that because saturation works in a gamut along an outward direction from the (zero saturation) luminance axis, all equations will likely have some form of the type color component minus unsaturated luma component. However, differences between some coordinate system with e.g. two particular chromatic dimensions versus some three-dimensional specification of the mathematics, and in particular linear versus non-linear color representations and the handling of color transformations in those respective spaces, may lead to very different colorimetric behavior, i.e. need to be contemplated with care, and hence such systems are not trivially transformed into each other. It should also be understood that a mathematical transformation applied may be linear, e.g. a color matrixing, but if it is applied in a non-linear space, the final colorimetric behavior is still non-linear (because of the extra non-linearity of the transformation to that secondary color space, the physics of light and colorimetry ultimately being linear, as representable in e.g. some linear RGB representation which can be used to stimulate the primary conopsin reaction in cones).
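By way of a non-limiting illustration of the two points above, the following minimal sketch applies the same "component minus achromatic value" scaling once to linear RGB and once to a non-linear R′G′B′ representation. The Rec. 709 luminance weights, the pure 2.4 power law, and all function names are assumptions chosen only for this example; they are not taken from any of the cited documents.

```python
# Hypothetical sketch: Rec. 709 weights and a pure 2.4 power law are
# illustrative assumptions, not values taken from the text.
WEIGHTS = (0.2126, 0.7152, 0.0722)
GAMMA = 2.4

def achromatic(rgb):
    """Weighted sum of components: luminance if rgb is linear, luma if non-linear."""
    return sum(w * c for w, c in zip(WEIGHTS, rgb))

def saturate(rgb, s):
    """Scale each (component - achromatic value) difference by the factor s."""
    y = achromatic(rgb)
    return tuple(y + s * (c - y) for c in rgb)

def encode(rgb):
    """Linear RGB -> non-linear R'G'B' (assumed power law)."""
    return tuple(c ** (1.0 / GAMMA) for c in rgb)

def decode(rgb_prime):
    """Non-linear R'G'B' -> linear RGB (assumed power law)."""
    return tuple(c ** GAMMA for c in rgb_prime)

linear_in = (0.5, 0.2, 0.1)
s = 0.5  # a desaturation, so no component leaves [0, 1]

# Same linear formula, applied in two different spaces.
out_linear = saturate(linear_in, s)
out_nonlinear = decode(saturate(encode(linear_in), s))

lum_in = achromatic(linear_in)
lum_linear = achromatic(out_linear)
lum_nonlinear = achromatic(out_nonlinear)

print("input luminance:             ", lum_in)
print("after linear-space scaling:    ", lum_linear)
print("after non-linear-space scaling:", lum_nonlinear)
```

Under these assumptions the linear-space operation leaves the luminance unchanged, whereas the formally identical operation on R′G′B′ shifts it, which is one way of seeing why circuits that look alike on paper are not colorimetrically equivalent.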
EP0647069 deals with a system for correctly formatting camera output in regions of higher brightness, so that e.g. no or reduced facial discoloration happens. These cameras have a so-called soft-clipper or knee circuit (FIG. 1, 2) on the luma component, to diminish the highlights while still keeping a little bit of them. One should also mind that older type cameras based on scanning might have had a somewhat more non-linear overflow behavior, whereas CCD cameras would typically already have a full-well limitation; still, such knee behavior could help to keep a few highlights in the image, whilst not darkening the main part with the actors too much. To this kneeing system behavior, this patent then adds a saturation-affecting circuit part. Firstly, we'd like to note that these systems were not really intended to optimally handle HDR scenes, which might have higher brightness regions which are 10 or even 50 times brighter than the main brightness region on which the camera exposed. One kept a soft knee but clipped most of the really brighter regions to white anyway with that circuit. Secondly, it is important to understand that such cameras use non-linear R′G′B′ color components, and a non-linear luminance component Y′, which we will call for clarity luma (because that is what the Y of the YUV of PAL is). This means that the colorimetric behavior, whether in the upper parts of the encodable color gamut or anywhere else, will be different, and circuits, even if somewhat similar, will not be equatable. Lastly and most importantly, this system just gets whatever K-factor comes out of component 7 (which is certainly not solely a function of V-Y), and definitely has no means to set a specification as desired by a grader, i.e. any particular saturation-modification behavior, as the specific nature of the current e.g. HDR shot may desire.
DE19812526 is another color saturation modification circuit which of course works on color difference components (again non-linear ones, because that is what the standard nomenclature PR and PB stands for). In addition to the above differences, such as not allowing for a human-specifiable saturation gain, this patent works only on two color differences and is even further removed from any usable teaching towards our below described technologies. It was well-known that such “UV”-type saturation circuits existed previously, because it was a very cheap and convenient manner to do some saturation processing in a television which already got YUV video as input, i.e. before conversion to the required RGB for controlling the display.
EP1333684 is merely a specific saturation processing method for a display, which guarantees that the largest possible saturation boost doesn't clip. It was well-known that boosting a color component may decrease another color component below 0, or create a luminance above Ymax, or create any undesirable color which is (or rather should be, because the electronics will clip it to an incorrect value) outside the RGB gamut of all possible definable colors, i.e. cannot actually be rendered on a television. The result of the clipping to a color which is displayable, but wrong, will have effects on the luminance or even the hue of that color, i.e. show it as a different hue from what it is supposed to be. That can be an annoying error, especially compared to a (small) desaturation. These kinds of systems are typically meant for small errors, e.g. due to noise or filter behavior [0007], and are not concerned with the potentially large e.g. desaturations needed for dynamic range conversion. This leads to several differences, e.g. that they monitor a maximum Ymax, which our below methods don't do, and in principle we can with the below methods design saturation strategies which do go outside gamut (i.e. clip) for a considerable amount of image colors, all depending on what the human color grader judges suitable for the image being processed. In addition to all that, of course the same differences as with the first mentioned patent also exist, in particular that there is no means taught or even vaguely inspired for allowing a human to determine a specific (V-Y)-based saturation strategy, let alone for optimal coding of two image looks of considerably different dynamic range on a range of display peak brightnesses which should be supportable with an optimally graded image of a HDR scene.
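The clipping effect described above can be made concrete with a small, purely illustrative sketch. The saturation formula, the Rec. 709 weights, the chosen color, and the boost factor are all assumptions made for this example only; hue is measured with the HSV hue of Python's standard `colorsys` module, which is merely a convenient proxy and not a perceptual hue.

```python
import colorsys

# Hypothetical sketch: weights, test color and boost factor are assumptions.
WEIGHTS = (0.2126, 0.7152, 0.0722)

def weighted_sum(rgb):
    """Weighted sum of the three components (luminance for linear RGB)."""
    return sum(w * c for w, c in zip(WEIGHTS, rgb))

def saturate(rgb, s):
    """Scale each (component - weighted sum) difference by s."""
    y = weighted_sum(rgb)
    return tuple(y + s * (c - y) for c in rgb)

def in_gamut(rgb):
    """True if all components lie in the displayable range [0, 1]."""
    return all(0.0 <= c <= 1.0 for c in rgb)

def clip(rgb):
    """Per-component hard clipping, as simple electronics would do."""
    return tuple(min(1.0, max(0.0, c)) for c in rgb)

original = (0.9, 0.4, 0.1)
boosted = saturate(original, 2.0)   # one component > 1, another < 0
clipped = clip(boosted)

hue_original = colorsys.rgb_to_hsv(*original)[0]
hue_clipped = colorsys.rgb_to_hsv(*clipped)[0]

print("boosted (unclipped):", boosted, "in gamut:", in_gamut(boosted))
print("clipped:            ", clipped)
print("hue before / after clipping:", hue_original, hue_clipped)
print("weighted sum before / after:", weighted_sum(original), weighted_sum(clipped))
```

In this example the boosted color has a red component above 1 and a blue component below 0; after per-component clipping both the HSV hue and the weighted sum differ from those of the original color, which is the kind of displayable-but-wrong result the paragraph above refers to.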
EP0677972 is about handling colorimetric behavior above a particular set reference value s (again in a non-linear color space, NTSC, which was the television color space in Japan in the 90s). As can be seen in FIG. 3C, although the circuit comprises a saturation part, it is not about regulating a saturation behavior, if not everywhere, at least in a substantial part of the color gamut, but only in the highest tip of the RGB gamut. Of course, having a different design rationale etc., a very different technical system and teaching comes out. E.g. in one circuit embodiment there is a maximum detector, but of R′G′B′ rather than R-Y, G-Y, B-Y, which gives very different numerical values, and even globally very different colorimetric behavior for different reasons.
U.S. Pat. No. 7,743,114 is yet another particular manner, irrelevant for our purposes, to create a saturation strategy which doesn't allow out-of-gamut problems which give severe colorimetric errors. As shown in FIG. 3, it deals with the fact that if one applies saturation processing mathematics in YCrCb (which is the non-linear digital equivalent of YUV), one may get colors which, although still existing, i.e. representable, in a [0,255] YCrCb cube, again correspond to unrepresentable RGB (one can verify that even slightly rotating a cube of the same dimensions would already lead to regions codable in one cube which fall outside the rotated one). So one must be careful not to saturate too much. It happens that one can guarantee the use of an appropriate maxsat by checking two conditions which correspond to lines, which are mathematically however very different from what we describe below, and again miss features like specifiability of any desired g(V-Y) function, etc. Or stated in yet another manner, none of the colorimetric behavior as we have in e.g. FIG. 5 is realizable with any of the above circuits, even in case of the very simplest imaginable g(V-Y) specifications.
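The observation that a code inside a [0,255] YCrCb cube need not correspond to a representable RGB color can be checked numerically. The sketch below is only an illustration: it assumes the full-range BT.601 (JFIF-style) conversion coefficients, which are not necessarily those used in the cited patent, and the function names and the coarse sampling grid are hypothetical.

```python
# Hypothetical sketch: full-range BT.601 / JFIF coefficients are an assumption.
def ycbcr_to_rgb(y, cb, cr):
    """Convert a full-range YCbCr triplet to (unclipped) R'G'B' values."""
    r = y + 1.402 * (cr - 128)
    g = y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128)
    b = y + 1.772 * (cb - 128)
    return r, g, b

def rgb_representable(rgb):
    """True if every component fits in the [0, 255] RGB cube."""
    return all(0.0 <= c <= 255.0 for c in rgb)

# A code that is perfectly valid in the YCbCr cube but not in the RGB cube.
example = ycbcr_to_rgb(128, 128, 255)
print("RGB for (Y=128, Cb=128, Cr=255):", example,
      "representable:", rgb_representable(example))

# Coarse scan of the YCbCr cube to see how many codes map to valid RGB.
grid = range(0, 256, 17)
total = 0
valid = 0
for y in grid:
    for cb in grid:
        for cr in grid:
            total += 1
            if rgb_representable(ycbcr_to_rgb(y, cb, cr)):
                valid += 1

fraction = valid / total
print("fraction of sampled YCbCr codes with representable RGB:", fraction)
```

Under these assumed coefficients only a minority of the sampled YCbCr codes land inside the RGB cube, so a saturation gain applied to the chroma components can easily leave the displayable gamut while every coded value still looks legal.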
The below invention embodiments were worked out in particular in a framework of newly emerging high dynamic range image handling (processing and in particular coding for sending over wired or wireless video communication networks to receivers in other long distance or short distance locations, such as home computers, consumer TVs, settopboxes, but also e.g. professional movie theatre video receiving apparatuses etc.).
U.S. Pat. No. 8,218,625 describes a method to communicate both an LDR image and a HDR image to a receiver. However, the method only seems to disclose the requirement for tone, i.e. luminance remapping, and not what should be done with the colors, let alone a need for a specific way of handling color saturation as in the present application. Furthermore the method of '625 would seem to communicate the HDR image as an LDR image plus an image of ratios (L_HDR/L_LDR), i.e. seems to be at least prima facie an incompatible approach with the one we envisage as a useful HDR encoding application for our new saturation processing, i.e. it is not prima facie clear how it would relate to or lead to any of our below embodiments.
The requirements for this field of HDR video technology are different than what off-the-shelf saturation knowledge can cater for. In particular, we have developed a HDR video encoding framework which can encode a number of images (for one time instant of presentation) intended for displays of various dynamic range capabilities (in particular peak brightnesses of say 5000 nit, 1200 nit, and 200 nit), which bundle of images encoding needs to actually transmit only one image (per time instant) for a reference dynamic range of e.g. 1500 or 1000 or 100 nit, and as metadata a number of functions which are used at the receiver side e.g. in the settopbox to calculate at least one second image, e.g. for a connected display of 700 nit (see WO2011/107905, WO2014/041471, WO2014/056679). E.g., consider we transmit or store for later use an image optimally color graded by its creator for presentation on a 5000 nit High Dynamic Range (HDR) reference display, which image is supplemented with metadata color mapping functions for deriving a 100 nit image for rendering on a Low Dynamic Range (LDR) display. The receiving e.g. television of 100 nit peak brightness will apply those color mapping functions to the 5000 nit HDR image, to get its appropriate 100 nit LDR image. That LDR image, or rather the metadata functions functionally encoding it, was typically also color graded by the creator as an image which looked reasonable on LDR displays (e.g. a close approximation of the HDR image rendering on a HDR display, given of course the limitations of an LDR display). We have elaborated a number of luminance mapping functions for mapping the luminances of pixels in the two images of different dynamic range, since primarily dynamic range mapping is an optimization giving the original luminances of objects in e.g. a HDR image corresponding luminances in the range of possible LDR luminances (the other way around, mapping LDR to HDR, is of course similarly possible, with different mapping functions).
Dynamic range mapping is a general color mapping problem, not just involving transformation of the luminance color component. E.g., as said above, the colorfulness of an object depends on its luminance, so if we need to map, because of gamut/dynamic range technical limitations or artistic choices, the luminances of an object in the first image to darker luminances in the second image, then the creator may want to combine this with a boost of the saturation of that object. Also, because the tent-shaped gamut of a (e.g.) 3-primary additive display is rather complex, and pointed as a function of luminance, it may be useful if color control is also possible for that optimization.
However, especially when the metadata needs to be co-encoded and transmitted, there are technical limitations leading to solutions which would be more or less practical. So if any color processing methods or apparatuses have to be consistent with a potential remotely communicated specification of color processing, they have to conform to those limitations. Even more so because some of the video communication modes in the various different technologies needing HDR video coding (e.g. HDMI) may involve a limited bandwidth or number of data words per image to transmit any metadata. So one must smartly choose which functions are transmitted, since in this framework they determine which images can be encoded, but also determine how reasonably or unreasonably complex the receiving integrated circuits should always be, since they have to do the color mapping. Moreover, the functions also form a toolbox of functions allowing a grader to encode new dynamic range looks for his image content, so those tools should not be too few, nor too many, not be too complex, and in particular have a good major order impact on the color transformation of the image, in particular on those aspects which are mostly needed for dynamic range transforms.
It was a problem that neither such adapted color processing tools, nor knowledge sufficiently inspiring how to develop them, was generally available, so we had to develop them. In WO2014128586 we introduced a saturation processing definition as a function of a brightness variable like luminance, but we desired another useful saturation adaptation definition, and we will describe embodiments below.