Displays are used in a very wide range of applications, including entertainment (e.g., television, e-books), advertisement (e.g., shopping malls, airports, billboards), information (e.g., automotive, avionics, system monitoring, security), and cross-applications (e.g., computers, smart phones)—there are literally hundreds of specific applications. As such, displays are generally subject to a wide range of viewing environments, and in many applications the viewing environment of the display is not a constant. Therefore, it stands to reason that if the viewing environment can change then the visual characteristics of the display might also warrant change in order to maintain optimal performance and fidelity. The primary visual characteristics of a display are brightness (often called contrast or picture), black level (confusingly called brightness), saturation (color intensity), hue (sometimes called tint), and sharpness. All five of these visual properties can be endowed with automatic adaptation to changing environmental viewing conditions.
A very high-level diagram of the image capture and reproduction process is shown in FIG. 1. Images typically originate from either real-world scenes captured by video/still cameras or from computer-generated scenery. The lofty goal of most reproduction systems is to display the most life-like image that is possible to the final human observer. There are very many impediments to doing this perfectly; in fact, some “enhancements” are often purposely added to the displayed image to improve the viewing experience. One of the major impediments to high-fidelity reproduction is that the local viewing environment of the final observer cannot be definitively predicted, yet the viewing environment can have a profound impact on the visual quality of the reproduction. Also, the viewing environment can change almost continuously except in a few special cases such as the tightly controlled environment of a theater.
A subtle but very germane aspect of FIG. 1 is that the total light that is reflected from a physical object is essentially the linear summation of the reflected light from all light sources that impinge upon the object. In addition, an object may also emit its own light, and this light is also linearly added to the reflected contributions from the object in order to arrive at the total observed light. This is basically a statement that incoherent light behaves linearly in this regard (e.g., 1+1=2). As a result the absolute brightness or luminance of any point in a scene is proportional to all constituent components of light that are traceable to that point. This is the reality that is presented to the human observer of a real scene, and is also the manner in which computer-generated scenery is typically created. Therefore, in theory a display device should also adhere to the principle of luminance linearity for the purest form of reproduction. Or more generally, the entire end-to-end chain of processes, from the light that enters a camera to the light that exits the display, should adhere to the principle of luminance linearity. This principle will be relevant to various aspects of the subject invention.
As mentioned above, the goal of a display should be to reproduce a life-like replica of the original scene. But there are several inherent and unavoidable limitations. One such limitation is the difficulty for a display to match the dynamic range of luminance that exists in the real world, especially at the upper end of the scale (e.g., the sun and reflections thereof). Another limitation is that a display is a predominately “flat” version of the original scene; hence true three-dimensional (3D) depth reproduction is not possible, although various “3D” technologies exist to produce the illusion of depth, at least from one or more specific perspectives. Also, common displays cannot begin to simulate the nearly hemispherical field-of-view of the human eye, although special venues such as IMAX® theaters attempt to overcome this. Finally, the display itself is a physical object that exists in some environment, and the environment itself can have a very significant impact on the visual quality of the reproduction.
In a traditional color display each pixel is typically comprised of 3 sub-pixels, one for each of the primary colors, typically red, green, and blue. While there are displays that may use 4 or more sub-pixels, the embodiments herein do not depend on the precise number of sub-pixels or colors that they represent. The information content of a displayed image is the result of uniquely commanding, or driving, each sub-pixel, with the specifics of the driving process being technology-dependent (e.g., CRT, plasma, LCD, OLED, etc.). The drive level of each sub-pixel can range from full off to full on; this is the fundamental process by which images are formed by a display. The total range of displayable colors (i.e., the color gamut) is obtained by varying the relative drive levels of the sub-pixels through their entire range of combinations. Non-primary colors are produced when the human eye integrates the 3 sub-pixels to produce an effective blended color via the controlled mixing of the primary colors. In the digital domain if the sub-pixel drive levels are defined with 8 digital bits then there can be a total of 2^8=256 distinct drive levels per sub-pixel. A gray level is a special case where all sub-pixels are being driven at the same level (as defined by VESA FPDM 2.0). This will generally produce a ‘gray-like’ color ranging from full off (lowest brightness, appearing predominately black) to full on (highest brightness, appearing predominately white). Continuing with 8 bits per sub-pixel (often called 24-bit color: 3 sub-pixels×8 bits=24) there are 2^24=16,777,216 possible colors, but only 256 unique gray levels by the strict definition that gray levels are produced when all sub-pixels are identically driven. For simplicity we shall speak of gray levels on a sub-pixel basis (i.e., 256 gray levels for 8 bits of control) with the implicit understanding that neighboring sub-pixels are not necessarily driven to the same level as required for the generation of color images. This is because the invention stands independent of color reproduction, but is completely compatible with color reproduction.
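The counting argument above can be checked with a short sketch. The function names are illustrative only and are not part of the described embodiments.

```python
def drive_levels(bits_per_subpixel: int) -> int:
    """Number of distinct drive (gray) levels available to one sub-pixel."""
    return 2 ** bits_per_subpixel


def displayable_colors(bits_per_subpixel: int, subpixels: int = 3) -> int:
    """Number of distinct colors when every sub-pixel is driven independently."""
    return drive_levels(bits_per_subpixel) ** subpixels


def strict_gray_levels(bits_per_subpixel: int) -> int:
    """Gray levels in the strict sense: all sub-pixels driven identically."""
    return drive_levels(bits_per_subpixel)


if __name__ == "__main__":
    print(drive_levels(8))         # 256
    print(displayable_colors(8))   # 16777216
    print(strict_gray_levels(8))   # 256
```

The same functions cover displays with 4 or more sub-pixels by changing the `subpixels` argument.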
Gamma (symbolized by γ) refers to the mathematical exponent in a power function S^γ that transforms the scaling of gray levels (on a sub-pixel basis) in an image. Although the roots of gamma processing trace back to the earliest days of vacuum-tube cameras and CRT displays, it is still a very relevant process in modern displays for improving the perceived resolution in the darker regions of an image, where human vision is more sensitive to absolute changes in brightness.
The conceptually simplest image reproduction stream is illustrated in FIG. 2. Light from a real-world scene (Li) is captured by the camera and falls onto a detector (commonly a solid-state pixelated detector using CCD or CMOS technology) that performs an optical-to-electrical (O/E) conversion to generate the initial source image signal Ss. This image signal is typically a voltage signal that is approximately proportional to the amount of light falling on each pixelated detector element, but Ss may be immediately converted into a digital signal. Alternatively, the source image signal Ss may originate from computer-generated graphics that are typically developed in the linear domain in much the same way as light behaves in the real world. In either case, signal encoding occurs in the function block labeled ƒe, which typically (though not necessarily) takes the form of a power function: ƒe=(S)^α. Historically the α exponent is referred to as a gamma-correction exponent, but herein it will be referred to more generally as a signal encoding exponent. The resulting encoded signal Se (=(Ss)^α) then enters the display and is decoded by the function block labeled ƒd, which typically (though not necessarily) takes the form of another power function: ƒd=(S)^γ. By substitution the resulting decoded signal Sd (=(Se)^γ) that drives the display is related to the initial source image signal Ss via Sd=(Ss)^(αγ). It is noted that in practice there are variants to the relatively simple transformations described above, but the general process of encoding and decoding image signals is the same.
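The encode/decode chain can be sketched as follows, assuming pure power functions for ƒe and ƒd and normalized signals in the range 0 to 1. The function names and the numeric values of α, γ, and Ss are illustrative assumptions, not values prescribed by the text.

```python
def encode(s_s: float, alpha: float) -> float:
    """Signal encoding block f_e: Se = Ss ** alpha (normalized signals)."""
    return s_s ** alpha


def decode(s_e: float, gamma: float) -> float:
    """Signal decoding block f_d: Sd = Se ** gamma (normalized signals)."""
    return s_e ** gamma


def end_to_end(s_s: float, alpha: float, gamma: float) -> float:
    """Closed form obtained by substitution: Sd = Ss ** (alpha * gamma)."""
    return s_s ** (alpha * gamma)


if __name__ == "__main__":
    alpha, gamma = 0.5, 2.4   # illustrative exponents only
    s_s = 0.25                # illustrative normalized source level
    s_d = decode(encode(s_s, alpha), gamma)
    # The two-step chain and the closed form agree to rounding error.
    print(abs(s_d - end_to_end(s_s, alpha, gamma)) < 1e-12)
```

Keeping the two blocks as separate functions mirrors FIG. 2, where encoding happens upstream of the display and decoding happens inside it.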
Referring still to FIG. 2, the decoded image signal Sd is then used to drive the components in the display that convert the electrical image data into light that is emitted by the display (Lo) via an electrical-to-optical (E/O) conversion process. The details of the E/O process are unique to the display technology; e.g., LCD, plasma, OLED, etc. In fact, for the virtually obsolete CRT technology the decoding function ƒd was an integral part of the E/O conversion process.
It is noted in the above discussions that the signals ‘S’ represent normalized values typically ranging from 0 to 1. For the case of voltage signals, the actual signals would be normalized by VMAX such that S=Vactual/VMAX. For the case of digital signals, the signals would be normalized by DMAX such that S=Dactual/DMAX (e.g., for an 8-bit channel the largest code value is DMAX=2^8−1=255, so that full on corresponds to S=1). The signal normalization process generally requires processing steps that are not explicitly shown in FIG. 2, but are implied herein. As long as normalized signals are consistently used it does not matter whether the signals represent voltage levels or bit levels.
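The normalization step can be sketched as follows for digital signals. This is a minimal illustration that assumes DMAX is taken as the largest code value (255 for 8 bits), so that full on maps exactly to S=1; the function names are hypothetical.

```python
def normalize(d_actual: int, d_max: int = 255) -> float:
    """Map a digital code to a normalized signal S = D_actual / D_MAX."""
    if not 0 <= d_actual <= d_max:
        raise ValueError("code value outside the range 0..d_max")
    return d_actual / d_max


def apply_power(d_actual: int, exponent: float, d_max: int = 255) -> int:
    """Normalize, apply a power function, and return to an integer code."""
    s = normalize(d_actual, d_max)
    return round((s ** exponent) * d_max)


if __name__ == "__main__":
    print(normalize(0), normalize(255))   # 0.0 1.0
    # Full off and full on are fixed points of any power function.
    print(apply_power(0, 2.4), apply_power(255, 2.4))   # 0 255
```

The same two functions work for other bit depths by passing a different `d_max`, for example 1023 for a 10-bit channel.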
As a specific example of an end-to-end image processing stream, ITU-R BT.709-5 (2002) recommends encoding a television signal with an α value of ≈0.5 (Note: this is a slight simplification of BT.709), while ITU-R BT.1886 (2011) recommends decoding a television signal with a γ value of 2.4, leading to an end-to-end power (ε) of 1.2: Sd=Se^2.4=(Ss^0.5)^2.4=Ss^(0.5×2.4)=Ss^1.2. The signal transformations that occur in the above ITU-defined processes are illustrated in FIG. 3, where the parameters for the horizontal ‘input’ axis and vertical ‘output’ axis depend on the relevant processing step. For example, during the signal encoding operation the horizontal axis would represent Ss (as the input signal) while the vertical axis would represent Se (as the output signal). The implied normalization of signals is evident in FIG. 3 since all signal levels reside between the values of 0 and 1.
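To give a sense of how slight the simplification noted above is, the following sketch compares a pure power function with α=0.5 against the piecewise BT.709 transfer characteristic. The piecewise constants (4.5, 0.018, 1.099, 0.45, 0.099) are quoted from the BT.709 transfer characteristic as an assumption of this example and should be checked against the Recommendation itself; the function names are illustrative.

```python
def bt709_oetf(l: float) -> float:
    """Piecewise BT.709 encoding of a normalized linear level l in 0..1."""
    if l < 0.018:
        return 4.5 * l                      # linear segment near black
    return 1.099 * l ** 0.45 - 0.099        # power segment elsewhere


def power_approx(l: float, alpha: float = 0.5) -> float:
    """Simplified encoding used in the text: Se = Ss ** alpha."""
    return l ** alpha


# Largest absolute difference between the two curves on a 0.001 grid.
samples = [i / 1000 for i in range(1001)]
max_gap = max(abs(bt709_oetf(l) - power_approx(l)) for l in samples)

if __name__ == "__main__":
    print(f"largest gap on the sampled grid: {max_gap:.3f}")
```

On this grid the two curves differ most near black, where the piecewise form is linear, and nearly coincide over the upper part of the range.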
It is noted in FIG. 3 that the ITU-R BT.709/1886 signal transformation processes do not strictly adhere to the aforementioned principle of end-to-end luminance linearity since the reproduction system has an end-to-end power law exponent of ε=1.2, rather than ε=1.0 as would be the case for pure linearity. This produces the slight curve of the black line in FIG. 3. The primary reason for deviating from pure linearity is that a camera/display cannot practically reproduce the full dynamic range of light that exists in the real world. As a result, the end-to-end exponent (or power) of ε=1.2 is generally considered to produce a better perceptual experience for in-home television viewing with an average background illumination of ≈200 lux. This is based on reproducing a more life-like image contrast when the human eye is adapted to the typical in-home environment.
However, it is common for movie producers to deviate from ITU-R BT.709 encoding in order to target much darker viewing environments such as theaters with a background illumination of ≈1-10 lux and/or to create artistically-flavored video content. A typical encoding exponent for this application is approximately α=0.60. If this signal is subsequently decoded with a power exponent γ=2.4 then the end-to-end linearity power is ε≈1.45.
Another popular image encoding scheme is the sRGB standard that is intended for image rendering in moderately bright environments such as work offices with a background illumination of ≈350 lux. sRGB calls for a signal encoding exponent approximating α=0.45. If such an sRGB-encoded signal is subsequently decoded with a power exponent γ=2.4 then the end-to-end linearity power is ε≈1.1.
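The three cases discussed above can be tabulated with a short sketch. It assumes a decoding exponent of γ=2.4 in every case, as in the text, and uses the approximate encoding exponents quoted there; the dictionary labels are illustrative.

```python
GAMMA = 2.4  # decoding exponent assumed for all three cases

# Approximate encoding exponents quoted in the text, keyed by viewing case.
ENCODING_EXPONENTS = {
    "in-home television (~200 lux)": 0.50,
    "theater (~1-10 lux)": 0.60,
    "office, sRGB (~350 lux)": 0.45,
}


def end_to_end_exponent(alpha: float, gamma: float = GAMMA) -> float:
    """End-to-end power is the product of encoding and decoding exponents."""
    return alpha * gamma


if __name__ == "__main__":
    for case, alpha in ENCODING_EXPONENTS.items():
        print(f"{case}: alpha={alpha:.2f} -> epsilon={end_to_end_exponent(alpha):.2f}")
```

The products come out as 1.20, 1.44, and 1.08, which the text rounds to ε=1.2, ε≈1.45, and ε≈1.1 respectively.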
The three different viewing environments discussed above and their suggested end-to-end linearity power exponents can be curve-fitted and used to extrapolate to higher levels of ambient illumination. The trend is given by Eq(1), which is plotted in FIG. 4. Hence, once the ambient illumination level (Ia) has been measured then the desired end-to-end linearity power exponent (ε) can be determined from Eq(1). This relationship between Ia and ε will be germane to certain aspects of the invention as described in following sections. The relationship given by Eq(1) is merely representative, and the invention is not dependent on the exact form of Eq(1). In general, the invention may implement any arbitrary relationship between Ia and ε.
ε ≅ 1 + 0.48·e^(−0.0045·Ia)  Eq(1)
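Reading Eq(1) as ε ≅ 1 + 0.48·exp(−0.0045·Ia), with Ia in lux, the relationship can be sketched as a single function. The function name and the sample illumination levels are illustrative; as stated above, the fitted form itself is merely representative.

```python
import math


def desired_exponent(ambient_lux: float) -> float:
    """Representative fit of Eq(1): end-to-end exponent versus ambient lux."""
    if ambient_lux < 0:
        raise ValueError("ambient illumination cannot be negative")
    return 1.0 + 0.48 * math.exp(-0.0045 * ambient_lux)


if __name__ == "__main__":
    # Sample levels: theater, in-home, office, and bright surroundings.
    for lux in (5, 200, 350, 1000):
        print(f"Ia = {lux:>4} lux -> epsilon = {desired_exponent(lux):.3f}")
```

At roughly 200 lux and 350 lux the fit returns values close to the 1.2 and 1.1 exponents discussed for in-home and office viewing, and it decays toward 1 as the illumination rises.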
It is noted in FIG. 4 that as the ambient illumination increases, the desired end-to-end power asymptotically approaches pure linearity; i.e., ε→1. Above 1000 lux the power ε is essentially equal to 1. This is basically a statement that as the eye becomes adapted to full daylight conditions, the display should start to adhere to the principle of pure end-to-end luminance linearity as was previously discussed. However, few if any displays actually implement a power of ε≈1.
Alternatively, the function described by Eq(1) can be implemented in a discrete fashion, as illustrated in FIG. 5. The number of discretized levels illustrated in FIG. 5 is representative; the invention may implement an arbitrary number of discrete levels.
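One way to sketch such a discrete implementation is a lookup over illumination bands. The band edges below are hypothetical and are not taken from FIG. 5; each band is assigned the representative Eq(1) value at its midpoint, and the top band is assigned ε=1 as an assumption of this example.

```python
import bisect
import math


def eq1(ambient_lux: float) -> float:
    """Representative continuous fit: 1 + 0.48 * exp(-0.0045 * Ia)."""
    return 1.0 + 0.48 * math.exp(-0.0045 * ambient_lux)


# Hypothetical upper edges (lux) of the illumination bands.
BAND_EDGES = [50.0, 150.0, 300.0, 500.0, 1000.0]


def build_levels(edges: list[float]) -> list[float]:
    """One exponent per band (evaluated at the band midpoint), plus a top band."""
    lows = [0.0] + edges[:-1]
    levels = [eq1((lo + hi) / 2) for lo, hi in zip(lows, edges)]
    levels.append(1.0)  # at or above the last edge: treat as purely linear
    return levels


LEVELS = build_levels(BAND_EDGES)


def discrete_exponent(ambient_lux: float) -> float:
    """Step-wise exponent: pick the level of the band containing ambient_lux."""
    return LEVELS[bisect.bisect_right(BAND_EDGES, ambient_lux)]


if __name__ == "__main__":
    for lux in (10, 100, 200, 400, 800, 5000):
        print(f"Ia = {lux:>4} lux -> epsilon = {discrete_exponent(lux):.3f}")
```

Adding or removing entries in `BAND_EDGES` changes the number of discrete levels without touching the lookup logic.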