High dynamic range (HDR) scenes often occur in natural settings. Outdoor scenes that include direct sunlight and deep shadows inherently have a large dynamic range. HDR scenes also often occur in surveillance settings, e.g., scenes that include both indoor and outdoor illumination. Common display media (such as LCD flat panel displays, CRTs, and color prints) and the human eye have a limited dynamic range: roughly 100:1. In computer graphics, the dynamic range of a scene is expressed as the ratio of the highest scene luminance to the lowest scene luminance. Photographers tend to be most interested in the ratio of the highest and lowest luminance regions where detail is visible. This can be seen as a subjective measure of dynamic range. The “key” of a scene indicates whether it is subjectively light (“high-key”), normal (“mid-key”), or dark (“low-key”). Dynamic range can also be expressed as the difference between the highest and lowest scene zones, where the zones relate logarithmically to the scene luminances.
Displaying HDR scenes captured with wide dynamic range (WDR) digital still or video cameras, multiple images with different exposure times, different cameras, or advanced sensor readout techniques (e.g., the ExDRA™, an adaptively-binned sensor, disclosed in U.S. Pat. No. 8,786,732, incorporated herein by reference) requires a process that reduces the dynamic range, i.e., a tone mapping operator (TMO). Many TMOs have been developed to solve the problem of mapping WDR camera output into a lower dynamic range suitable for compression, transmission, and display. The Retinex [1,2] algorithm, for example, relies on pixel values near a pixel of interest to determine a local illumination; local illumination values are compressed globally, while pixel contrast relative to neighboring pixels with the same local illumination is preserved. The basic Retinex method produces some artifacts; most notably, haloes appear around very bright or dark localized regions. There are many other TMOs described in the literature. Reinhard et al. [3] describe a method called “Photographic Tone Reproduction” that was inspired by the “dodging and burning” techniques used by photographers to print HDR images from film. In these techniques, light is either withheld from or added to a portion of the print during development. Tumblin and Turk [4] use a diffusion algorithm to decompose an image into a set of simpler images. This low curvature image simplifier (LCIS) TMO compresses global dynamic range while preserving small-scale structure and low-level features. Fattal et al. [5] have developed a TMO that depends on finite difference estimates of the two-dimensional gradients in the logarithm of the image luminance. The magnitudes of the gradients are reduced for large gradients and preserved for small gradients. The Fattal, Lischinski, and Werman [5] TMO produces HDR images for display without significant halo artifacts. This TMO involves the approximate iterative solution of a differential equation. In addition to the methods mentioned here, there are many other noteworthy TMOs.
In spite of the progress in TMOs, no ideal solution exists for processing the output of WDR digital cameras. The system and method disclosed herein relate to a TMO that is somewhat similar to the Reinhard et al. TMO. The TMO described herein includes both global contrast reduction of an HDR image and contrast enhancement within local parts of that image. Each pixel serves as the center of a local region for determining the local level of illumination. The residual luminance, the difference between the image luminance and the local level of illumination, is preserved and enhanced while large global illumination differences are tone mapped to lower levels. Because of this combination of features, we call our operator a Tone Mapping Contrast Enhancement Operator (TMCEO).
A common method to globally tone map an HDR image is to apply a sigmoid-shaped (e.g., an inverse tangent) function to the luminance channel of the image. A simple operator that does this is:
$$L_d(x,y) = \frac{L(x,y)}{1 + L(x,y)}, \tag{1}$$
where $L_d(x,y)$ is the luminance used for display and $L(x,y)$ is the luminance channel of the HDR image, scaled such that a mid-level gray in the image is mapped to a value based on the image content. Reinhard et al. [3] scale the log-average luminance (the average of the logarithm of the image, expressed as a linear luminance; note that the Reinhard et al. paper has a formula for this, but it is incorrect) to 0.18 on a scale of zero to one for a scene that contains a “normal” mix of high and low luminance values. The global tone map described above over-compresses the contrast in high luminance regions, resulting in a loss of detail. The solution to this problem is to apply a local TMO. The TMO developed by Reinhard et al. uses an estimate of the local illumination given by a locally averaged luminance determined over an adaptive size scale. The locally averaged luminance is used to remap the image luminance to a smaller global range for display by an equation that is similar to a sigmoid function:
$$L_d(x,y) = \frac{L(x,y)}{1 + V_1(x,y,s_m(x,y))}, \tag{2}$$
where $V_1(x,y,s_m(x,y))$ is the locally averaged luminance. $V_1(x,y,s_m(x,y))$ is determined by the convolution of the image luminance and a normalized Gaussian with a location-dependent size parameter $s_m(x,y)$.
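As a minimal sketch of Equations 1 and 2, the following Python/NumPy fragment scales a luminance array so that its log-average maps to a key value and then applies the two mapping functions. The function names, the small offset `delta` that guards against the logarithm of zero, and the synthetic luminance values are illustrative assumptions and are not taken from Reinhard et al. [3]. The locally averaged luminance is passed in as an array because the selection of its size scale is a separate step.

```python
import numpy as np

def log_average_luminance(lum, delta=1e-6):
    """Log-average luminance: exponential of the mean of log(delta + L).

    `delta` is an assumed small offset that avoids log(0) for black pixels.
    """
    return float(np.exp(np.mean(np.log(delta + lum))))

def scale_to_key(lum, key=0.18, delta=1e-6):
    """Scale luminance so that its log-average is mapped to `key`."""
    return lum * (key / log_average_luminance(lum, delta))

def global_tone_map(lum):
    """Equation 1: L_d = L / (1 + L), applied to scaled luminance."""
    return lum / (1.0 + lum)

def local_tone_map(lum, local_avg):
    """Equation 2: L_d = L / (1 + V1), with V1 supplied by the caller."""
    return lum / (1.0 + local_avg)

# Hypothetical scene luminances spanning five decades.
world = np.array([[0.01, 0.1, 1.0],
                  [10.0, 100.0, 1000.0]])
scaled = scale_to_key(world)          # log-average now close to 0.18
display = global_tone_map(scaled)     # values lie strictly between 0 and 1
```

When the local average equals the pixel luminance itself, `local_tone_map(scaled, scaled)` reduces to the global operator of Equation 1, which is a convenient consistency check.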
The size parameter, $s_m(x,y)$, in the Reinhard et al. TMO is found essentially by a difference-of-Gaussians method. The difference-of-Gaussians function used in this TMO (referred to by Reinhard et al. as the “center surround function”) is:
$$V(x,y,s) = \frac{V_1(x,y,s) - V_2(x,y,s)}{2^{\phi} a / s^{2} + V_1(x,y,s)}, \tag{3}$$
where $V_1(x,y,s)$ is the convolution of the image luminance and a Gaussian kernel function centered at pixel location $(x,y)$ in the image, $s$ is the size scale of the Gaussian kernel (the radius to the two-sigma point), $\phi$ is a parameter to control the center surround function, $V(x,y,s)$, at small values of $V_1(x,y,s)$, and $a$ is the log-average luminance of the scaled image. The value of the parameter $a$ is referred to as the “key value” because it relates to the key of the image after scaling is applied. $V_2(x,y,s)$ is also a Gaussian weighted average, the convolution of the image luminance and a Gaussian kernel, just like $V_1(x,y,s)$, except that the size parameter used to calculate the kernel in $V_2(x,y,s)$ is scaled to the next larger Gaussian in a discrete set of “nested” Gaussians, spaced logarithmically in size. For the examples of their TMO-processed images presented in Reinhard et al. [3], they use eight size scales, and each successive value is 1.6 times the previous size scale. The local scale is found by finding the scale, $s_m(x,y)$, in this small set of possible values, that satisfies both:
$$|V(x,y,s_k(x,y))| < \varepsilon \;\; \forall k \text{ such that } k < m \quad \text{and} \quad |V(x,y,s_m(x,y))| \ge \varepsilon, \tag{4}$$
where $\varepsilon$ is a small parameter (e.g., 0.05). (It should be noted that this description has been modified from that presented in Reinhard et al. because there are some mistakes in that description.) Pixels within a circle of radius equal to roughly the scale of the Gaussian used for $V_1(x,y,s_m(x,y))$, i.e., $s_m(x,y)$, are within the region of approximately constant illumination. The next larger scale, the scale used in $V_2(x,y,s_m(x,y))$, i.e., $1.6 \times s_m(x,y)$, describes a larger circle that reaches regions with a higher or lower illumination; this causes the absolute difference of Gaussians to increase, and the second inequality in Equation 4 is satisfied. Because the kernels used for convolution and smoothing are Gaussians, their support extends to large distances from the center. The regions or zones of approximately uniform illumination have a size that depends also on the illumination differences between adjacent regions. But, in general, the Gaussian weighted average luminance (the local illumination function) is determined over a local region of a size that depends on the distance to the closest nearby region with a different illumination.
For an HDR image, regions with small local scales dominate near a large step in illumination. Large areas of uniform texture have large local scales. The nature of the local illumination function, a size-adaptive, locally-averaged luminance, and the TMO mapping function developed by Reinhard et al. greatly reduce the creation of halo artifacts.
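The scale selection of Equations 3 and 4 can be sketched in Python/NumPy as follows. This is a simplified illustration under stated assumptions, not the implementation of Reinhard et al. [3]: the Gaussian standard deviation is taken as s/2 (reading s as the radius to the two-sigma point), and the smallest scale `s0`, the value `phi=8.0`, the truncation of kernels at three standard deviations, the edge replication at image borders, and the fallback to the largest scale when the threshold is never reached are all choices made here for illustration. The function names are hypothetical.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable, normalized Gaussian blur (NumPy only, edge-replicated borders)."""
    radius = max(1, int(np.ceil(3.0 * sigma)))      # assumption: truncate at 3 sigma
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    kernel /= kernel.sum()                          # weights sum to one
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(np.convolve, 1, padded, kernel, mode="valid")
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode="valid")

def select_local_average(lum, key=0.18, phi=8.0, eps=0.05,
                         s0=1.0, ratio=1.6, n_scales=8):
    """Return (V1 at the selected scale, selected scale s_m) for every pixel."""
    # One extra scale is needed so that the last candidate still has a V2.
    scales = s0 * ratio ** np.arange(n_scales + 1)
    blurred = [gaussian_blur(lum, s / 2.0) for s in scales]
    v1_sel = blurred[n_scales - 1].copy()           # assumed fallback: largest scale
    s_sel = np.full(lum.shape, scales[n_scales - 1])
    undecided = np.ones(lum.shape, dtype=bool)
    for k in range(n_scales):
        v1, v2 = blurred[k], blurred[k + 1]
        v = (v1 - v2) / (2.0 ** phi * key / scales[k] ** 2 + v1)  # Equation 3
        hit = undecided & (np.abs(v) >= eps)        # Equation 4: first scale reaching eps
        v1_sel[hit] = v1[hit]
        s_sel[hit] = scales[k]
        undecided &= ~hit
    return v1_sel, s_sel

# Synthetic, already-scaled luminance: dark left half, bright right half.
lum = np.full((8, 64), 0.05)
lum[:, 32:] = 5.0
v1, s_m = select_local_average(lum)
display = lum / (1.0 + v1)                          # Equation 2
```

In this synthetic step image, a dark pixel a few columns from the illumination step is assigned a smaller scale than a dark pixel far from it, which matches the behavior described above. Note that Equation 2 does not by itself bound the output to one, so a practical pipeline would still clip or rescale `display`.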
The Reinhard et al. TMO works reasonably well for many HDR images, but does not work particularly well for the dim regions of high-key (mostly bright) images. The lowest values are mapped to values so close to zero that, in these kinds of images, a human cannot pick out much detail in the darker regions. Accordingly, the need remains for a TMO mapping function and a contrast enhancement step that can be usefully combined to increase the visible detail in dark parts of HDR images.
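The following arithmetic sketch illustrates this problem on a hypothetical high-key scene. The luminance values, the fraction of bright pixels, and the linear 8-bit quantization without gamma encoding are assumptions chosen only for illustration. Equation 1 is used because, within a uniform dark region, the local average of Equation 2 is approximately equal to the pixel luminance.

```python
import math

# Hypothetical high-key scene: 90% of pixels bright, 10% dark.
bright, dark = 1000.0, 1.0
frac_bright = 0.9
key = 0.18

# Log-average luminance of the two-level scene, then the scale factor
# that maps it to the key value.
log_avg = math.exp(frac_bright * math.log(bright)
                   + (1.0 - frac_bright) * math.log(dark))
scale = key / log_avg

def tone_map(lum_world):
    """Scale to the key value, then apply Equation 1."""
    lum = scale * lum_world
    return lum / (1.0 + lum)

def to_code(lum_display):
    """Assumed linear 8-bit quantization (no gamma encoding)."""
    return round(255 * lum_display)

dark_code = to_code(tone_map(dark))
dark_detail_code = to_code(tone_map(2.0 * dark))   # detail twice as bright
bright_code = to_code(tone_map(bright))
```

Under these assumptions, the dark region and detail at twice its luminance quantize to the same code value, so the detail is lost, while the bright region remains well within the display range.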