The present invention relates to apparatus and method for capturing an image of a scene, and, more particularly, to apparatus and method for capturing a high dynamic range image using a low dynamic range image sensor.
Virtually any real world scene produces a very large range of brightness values. In contrast, known image sensing devices have very limited dynamic ranges. For example, it is typical for a video sensor to produce 8 bits or fewer of grey-level or color information. In the case of grey-scale images, 8 bits provide only 256 discrete grey levels, which is not sufficient to capture the fine details of most real life scenes.
A known solution to the problem of capturing high dynamic range images with a low dynamic range image sensor is to take multiple image measurements for each local scene area while varying the exposure to light from the scene. Such exposure variation is typically accomplished by sequentially taking multiple images of the scene with different exposures and then combining the multiple images into a single high dynamic range image. Temporal exposure variation techniques for enlarging the dynamic range in imaging a scene may be found, for example, in: U.S. Pat. No. 5,420,635 to M. Konishi et al., issued May 30, 1995; U.S. Pat. No. 5,455,621 to A. Morimura, issued Oct. 3, 1995; U.S. Pat. No. 5,801,773 to E. Ikeda, issued Sep. 1, 1998; U.S. Pat. No. 5,638,118 to K. Takahashi et al., issued Jun. 10, 1997; U.S. Pat. No. 5,309,243 to Y. T. Tsai, issued May 3, 1994; Mann and Picard, “Being ‘Undigital’ with Digital Cameras: Extending Dynamic Range by Combining Differently Exposed Pictures,” Proceedings of IS&T's 48th Annual Conference, pp. 422-428, May 1995; Debevec and Malik, “Recording High Dynamic Range Radiance Maps from Photographs,” Proceedings of ACM SIGGRAPH, pp. 369-378, August 1997; and T. Mitsunaga and S. Nayar, “Radiometric Self Calibration,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR 99), June 1999. However, techniques that require acquiring multiple images while temporally changing the exposure have the fundamental problem that variations in the scene may take place between exposure changes. In other words, these techniques are useful only for static scenes where the scene radiance values stay constant. Moreover, between exposure changes the position and orientation of the imaging device and its components must remain constant. Finally, because more time is needed to sequentially capture all the required images, the temporal exposure variation techniques are not suitable for real time applications.
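By way of illustration only, the combining of differently exposed images may be sketched as follows. The sketch is not taken from any of the cited references. It assumes a linear sensor response, relative exposures that are known in advance, and images given as flat lists of 8-bit values; the function name `merge_exposures` and the thresholds `low` and `high` are hypothetical.

```python
def merge_exposures(images, exposures, low=5, high=250):
    """Estimate relative scene radiance from differently exposed 8-bit images.

    images    -- list of equally sized flat lists of 8-bit pixel values
    exposures -- relative exposure of each image (assumed known)
    low, high -- hypothetical thresholds for rejecting dark or saturated values
    """
    merged = []
    for values in zip(*images):
        pairs = list(zip(values, exposures))
        # Keep only measurements that are neither too dark nor saturated.
        usable = [(v, e) for v, e in pairs if low <= v <= high]
        if not usable:
            # Every measurement is clipped; fall back to using all of them.
            usable = pairs
        # Under a linear response, value / exposure estimates the radiance.
        merged.append(sum(v / e for v, e in usable) / len(usable))
    return merged


short_exposure = [20, 50, 200]   # relative exposure 1.0
long_exposure = [80, 200, 255]   # relative exposure 4.0, last pixel saturated
radiance = merge_exposures([short_exposure, long_exposure], [1.0, 4.0])
```

In this example the saturated value in the long exposure is ignored, so the third radiance estimate comes from the short exposure alone. The sketch presumes that the scene and the imaging device do not change between the two exposures, which is the limitation noted above.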
Another known solution to the problem of capturing high dynamic range images with a low dynamic range image sensor is to simultaneously capture multiple images of a scene using different exposures. Such a technique is disclosed, for example, in Yamada et al., “Effectiveness of Video Camera Dynamic Range Expansion for Lane Detection,” Proceedings of the IEEE Conference on Intelligent Transportation Systems, 1997. Typically, two optically aligned CCD light-sensing arrays are used to simultaneously capture the same image of a scene with different exposures. Light from the scene is divided by a beam splitter and directed to both CCD light-sensing arrays. The two captured images are combined into one high dynamic range image by a post processor. This technique has the disadvantage of requiring complex and expensive optics, and capturing images with more than two different exposures becomes difficult.
Efforts have been made to increase the dynamic range of charge-coupled imaging devices. Published Japanese patent application No. 59,217,358 of M. Murakoshi describes using two or more charge coupled device (CCD) light-sensing cells for each pixel of the imaging device. Each of the light-sensing cells of a pixel has a different photosensitivity so that some cells will take longer to reach saturation than others when exposed to light from a scene. In this manner, when the stored photogenerated charges in all of the light-sensing cells of a pixel are combined, the dynamic range of the pixel is effectively increased. However, the Murakoshi reference does not address the problem of capturing high dynamic range images using a low dynamic range image sensor.
U.S. Pat. No. 4,590,367 to J. Ross et al. discloses an arrangement for expanding the dynamic range of optical devices by using an electrically controlled light modulator adjacent to the light-sensitive area of an optical device to reduce the brightness of incident light from an imaged optical scene so as to not exceed the dynamic range of the light-sensitive area. The light modulator of the Ross et al. reference has individual pixel control of light amplification or attenuation in response to individual control signals. The detected level of light intensity emerging from each pixel of the modulator is then used to develop the control signals to adjust the amplification or attenuation of each pixel of the modulator to bring the brightness level into the detector's rated dynamic range. Thus, a wide input light intensity dynamic range is reduced to a narrow dynamic range on a pixel-by-pixel basis. However, the apparatus and method of the Ross et al. reference are aimed at simplifying three-dimensional measurement using projected light, and there is nothing in the reference on how to capture a high dynamic range image.
Another problem of known systems for capturing high dynamic range images is the display of such images using low dynamic range displays. Most commercially available displays, such as video monitors, televisions and computer displays, have low dynamic ranges. Hence, after the high dynamic range image is obtained, one needs to use a mapping method to display the image on a low dynamic range display while preserving the pertinent visual information. In other words, the dynamic range of the captured image must be compressed while preserving visual details of the scene.
A known technique for compressing the dynamic range of a captured image is tone curve (or response curve) reproduction, in which a high dynamic range color scale is mapped into a low dynamic range color scale using an appropriate tone curve. Logarithmic scale conversion and gamma correction are examples of this kind of compression. However, tone curve reproduction compression does not preserve visual details of the scene.
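For illustration, logarithmic scale conversion and gamma correction may each be written as a single global tone curve that maps a radiance value to an 8-bit display level. The following is a minimal sketch under stated assumptions: radiance values are non-negative and bounded by a known maximum, and the function names and the default gamma of 2.2 are hypothetical choices rather than values taken from any reference.

```python
import math


def log_tone_curve(radiance, max_radiance):
    """Map a radiance in [0, max_radiance] to an 8-bit level on a log scale."""
    return round(255 * math.log1p(radiance) / math.log1p(max_radiance))


def gamma_tone_curve(radiance, max_radiance, gamma=2.2):
    """Map a radiance in [0, max_radiance] to an 8-bit level by gamma correction."""
    return round(255 * (radiance / max_radiance) ** (1.0 / gamma))


# Two distinct bright radiance values passed through the same global curve.
level_a = log_tone_curve(9000, 10000)
level_b = log_tone_curve(9100, 10000)
```

With these assumed values, `level_a` and `level_b` are the same display level, so the difference between the two radiance values is no longer visible after compression. Because the same curve is applied to every pixel regardless of its neighborhood, local detail of this kind is lost.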
A more sophisticated tone curve reproduction technique is histogram equalization, which creates a tone curve by integrating the color histogram of the image. Histogram equalization results in a more even redistribution of the colors in the image over the space of colors, but this technique cannot sufficiently preserve the detail of an image which has several local areas with very bright (or very dark) pixels, where such local areas together may span the complete range of grey levels. Hence, details within individual areas are not enhanced by the histogram equalization process.
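The construction of a tone curve by integrating the histogram may be sketched as follows for a grey-scale image. This is a simplified illustration that assumes integer pixel values in the range 0 to `levels - 1` and uses the plain cumulative histogram scaled to the output range; the function names are hypothetical, and practical implementations often normalize the cumulative histogram differently.

```python
def equalization_tone_curve(pixels, levels=256):
    """Build a tone curve from the cumulative (integrated) histogram."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    curve = []
    running = 0
    for count in hist:
        running += count
        # Scale the cumulative fraction of pixels to the output range.
        curve.append(round((levels - 1) * running / len(pixels)))
    return curve


def equalize(pixels, levels=256):
    """Apply the histogram-derived tone curve to every pixel."""
    curve = equalization_tone_curve(pixels, levels)
    return [curve[p] for p in pixels]


# Five dark grey levels are spread over the full output range.
equalized = equalize([10, 20, 30, 40, 50])
```

The curve is derived from the histogram of the whole image, so a single mapping is shared by all local areas. This is why details within an individual very bright or very dark area are not enhanced.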
Another technique for compressing the dynamic range of a captured image is disclosed in Pattanaik et al., “A Multiscale Model of Adaptation and Spatial Vision for Realistic Image Display,” SIGGRAPH 98 Proceedings, pp. 287-298, 1998. This technique divides the image data into several frequency component images, for example, in a Laplacian pyramid. The image is then re-composed after each frequency component has been modulated in a different manner. An advantage of this technique is that absolute pixel values contribute less to the result than they do in histogram equalization. However, this technique has the drawback of requiring large amounts of memory and extensive computations to obtain the frequency component images.
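The general idea of dividing image data into frequency components, modulating each component differently, and re-composing the result may be sketched on a one-dimensional signal. The sketch below is not the method of Pattanaik et al. and is not a true Laplacian pyramid, since it omits downsampling; it assumes a 3-tap box blur as the low-pass filter, and all function names and gains are hypothetical.

```python
def blur(signal):
    """3-tap box blur with edge replication (assumed low-pass filter)."""
    n = len(signal)
    return [
        (signal[max(i - 1, 0)] + signal[i] + signal[min(i + 1, n - 1)]) / 3.0
        for i in range(n)
    ]


def decompose(signal, levels=3):
    """Split a signal into detail bands (fine to coarse) and a low-pass residual."""
    bands = []
    current = [float(s) for s in signal]
    for _ in range(levels):
        smooth = blur(current)
        bands.append([c - s for c, s in zip(current, smooth)])
        current = smooth
    return bands, current


def recompose(bands, residual, gains, residual_gain):
    """Re-compose the signal after scaling each component by its own gain."""
    out = [residual_gain * r for r in residual]
    for band, gain in zip(bands, gains):
        out = [o + gain * b for o, b in zip(out, band)]
    return out


signal = [10, 12, 11, 200, 210, 205, 13, 9]
bands, residual = decompose(signal)
# Halve the low-pass residual while keeping the detail bands unchanged.
compressed = recompose(bands, residual, [1.0, 1.0, 1.0], 0.5)
```

With all gains equal to one, the re-composed signal equals the original up to floating point rounding. Each level stores a full-size copy of the signal, which gives a sense of the memory and computation cost noted above.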
Accordingly, there exists a need for an apparatus and method for capturing high dynamic range images using a relatively low dynamic range image sensor, and for compressing the captured high dynamic range image in a detail-preserving manner for display by a relatively low dynamic range display device, which overcome the problems of the prior art as discussed above.