The present disclosure relates to an image processing apparatus, an image pickup apparatus, an image processing method, and a program. In particular, the present disclosure relates to an image processing apparatus, an image pickup apparatus, an image processing method, and a program that generate images with a high dynamic range (wide dynamic range).
A solid-state image pickup element such as a CCD image sensor or a CMOS (Complementary Metal Oxide Semiconductor) image sensor used in a video camera or a digital still camera carries out photoelectric conversion by accumulating charge in keeping with the amount of incident light and outputting an electrical signal corresponding to the accumulated charge. However, there is a limit on the amount of charge that can be accumulated in a photoelectric conversion element, so that when a certain amount of light has been received, a saturation level is reached, resulting in regions of a subject with a certain brightness or higher being set at a saturation luminance level, a problem referred to as “blown out highlights” or “clipping”.
To prevent clipping, processing is carried out to control the charge accumulation period of the photoelectric conversion element in accordance with the luminance or the like of the subject to adjust the exposure length and thereby optimize sensitivity. For example, by using a high shutter speed to shorten the exposure length for a bright subject, the charge accumulation period of the photoelectric conversion element is reduced and an electrical signal is outputted before the amount of accumulated charge reaches the saturation level. By carrying out such processing, it is possible to output an image in which tones are correctly reproduced for the subject.
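As a rough illustration of the saturation behavior and the effect of shortening the exposure described above, the following minimal sketch models a pixel as accumulating charge in proportion to incident light and exposure time, up to a fixed limit. The linear model, the function name `pixel_response`, and the `full_well` value are assumptions made for illustration and do not come from the cited art.

```python
def pixel_response(irradiance, exposure_time, full_well=1000.0):
    """Accumulated charge for one pixel, clipped at the saturation level.

    Assumption: charge grows linearly with irradiance * exposure_time,
    and full_well is an arbitrary illustrative saturation level.
    """
    charge = irradiance * exposure_time
    return min(charge, full_well)


# With a long exposure, two regions of different brightness both saturate
# and become indistinguishable (clipping).
long_bright = pixel_response(irradiance=5000.0, exposure_time=0.5)
long_brighter = pixel_response(irradiance=9000.0, exposure_time=0.5)

# With a shorter exposure, the signal is read out before saturation,
# so the two regions keep distinct tones.
short_bright = pixel_response(irradiance=5000.0, exposure_time=0.0625)
short_brighter = pixel_response(irradiance=9000.0, exposure_time=0.0625)
```

In this sketch both long-exposure values equal the saturation level, while the short-exposure values remain distinct.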
However, if a high shutter speed is used when photographing a subject in which both bright and dark regions are present, the exposure length will not be sufficient for the dark regions, which will result in deterioration in the S/N ratio and a fall in image quality. To correctly reproduce the luminance levels of bright regions and dark regions in a photographed image of a subject that includes both bright and dark regions, it is necessary to use a long exposure for pixels on the image sensor where there is little incident light to achieve a high S/N ratio and to carry out processing to avoid saturation for pixels with large amounts of incident light.
One known method of realizing such processing is to consecutively pick up a plurality of images with different exposure lengths and then combine them. That is, a long-exposure image and a short-exposure image are picked up separately and consecutively, and a combining process is carried out to produce a single image, using the long-exposure image for dark image regions and the short-exposure image for bright image regions where the long-exposure image would be clipped. In this way, by combining a plurality of images with different exposures, it is possible to produce images with a high dynamic range with no clipping.
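The combining process described above can be sketched as follows. This is a simplified illustration, not the method of any cited publication: it assumes grayscale images held as nested lists, a hard saturation threshold, and that short-exposure values are scaled by the ratio of the two exposure lengths so that both images share one brightness scale. The names `merge_exposures`, `exposure_ratio`, and `saturation` are hypothetical.

```python
def merge_exposures(long_img, short_img, exposure_ratio, saturation=255):
    """Combine a long-exposure and a short-exposure image into one HDR image.

    Per pixel, the long-exposure value is used unless it is saturated;
    otherwise the short-exposure value is used, scaled by exposure_ratio
    (long exposure length / short exposure length, an assumed gain).
    """
    merged = []
    for long_row, short_row in zip(long_img, short_img):
        row = []
        for long_px, short_px in zip(long_row, short_row):
            if long_px < saturation:
                row.append(float(long_px))  # dark region: long exposure
            else:
                row.append(float(short_px) * exposure_ratio)  # clipped region
        merged.append(row)
    return merged


long_img = [[40, 255], [255, 120]]
short_img = [[3, 60], [200, 8]]
hdr = merge_exposures(long_img, short_img, exposure_ratio=16)
```

A practical implementation would typically blend the two images smoothly near the saturation level rather than switch abruptly; the hard threshold is used here only to keep the sketch short.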
The above type of photography with a high dynamic range is referred to as “HDR” (High Dynamic Range) or “WDR” (Wide Dynamic Range) photography.
A number of existing technologies for realizing HDR photography will now be described.
As described in Japanese Laid-Open Patent Publication Nos. H02-174470, H07-95481, and H11-75118 and in Orly Yadid-Pecht and Eric R. Fossum, “Wide Intrascene Dynamic Range CMOS APS Using Dual Sampling”, IEEE Transactions On Electron Devices, Vol. 44-10, pp. 1721-1723, 1997, for example, one method of generating an HDR image is to pick up a plurality of images with different sensitivities and then combine such images. An example of the configuration and the processing of an image pickup apparatus that uses this method will now be described with reference to FIGS. 1 and 2.
Incident light inputted into an image sensor (image pickup element) 102 via a lens 101 of the image pickup apparatus shown in FIG. 1 is subjected to photoelectric conversion to output a sensor image 103. The sensor image 103 is stored in a frame memory 104. During image pickup, the image pickup apparatus consecutively picks up two images, a high sensitivity image 105 produced by a long exposure and a low sensitivity image 106 produced by a short exposure, stores the two images in the frame memory 104, and inputs the two images into an HDR processing unit 107 located downstream.
The HDR processing unit 107 combines the high sensitivity image 105 produced by the long exposure and the low sensitivity image 106 produced by the short exposure to generate a single HDR image 108. After this, a camera signal processing unit 109 subjects the HDR image 108 to the signal processing carried out in a typical camera, such as white balance adjustment, gamma correction, and a demosaicing process, to generate an output image 110.
The sequence of such processing will now be described with reference to FIG. 2. In FIG. 2, the generation timing of the various images listed below is shown on a time axis that advances from left to right.
(a) Output timing of sensor images 103
(b) Output timing of low sensitivity images 106
(c) Output timing of high sensitivity images 105
(d) Output timing of HDR images 108
At time t1, a low sensitivity image #1 is picked up and outputted from the image sensor 102. At time t2, a high sensitivity image #2 is picked up and outputted from the image sensor 102. After this, at t3, t4, . . . , low sensitivity images and high sensitivity images are alternately picked up.
At time t3, the low sensitivity image #1 and the high sensitivity image #2 that have been picked up are outputted from the frame memory 104 to the HDR processing unit 107, and by carrying out a combining process for the two images, a single HDR image “#1, #2” is generated. After this, at time t5, the low sensitivity image #3 and the high sensitivity image #4 that have been picked up are outputted from the frame memory 104 to the HDR processing unit 107, and by carrying out a combining process for the two images, a single HDR image “#3, #4” is generated.
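The timing just described can be sketched as a small schedule. The function name `hdr_schedule` is hypothetical, and the sketch assumes, as in the description of FIG. 2, that odd-numbered sensor frames are low sensitivity images, even-numbered frames are high sensitivity images, and each HDR image is generated one frame period after the second image of its pair is picked up.

```python
def hdr_schedule(num_sensor_frames):
    """Return a list of (output_time_index, (low_frame, high_frame)).

    Frames are numbered from 1 and frame k is picked up at time t_k.
    A trailing unpaired frame produces no HDR image.
    """
    schedule = []
    for low in range(1, num_sensor_frames, 2):
        high = low + 1
        # The pair is read from the frame memory and combined at t_(high+1).
        schedule.append((high + 1, (low, high)))
    return schedule


schedule = hdr_schedule(4)
```

For four sensor frames, this yields the HDR image “#1, #2” at t3 and the HDR image “#3, #4” at t5.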
In this way, a low sensitivity image with a short exposure and a high sensitivity image with a long exposure are picked up in alternate frames, images are accumulated in the frame memory, and an HDR image is generated by signal processing. One problem with this method is that since the two images to be combined are picked up at slightly different times, false colors and double images can be produced when the subject moves.
As another problem, since it is necessary to combine a plurality of images, for video, the frame rate after image combining is lower than the frame rate of the sensor. In the example shown in FIG. 2, since an HDR image is generated from two images, the frame rate for HDR images is half the frame rate of the sensor.
Putting this another way, to output HDR images with the same frame rate as before, the image sensor needs to be driven at twice the speed, which results in an increase in cost and/or an increase in power consumption.
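The frame rate relationship above amounts to simple arithmetic, sketched below. The function names and the example rate of 60 frames per second are illustrative assumptions only.

```python
def hdr_frame_rate(sensor_fps, frames_per_hdr=2):
    """HDR output rate when each HDR image consumes frames_per_hdr sensor frames."""
    return sensor_fps / frames_per_hdr


def required_sensor_fps(target_hdr_fps, frames_per_hdr=2):
    """Sensor rate needed to keep the HDR output at target_hdr_fps."""
    return target_hdr_fps * frames_per_hdr


halved_rate = hdr_frame_rate(60)          # sensor at 60 fps, two frames per HDR image
doubled_drive = required_sensor_fps(60)   # sensor rate needed for 60 fps HDR output
```

With two images per HDR image, a sensor driven at 60 frames per second yields 30 HDR images per second, and keeping 60 HDR images per second requires driving the sensor at 120 frames per second.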
Other methods of generating an HDR image that differ from the method of combining two picked-up images described above are the configurations disclosed for example in Japanese Laid-Open Patent Publication No. 2006-253876, Japanese Patent Publication No. 2006-542337 (Japanese Patent No. 4689620) and Jinwei Gu et al., “Coded Rolling Shutter Photography: Flexible Space-Time Sampling”, Computational Photography (ICCP), 2010. Instead of using a long exposure image and a short exposure image that are picked up consecutively, such methods generate an HDR image based on a single picked-up image.
In one such method, the exposure time of the image pickup element is set differently in pixel units, so that image pickup is carried out with long exposure pixels and short exposure pixels set in a single picked-up image. The pixel values of the long exposure pixels and the pixel values of the short exposure pixels included in such single picked-up image are used to generate a single HDR image. Examples of the configuration and the processing of an image pickup apparatus that uses this method will now be described with reference to FIGS. 3 to 5.
Incident light inputted into an image sensor (image pickup element) 112 via a lens 111 of the image pickup apparatus shown in FIG. 3 is subjected to photoelectric conversion to output a sensor image 113. The exposure time of the image sensor (image pickup element) 112 is controlled in pixel units according to control by a control unit, not shown, to set long exposure pixels and short exposure pixels in a single image. The sensor image 113 is inputted into an HDR processing unit 114.
The HDR processing unit 114 combines high sensitivity pixels, which are the long exposure pixels, and low sensitivity pixels, which are the short exposure pixels, included in the single picked-up image to generate a single HDR image 115. As a specific example, a pixel value combining process is carried out where the pixel values of high sensitivity pixels are selectively used for pixels where the high sensitivity pixels that are the long exposure pixels are not saturated, and the pixel values of low sensitivity pixels in the vicinity are used for pixels where the high sensitivity pixels are saturated. After this, a camera signal processing unit 116 subjects the HDR image 115 to the signal processing carried out in a typical camera, such as white balance adjustment, gamma correction, and a demosaicing process, to generate an output image 117.
Note that as shown in FIG. 4 for example, low sensitivity pixels and high sensitivity pixels are disposed on the image sensor 112 in a repeating pattern. With this setting, it is possible to generate an HDR image from a single image without using two images.
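The single-image combining process can be sketched as follows. The actual pixel arrangement of FIG. 4 is not reproduced here; the sketch assumes a hypothetical layout in which even-numbered rows are high sensitivity (long exposure) pixels and odd-numbered rows are low sensitivity (short exposure) pixels, ignores the color filter array, requires at least two rows, and scales low sensitivity values by an assumed exposure ratio. All names are illustrative.

```python
def is_high_sensitivity(row):
    """Hypothetical layout: even rows are long-exposure (high sensitivity)."""
    return row % 2 == 0


def merge_single_frame(sensor, exposure_ratio, saturation=255):
    """Build one HDR value per pixel from a frame with interleaved exposures."""
    height = len(sensor)
    hdr = []
    for r, row in enumerate(sensor):
        # Nearest row of the opposite sensitivity (the row above at the bottom edge).
        other = r + 1 if r + 1 < height else r - 1
        out_row = []
        for c, value in enumerate(row):
            if is_high_sensitivity(r):
                high, low = value, sensor[other][c]
            else:
                high, low = sensor[other][c], value
            if high < saturation:
                out_row.append(float(high))  # high sensitivity pixel is usable
            else:
                # Saturated: fall back to the nearby low sensitivity pixel.
                out_row.append(float(low) * exposure_ratio)
        hdr.append(out_row)
    return hdr


sensor = [[100, 255], [6, 40], [90, 255], [5, 50]]
hdr = merge_single_frame(sensor, exposure_ratio=16)
```

Because each output value borrows from a neighboring row whenever one sensitivity is unusable, the sketch also makes visible why vertical resolution is reduced in saturated or very dark regions.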
FIG. 5 shows a sequence of such processing. In FIG. 5, the generation timing of the various images listed below is shown on a time axis that advances from left to right.
(a) Output timing of sensor images 113
(b) Output timing of HDR images 115
At time t1, the sensor image #1 is outputted from the image sensor 112. At time t2, the sensor image #2 is outputted from the image sensor 112. After this, at t3, t4, . . . , sensor images are successively outputted. The respective sensor images are images in which low sensitivity pixels and high sensitivity pixels are set.
The sensor image 113 is inputted immediately into the HDR processing unit 114 and the HDR image 115 is generated with almost no delay. As a specific example, a signal is transferred in line units, and therefore only a delay equivalent to the signal of one line is generated. Unlike the configuration described earlier, this configuration is capable of generating an HDR image from one picked-up image frame. Accordingly, a frame memory such as that shown in FIG. 1 is no longer needed and there is also no drop in the frame rate.
However, this method has the following problem. For example, when the subject of image pickup is bright, the high sensitivity pixels included in the single picked-up image will be saturated and it will be necessary to generate an image using only the low sensitivity pixels. Conversely, when the subject is dark, a large amount of noise will be included in the low sensitivity pixels and it will be necessary to generate an image using the pixel information of only the high sensitivity pixels. In this way, there are cases where valid pixel information cannot be obtained due to the state of the image pickup subject, and as a result, there is the problem of a drop in image quality, such as a drop in resolution and/or production of false colors due to the inability to obtain color samples.
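The loss of valid pixel information described above can be illustrated by counting usable samples in a frame. This sketch reuses the hypothetical layout in which even-numbered rows are high sensitivity pixels and odd-numbered rows are low sensitivity pixels; the `saturation` and `noise_floor` thresholds are arbitrary assumptions, and treating a pixel as simply valid or invalid is a simplification.

```python
def valid_sample_fraction(sensor, saturation=255, noise_floor=4):
    """Fraction of pixels carrying usable information.

    High sensitivity pixels (even rows) are invalid when saturated;
    low sensitivity pixels (odd rows) are invalid at or below noise_floor.
    """
    valid = 0
    total = 0
    for r, row in enumerate(sensor):
        for value in row:
            total += 1
            if r % 2 == 0:
                valid += value < saturation
            else:
                valid += value > noise_floor
    return valid / total


bright_scene = [[255, 255], [120, 200]]  # high sensitivity rows saturated
dark_scene = [[30, 12], [1, 0]]          # low sensitivity rows buried in noise
mixed_scene = [[100, 80], [20, 10]]      # both kinds of pixel usable

bright_fraction = valid_sample_fraction(bright_scene)
dark_fraction = valid_sample_fraction(dark_scene)
mixed_fraction = valid_sample_fraction(mixed_scene)
```

In both the uniformly bright and the uniformly dark example, only half of the samples remain usable, which corresponds to the drop in resolution noted above.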