The present disclosure relates to an image processing apparatus, an image processing method, and a program, and more particularly to an image processing apparatus, an image processing method, and a program for generating, from photographic images, an image whose values of multiple photographic parameters differ from those of the photographic images.
Capturing photographic images with an imaging apparatus while changing a value of a photographic parameter is called epsilon photography (for example, refer to Ramesh Raskar. Computational photography: Epsilon to coded photography. In Emerging Trends in Visual Computing, 2008, pp. 238-253), and the term has been used as a keyword connecting imaging and computer graphics. Examples of the photographic parameter include a focus position, an f-stop, a shutter speed, and ISO sensitivity.
In recent years, many technologies have been proposed that perform image processing using photographic images that are captured while changing a value of one photographic parameter. As such a technology, for example, there is a technology that produces a high dynamic range image using an image stack that is made from multiple photographic images which are captured while changing the shutter speed (for example, refer to Paul E. Debevec and Jitendra Malik, Recovering High Dynamic Range Radiance Maps from Photographs. In SIGGRAPH 97, August 1997).
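As a rough illustration of how an image stack captured at different shutter speeds can be merged, the following is a minimal sketch that assumes a linear sensor response and pixel values normalized to the range 0 to 1. It is not the method of Debevec and Malik, which additionally recovers the camera response curve; the function name `merge_hdr_linear` and the hat-shaped weighting are assumptions introduced only for illustration.

```python
import numpy as np

def merge_hdr_linear(images, exposure_times):
    """Merge an exposure stack into a relative radiance map.

    Assumption (not from the cited reference): the sensor response is linear,
    so pixel value is approximately radiance * exposure_time, clipped to [0, 1].
    """
    imgs = np.asarray(images, dtype=np.float64)            # shape (N, H, W)
    t = np.asarray(exposure_times, dtype=np.float64).reshape(-1, 1, 1)

    # Hat-shaped weight: trust mid-tones, distrust values near 0 or 1
    # (under-exposed or saturated pixels).
    w = 1.0 - np.abs(2.0 * imgs - 1.0)
    w = np.maximum(w, 1e-6)                                 # avoid division by zero

    # Each frame gives a radiance estimate imgs / t; take their weighted average.
    return np.sum(w * imgs / t, axis=0) / np.sum(w, axis=0)
```

With three frames captured at exposure times of 0.25, 1, and 4 (arbitrary units), a pixel that saturates in the two longer exposures is estimated almost entirely from the shortest one, because saturated values receive a near-zero weight.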
Furthermore, there is a technology that generates a light field using the image stack that is made from multiple photographic images which are captured while changing the focus position (for example, refer to Anat Levin and Frédo Durand, Linear view synthesis using a dimensionality gap light field prior, Conference on Computer Vision and Pattern Recognition (CVPR), 2010). The light field is used in viewpoint conversion as disclosed in Anat Levin and Frédo Durand, Linear view synthesis using a dimensionality gap light field prior, Conference on Computer Vision and Pattern Recognition (CVPR), 2010, or is used when refocusing or depth-of-field control is performed (for example, refer to Ng, R., Levoy, M., Brédif, M., Duval, G., Horowitz, M., and Hanrahan, P. Light field photography with a hand-held plenoptic camera. Tech. report, Stanford University, April 2005).
Moreover, there is a technology called depth from focus (DFF) that estimates a depth value indicating the position of a photographic subject in the depth direction, using the image stack that is made from multiple photographic images which are captured while changing the focus position (for example, refer to S. K. Nayar and Y. Nakagawa, “Shape from Focus,” IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16, No. 8, pp. 824-831, August 1994).
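The idea behind DFF can be sketched as follows, assuming a focal stack of grayscale frames that are already aligned with one another. For each pixel, the sketch returns the focus position at which that pixel appears sharpest, which serves as a proxy for depth. The squared Laplacian used as the sharpness score and the names `focus_measure` and `depth_from_focus` are illustrative assumptions, not necessarily the focus measure of the cited reference.

```python
import numpy as np

def focus_measure(img):
    """Per-pixel sharpness score: squared 4-neighbor Laplacian (edges replicated)."""
    p = np.pad(np.asarray(img, dtype=np.float64), 1, mode="edge")
    lap = (p[:-2, 1:-1] + p[2:, 1:-1]        # up + down neighbors
           + p[1:-1, :-2] + p[1:-1, 2:]      # left + right neighbors
           - 4.0 * p[1:-1, 1:-1])            # center pixel
    return lap ** 2

def depth_from_focus(stack, focus_positions):
    """Return, per pixel, the focus position of the sharpest frame in the stack.

    stack: sequence of N aligned grayscale frames, each of shape (H, W)
    focus_positions: sequence of N focus position values, one per frame
    """
    scores = np.stack([focus_measure(f) for f in stack])   # shape (N, H, W)
    best = np.argmax(scores, axis=0)                        # sharpest frame index
    return np.asarray(focus_positions)[best]                # shape (H, W)
```

Because each pixel is assigned the focus position of a single frame, the depth values in this sketch are quantized to the focus positions that were actually captured.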
Furthermore, in recent years, technologies have been proposed that perform image processing using photographic images that are captured while changing values of two photographic parameters. As such a technology, for example, there is a technology that generates a depth map that is higher in spatial resolution or reliability than one obtained by DFF and the like, using an image stack that is made from multiple photographic images that are captured while changing a focus position value and an f-stop value (for example, refer to Samuel W. Hasinoff and Kiriakos N. Kutulakos, Confocal Stereo, International Journal of Computer Vision, 81(1), pp. 82-104, 2009).