Field of the Invention
The invention generally relates to stereo images and, more specifically, to techniques for generating robust stereo images from a pair of corresponding stereo images captured with and without a light source such as a flash device.
Description of the Related Art
Recently, three-dimensional (3D) stereo images and video have gained popularity in the consumer market. The introduction of a broader selection of 3D content, along with the marketing of relatively cheap 3D HDTV (high definition television) sets, has made viewing images and video in 3D more common. Equipping computers with sets of cameras and computing depths and spatial relations from stereo image pairs is well documented, with applications in 3D modeling, robotic navigation, new image synthesis, augmented reality, and gaming, among others. Stereo imaging now appears in applications as common as hand-held video cameras, such as the Fuji® Finepix 3D camera and the Sony® Bloggie 3D camera.
Conventionally, applications generate depth maps from captured stereo images using a basic stereo reconstruction algorithm that generates a depth value for each pixel by comparing projections of scene points across two or more images taken from offset locations. Stated another way, the stereo reconstruction algorithm is essentially a pixel matching operation. The pixel matching is typically performed by minimizing the sum of squared differences, by maximizing pixel correlation, or by applying a rank or census transform and then matching the ranks or bit strings. These algorithms work fairly well on textured surfaces, but they have difficulty making accurate matches on surfaces with uniform color, where many candidate pixels look alike. In addition, pixel-matching algorithms may fail proximate to occlusion boundaries because the depth discontinuity at a boundary causes local pixels to differ across the images (i.e., the near surface occludes one portion of the background in one image but a different portion of the background in the other image and, therefore, the occluded pixels do not have a corresponding match). One good example of where the conventional algorithm may fail is when a picture is taken through a fine mesh, such as a wicker chair or a chain link fence.
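By way of illustration only, the conventional pixel matching operation described above may be sketched as follows. The sketch assumes rectified grayscale images stored as lists of rows, a purely horizontal search, and a square matching window scored by the sum of squared differences; the function name `ssd_disparity` and the parameters `max_disp` and `half_win` are hypothetical and are not taken from any particular implementation. The sketch returns a per-pixel disparity (horizontal offset) rather than a metric depth value.

```python
def ssd_disparity(left, right, max_disp, half_win=1):
    """Estimate per-pixel disparity by sum-of-squared-differences matching.

    Assumes `left` and `right` are rectified grayscale images of equal size,
    given as lists of rows, and that a scene point at column x in the left
    image appears at column x - d in the right image for some d >= 0.
    """
    height, width = len(left), len(left[0])
    disp = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            best_cost, best_d = None, 0
            # Candidate disparities cannot shift the pixel past column 0.
            for d in range(min(max_disp, x) + 1):
                cost = 0
                for dy in range(-half_win, half_win + 1):
                    for dx in range(-half_win, half_win + 1):
                        # Clamp window coordinates at the image border.
                        yy = min(max(y + dy, 0), height - 1)
                        xl = min(max(x + dx, 0), width - 1)
                        xr = min(max(x + dx - d, 0), width - 1)
                        diff = left[yy][xl] - right[yy][xr]
                        cost += diff * diff
                # Keep the first disparity with the strictly lowest cost.
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp


# Synthetic textured scene: the left view is the right view shifted by 2 columns.
right_img = [[x * x + 3 * y for x in range(12)] for y in range(3)]
left_img = [[row[max(x - 2, 0)] for x in range(12)] for row in right_img]
textured = ssd_disparity(left_img, right_img, max_disp=4)

# Uniform scene: every candidate disparity has the same cost.
flat_img = [[5] * 12 for _ in range(3)]
uniform = ssd_disparity(flat_img, flat_img, max_disp=4)
```

In the uniform case every candidate disparity yields an identical matching cost, so the reported value is determined solely by the tie-breaking rule rather than by the scene, which is one way the difficulty with uniformly colored surfaces noted above manifests.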
Accordingly, what is needed in the art is a more effective approach for generating accurate, per-pixel depth maps associated with stereo images.