1. Field of the Invention
The present invention relates in general to stereo reconstruction, and in particular, to a system and method for extracting structure from multiple images of a scene by representing the scene as a group of image layers, including reflection and transparency layers.
2. Related Art
Many natural images contain reflected light (reflections), transmitted light (transparencies), or a mixture of both. For example, shiny or glass-like surfaces typically create a reflected image of other surfaces in their immediate environment. Also, surfaces like glass and water are (at least partially) transparent, and hence transmit the light from the surfaces behind them. It should be noted, however, that the transmitted light is usually attenuated to some degree by the glass (or frontal surface), and thus the notion of partial transparency or “translucency” is more general. Nevertheless, following common usage in the field, the term “transparency” is used to indicate both complete transparency and translucency.
As such, many natural images are composed of reflected and transmitted images that are superimposed on each other. When viewed from a moving camera, these component layer images appear to move relative to each other. Techniques to recover the multiple motions are commonly referred to as multiple motion recovery techniques. The problem of multiple motion recovery and the reflection and transmission of light on surfaces in visual images has been addressed in several physics-based vision studies. Likewise, a number of techniques for recovering multiple motions from image sequences have been developed.
These techniques can recover multiple motions even in the presence of reflections and transparency. A subclass of these techniques also extracts the individual component layer images from the input composite sequence, but only in the absence of reflections and transparency (i.e., when all the layers are opaque). Although several studies locked onto each component motion, they produced only a “reconstructed” image of each layer through temporal integration. This falls short of a proper extraction of the component layers, because the other layers were not fully removed, but rather appeared as blurred streaks.
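The blurred-streak residue left by temporal integration can be illustrated with a small numerical sketch. The example below is not taken from any of the studies referred to above. It assumes, purely for illustration, that two layers combine additively and that each layer moves by a known integer horizontal shift with wrap-around; the names `composite_frames` and `temporal_integration` are hypothetical.

```python
import numpy as np

def composite_frames(layer_a, layer_b, shifts_a, shifts_b):
    """Build composite frames by adding two layers (assumed additive mixing).

    Each layer is shifted cyclically along x by its own per-frame shift.
    """
    return [np.roll(layer_a, sa, axis=1) + np.roll(layer_b, sb, axis=1)
            for sa, sb in zip(shifts_a, shifts_b)]

def temporal_integration(frames, shifts):
    """Undo one layer's motion in every frame, then average the stabilized frames."""
    stabilized = [np.roll(f, -s, axis=1) for f, s in zip(frames, shifts)]
    return np.mean(stabilized, axis=0)

rng = np.random.default_rng(0)
layer_a = rng.random((8, 16))      # textured layer we try to reconstruct
layer_b = np.zeros((8, 16))
layer_b[:, 4] = 1.0                # second layer: one bright vertical line

shifts_a = [0, 0, 0, 0]            # layer A is stationary in this toy sequence
shifts_b = [0, 1, 2, 3]            # layer B drifts one pixel per frame

frames = composite_frames(layer_a, layer_b, shifts_a, shifts_b)
estimate_a = temporal_integration(frames, shifts_a)

# What remains after subtracting the true layer is the other layer,
# smeared along its motion path rather than removed.
residual = estimate_a - layer_a
```

In this toy setup the bright line of the second layer is spread over the four columns it visits, at a quarter of its original intensity each. Its total energy is unchanged, which is why integration alone attenuates but does not remove the other layer.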
The detection of transparency in single images has also been studied, but these studies do not provide a complete technique for layer extraction from general images. Thus, current and previous systems have not demonstrated how to accurately recover the component images themselves, and the extraction of component layer images in the presence of reflections and transparency remains an open problem. Therefore, what is needed is an optimal approach to recovering layer images and their associated motions from an arbitrary number of composite images. There is also a need for techniques that estimate the component layer images given known motion estimates.