This invention relates generally to three-dimensional (3D) scene modeling in which the scene is made up from a number of images, and more particularly to automated layer extraction from the images making up such a scene, and automated pixel assignment of the pixels of the images to a particular layer of the layers that have been extracted.
One type of graphics application for computers and other such devices is three-dimensional (3D) scene modeling. Generally, in 3D scene modeling, a 3D scene is modeled from a sequence of two-dimensional (2D) images taken of the scene by cameras placed at various locations around and/or within the scene. This sequence of 2D images allows for the creation of a 3D geometric model of the scene, including in some instances what is known as a texture map that captures the visual appearance of the scene. The texture map is a 2D bitmapped image of the texture of a surface of the 3D scene, such as a uniform texture (e.g., a brick wall) or an irregular texture (e.g., wood grain or marble). The texture map is then "wrapped around" geometric objects within the 3D scene.
In another approach, the sequence of 2D images provides for the creation of the 3D scene by decomposing the scene into a collection of 3D layers, or sprites. Each 3D layer includes a plane equation, a color image that captures the appearance of the sprite, a per-pixel opacity map, and a per-pixel depth-offset relative to the nominal plane of the layer. A generative model for this approach (that is, for constructing the 3D layers of the 3D scene from the sequence of 2D images) is described in the reference S. Baker, R. Szeliski, and P. Anandan, A layered approach to stereo reconstruction, in IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR '98), pages 434-441, Santa Barbara, June 1998.
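As an illustration only, the four components of a 3D layer listed above can be sketched as a simple data structure. The names `Layer`, `shape`, `mean_abs_offset`, and `make_flat_layer`, the nested-list image representation, and the plane form a*x + b*y + c*z + d = 0 are assumptions made for this sketch; they are not taken from the cited reference.

```python
from dataclasses import dataclass
from typing import List, Tuple

Grid = List[List[float]]  # row-major per-pixel map (assumed representation)


@dataclass
class Layer:
    """One 3D layer (sprite) holding the four components named in the text."""
    plane: Tuple[float, float, float, float]       # (a, b, c, d), assumed form a*x + b*y + c*z + d = 0
    color: List[List[Tuple[float, float, float]]]  # per-pixel RGB appearance of the sprite
    opacity: Grid                                  # per-pixel opacity, assumed to lie in [0, 1]
    depth_offset: Grid                             # per-pixel offset from the nominal plane

    def shape(self) -> Tuple[int, int]:
        """Return (rows, cols) after checking that all per-pixel maps agree."""
        rows, cols = len(self.color), len(self.color[0])
        for grid in (self.color, self.opacity, self.depth_offset):
            if len(grid) != rows or any(len(r) != cols for r in grid):
                raise ValueError("per-pixel maps must share one shape")
        return rows, cols

    def mean_abs_offset(self) -> float:
        """Average absolute depth offset; small values mean a nearly planar layer."""
        rows, cols = self.shape()
        total = sum(abs(v) for row in self.depth_offset for v in row)
        return total / (rows * cols)


def make_flat_layer(rows: int, cols: int,
                    plane: Tuple[float, float, float, float],
                    rgb: Tuple[float, float, float] = (0.5, 0.5, 0.5)) -> Layer:
    """Build a fully opaque, perfectly planar layer of uniform color."""
    return Layer(
        plane=plane,
        color=[[rgb for _ in range(cols)] for _ in range(rows)],
        opacity=[[1.0] * cols for _ in range(rows)],
        depth_offset=[[0.0] * cols for _ in range(rows)],
    )
```

For example, `make_flat_layer(2, 3, (0.0, 0.0, 1.0, -5.0))` would describe a 2-by-3 sprite lying exactly on the plane z = 5 under the assumed plane form.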
However, the approach of decomposing a 3D scene into 3D layers has disadvantages as the approach is known in the prior art, including as described in the reference of the previous paragraph. First, recovery of the 3D layers from the sequence of 2D images for constructing a model of a 3D scene is accomplished by manual input. Among other things, this means that a user must painstakingly divide a scene into 3D layers, based at least in large part on subjective criteria that may differ from user to user. Thus, unlike texture mapping approaches, the decomposition approach is not well suited to automatic rendering of 3D scenes, and therefore may not be selected as the utilized approach, even if it were to yield better results than other approaches.
Second, assigning pixels of the 2D images to the extracted 3D layers is known to be required, but a complete methodology as to how to assign pixels to layers as a general matter is not known. This means that pixels of images are assigned to layers on an ad hoc basis, as opposed to following a formal methodology, or, better yet, having an automated manner of pixel assignment to layers. This disadvantage of the decomposition approach also militates against the use of the approach for automatic rendering of 3D scenes, such that other approaches, such as texture mapping approaches, may instead be chosen for rendering, even if these alternative approaches yield inferior results.
For these and other reasons, then, there is a need for the present invention.
The invention relates to automated layer extraction from a number of 2D images that are formed by different cameras viewing a 3D scene from different viewpoints, and the automated assignment of pixels of the images to the extracted layers. As used herein, layers are also referred to as sprites. In one embodiment, a computer-implemented method is operable on a number of 2D images of such a 3D scene, where each 2D image has a number of pixels that correspond to the pixels of the other images. The method determines a number of planes of the scene, and assigns pixels of the images to one of the planes. At least the planes of the scene are then output.
In one embodiment, the method determines the number of layers via a statistical estimation approach that embodies notions of physical coherence of surfaces and objects. These include: (i) that the pixels belonging to the same layer should approximately form a planar region in 3D space (i.e., their combined offsets relative to a plane should be small); (ii) that nearby pixels in an image are likely to belong to the same layer; and (iii) that the image appearance of the different portions of a layer should be similar. The method of this particular embodiment uses Bayesian reasoning techniques, as known within the art, and in so doing embodies the notions of physical coherence in terms of Bayesian "prior probabilities" regarding the physical description of the scene, and the evidence provided by the images as "likelihoods" associated with the specific layered decomposition of the scene. The "posterior probabilities" associated with different possible layer decompositions (i.e., the number of layers, where the number is between one and some predefined maximum possible value n, and the associated pixel assignments) are evaluated, and the most likely decomposition as determined by an estimation algorithm is chosen.
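The selection of the most likely decomposition can be illustrated with a minimal sketch, which assumes that a log prior and a log likelihood have already been computed for each candidate number of layers. The functions `log_sum_exp` and `select_decomposition` and the numeric scores are hypothetical. The sketch shows only the final comparison of posteriors, not the estimation of the pixel assignments themselves.

```python
import math
from typing import Dict, List, Tuple


def log_sum_exp(values: List[float]) -> float:
    """Numerically stable log(sum(exp(v))) used to normalize the posteriors."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def select_decomposition(log_prior: Dict[int, float],
                         log_likelihood: Dict[int, float]
                         ) -> Tuple[int, Dict[int, float]]:
    """Return (best_k, posterior) over candidate layer counts k.

    By Bayes' rule the posterior is proportional to prior times likelihood,
    so the unnormalized log posterior is the sum of the two log terms.
    """
    log_joint = {k: log_prior[k] + log_likelihood[k] for k in log_prior}
    norm = log_sum_exp(list(log_joint.values()))
    posterior = {k: math.exp(v - norm) for k, v in log_joint.items()}
    best_k = max(log_joint, key=log_joint.get)
    return best_k, posterior


# Made-up scores for candidate layer counts 1..3 (n = 3 in this example).
example_log_prior = {1: math.log(0.5), 2: math.log(0.3), 3: math.log(0.2)}
example_log_likelihood = {1: -120.0, 2: -100.0, 3: -101.0}
best_k, posterior = select_decomposition(example_log_prior, example_log_likelihood)
```

Working in log space avoids underflow when the likelihoods are products over many pixels, which is why the sketch adds log terms rather than multiplying probabilities.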
The number of planes can first be determined by using the high-entropy pixels of the images. These are pixels that have a distinct image appearance, such as corners of regions or highly textured points (as opposed to, for example, areas that are homogeneous in color). Also in one particular embodiment, the method assigns all pixels of the images, other than the high-entropy pixels, to the planes via an iterative Expectation Maximization-type approach based on Bayesian decision criteria.
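One plausible way to identify high-entropy pixels is sketched below. The text does not specify the entropy measure, so this sketch assumes the Shannon entropy of quantized grayscale intensities in a small window around each pixel. The functions `patch_entropy` and `high_entropy_pixels`, the window radius, the bin count, and the threshold are all hypothetical choices.

```python
import math
from collections import Counter
from typing import List, Tuple


def patch_entropy(image: List[List[int]], row: int, col: int,
                  radius: int = 1, bins: int = 8, max_value: int = 256) -> float:
    """Shannon entropy (in bits) of quantized intensities around (row, col).

    The window is clipped at the image border, and intensities are assumed
    to be integers in the range [0, max_value).
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(max(0, row - radius), min(rows, row + radius + 1)):
        for c in range(max(0, col - radius), min(cols, col + radius + 1)):
            counts[image[r][c] * bins // max_value] += 1
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_pixels(image: List[List[int]], threshold: float = 0.5,
                        radius: int = 1) -> List[Tuple[int, int]]:
    """Return (row, col) of every pixel whose local entropy exceeds the threshold."""
    return [(r, c)
            for r in range(len(image))
            for c in range(len(image[0]))
            if patch_entropy(image, r, c, radius) > threshold]
```

Under these assumptions, a region that is homogeneous in color has zero local entropy and is never selected, while a highly textured region such as a checkerboard is selected at every pixel.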
Embodiments of the invention provide for advantages not offered by the prior art. Foremost, embodiments of the invention provide for an automated manner by which 3D layers are extracted from the 2D images making up the scene, and for an automated manner by which pixels of the images are assigned to these extracted layers. This allows for the layer decomposition approach to 3D scene modeling to be automated, such that it becomes a more attractive approach to such modeling as compared to other approaches that are already automated, such as texture mapping approaches.