The present invention relates to the generation of virtual environments for interactive walkthrough applications, and more particularly, to a system and process for capturing a complex real-world environment and reconstructing a 4D plenoptic function supporting interactive walkthrough applications.
Interactive walkthroughs are a type of computer graphics application in which an observer moves within a virtual environment. Interactive walkthroughs require detailed three-dimensional (3D) models of an environment. Traditionally, computer-aided design (CAD) systems are used to create the environment by specifying its geometry and material properties. Using a lighting model, the walkthrough application can then render the environment from any vantage point. However, such conventional modeling techniques are very time-consuming. Further, these techniques fail to adequately recreate the detailed geometry and lighting effects found in most real-world scenes.
Computer vision techniques attempt to create real-world models by automatically deriving the geometric and photometric properties from photographic images of the real-world objects. These techniques are often based on the process of establishing correspondences (e.g., stereo matching), a process that is inherently error prone. Further, stereo matching is not always reliable for matching a sufficient number of features from many images, in order to create detailed models of complex scenes.
Image-based rendering (IBR) has been used to create novel views of an environment directly from a set of existing images. For each new view, IBR reconstructs a continuous representation of a plenoptic function from a set of discrete image samples, thus avoiding the need to create an explicit geometric model.
A seven-dimensional (7D) plenoptic function was introduced by E. H. Adelson and J. Bergen in "The Plenoptic Function and the Elements of Early Vision," Computational Models of Visual Processing, MIT Press, Cambridge, Mass., 3-20, 1991, which is hereby incorporated by reference in its entirety. The 7D plenoptic function describes the light intensity as viewed according to the following dimensions: viewpoint (quantified in three dimensions); direction (two dimensions); time; and wavelength. By restricting the problem to static scenes captured at one point in time, at fixed wavelengths (e.g., red, green, and blue), Adelson and Bergen's 7D plenoptic function can be reduced to five dimensions. In practice, every IBR technique generates a plenoptic function that is a subset of the complete 7D plenoptic function.
A 2D plenoptic function can be obtained by fixing the viewpoint and allowing only the viewing direction and zoom factor to change. Many IBR methods of this 2D type exist; they stitch together cylindrical or spherical panoramas from multiple images. Two such examples are described in the following articles: S. E. Chen, "Quicktime VR - An Image-Based Approach to Virtual Environment Navigation", Computer Graphics (SIGGRAPH '95), pp. 29-38, 1995; and R. Szeliski and H. Shum, "Creating full view panoramic image mosaics and texture-mapped models", Computer Graphics (SIGGRAPH '97), pp. 251-258, 1997.
A method using concentric mosaics, which is disclosed in "Rendering with Concentric Mosaics" by H. Shum and L. He, Computer Graphics (SIGGRAPH '99), pp. 299-306, 1999, captures an inside-looking-out 3D plenoptic function by mechanically constraining camera motion to planar concentric circles. Subsequently, novel images are reconstructed from viewpoints that are restricted to lie within the circle and that have horizontal-only parallax. As with the above-mentioned 2D plenoptic modeling, the viewpoint is severely restricted in this method.
The Lumigraph and Lightfield techniques reconstruct a 4D plenoptic function for unobstructed spaces, where either the scene or the viewpoint is roughly constrained to a box. The Lumigraph technique is described in the paper, "The Lumigraph," by S. Gortler et al., Computer Graphics (SIGGRAPH '96), pp. 43-54, 1996, and in U.S. Pat. No. 6,023,523, issued Feb. 8, 2000. The Lightfield technique is disclosed in a paper entitled "Light Field Rendering," by M. Levoy and P. Hanrahan, Computer Graphics (SIGGRAPH '96), pp. 171-80, 1996, and in U.S. Pat. No. 6,097,394, issued Aug. 1, 2000. Each of these methods captures a large number of images from known positions in the environment, and creates a 4D database of light rays. The recorded light rays are retrieved from the database when a new viewpoint is rendered. These Lumigraph and Lightfield methods of IBR allow for large models to be stitched together, but their capture is very time-consuming, and the models, so far, are restricted to small regular areas.
A technique for reconstructing a 5D plenoptic function, using images supplemented with depth values, has been described by L. McMillan and G. Bishop in "Plenoptic Modeling: An Image-Based Rendering System," Computer Graphics (SIGGRAPH '95), pp. 39-46, 1995. McMillan and Bishop's technique formulates an efficient image warping operation that uses a reference image to create images for a small nearby viewing area. For real-world environments, depth is computed by manually establishing feature correspondences between two cylindrical projections captured with a small baseline. In this method, expansion to larger environments involves sampling many images from pre-set, closely-spaced viewpoints. Other IBR techniques using 5D functions are also known in the art.
The full 5D plenoptic function, in theory, can reconstruct large, complex environments. However, such 5D representations require the difficult task of recovering depth, which is often error prone and not robust enough to reconstruct detailed complex scenes.
The present invention provides an IBR system and method for performing "Plenoptic Stitching," which reconstructs a 4D plenoptic function suitable for walkthrough applications. The present invention reduces the 7D plenoptic function to a 4D function by fixing time and wavelength and by restricting viewpoints to lie within a common plane.
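The reduction in dimensionality can be written out explicitly. The following is an illustrative sketch only; the symbol names (V_x, V_y, V_z for viewpoint, \theta and \phi for viewing direction, t for time, \lambda for wavelength, and the constants t_0, \lambda_0, h) are assumptions chosen for this illustration and do not appear in the text above.

```latex
% Full 7D plenoptic function: viewpoint (3), direction (2), time, wavelength
P_7 = P(V_x, V_y, V_z, \theta, \phi, t, \lambda)

% Static scene (t = t_0) sampled at fixed wavelengths (\lambda = \lambda_0): 5D
P_5 = P(V_x, V_y, V_z, \theta, \phi)

% Viewpoints additionally restricted to a common plane (V_z = h): 4D
P_4 = P(V_x, V_y, \theta, \phi)
```

Each constraint removes one free parameter: fixing time and wavelength removes two, and confining the viewpoint to a plane removes a third, leaving four.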
In the present invention, the set of images required for rendering a large, complex real-world environment can be captured very quickly, usually in a matter of minutes. Further, after initialization of the system, all of the processing is performed automatically, and does not require user intervention.
According to an exemplary embodiment of the present invention, images of the environment are acquired by moving a video camera along several intersecting paths. The motion of the camera is restricted to a plane. The intersecting paths form an irregular grid consisting of a group of tiled image loops.
At run-time, a virtual user moves within any of these image loops, and the Plenoptic Stitching process of the invention generates an image representing the user's view of the environment. The image is generated by "stitching" together pixel data from images captured along the boundaries of the loop.
The user is able to move freely from one image loop to the next, thereby exploring the environment. By tiling an environment with relatively constant-sized image loops, an environment of any size and shape can be reconstructed using a memory footprint whose size remains approximately constant throughout processing.
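As an illustration of how a walkthrough application might decide which image loop currently surrounds the virtual user, the following minimal sketch treats each loop as a closed polygon in the capture plane and applies a standard ray-casting point-in-polygon test. This is not taken from the invention itself: the polygonal loop representation and the names point_in_loop and find_loop are hypothetical, introduced only for this example.

```python
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # (x, y) position in the capture plane


def point_in_loop(p: Point, loop: Sequence[Point]) -> bool:
    """Ray-casting test: is p inside the closed polygon `loop`?

    Assumption: a loop is approximated by an ordered list of 2D vertices.
    """
    x, y = p
    inside = False
    n = len(loop)
    for i in range(n):
        x1, y1 = loop[i]
        x2, y2 = loop[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            # x-coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def find_loop(p: Point, loops: Sequence[Sequence[Point]]) -> Optional[int]:
    """Return the index of the first loop containing p, or None."""
    for idx, loop in enumerate(loops):
        if point_in_loop(p, loop):
            return idx
    return None


if __name__ == "__main__":
    # Two adjacent square loops sharing the edge x = 2 (hypothetical layout).
    loops = [
        [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)],
        [(2.0, 0.0), (4.0, 0.0), (4.0, 2.0), (2.0, 2.0)],
    ]
    print(find_loop((1.0, 1.0), loops))
    print(find_loop((3.0, 1.0), loops))
```

Under this assumption, only the images along the boundary of the returned loop would need to be resident at any one time, which is one way the approximately constant memory footprint described above could be realized.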