In computer graphics, it is often desired to minimize the amount of geometry in a model of a scene or an object to enable efficient rendering of the model. Several effective approaches have been developed to add visual detail to low-resolution models during rendering, such as texture mapping and bump mapping; see Apodaca et al., “Advanced RenderMan,” Morgan Kaufmann, ISBN 1558606181, 2000.
However, there are times when low-resolution models are insufficient. For example, high-end production studios often require models with detailed explicit geometry for physical simulation, e.g., deformation and collision detection. In addition, these studios often employ sophisticated illumination that requires models with high-resolution geometry.
Displacement mapping can be applied to the low-resolution geometry of an underlying model to provide correct illumination. This is an operation that is usually performed dynamically during rendering and, therefore, precludes using the resultant model with high-resolution geometry for physical simulation. Finally, users, such as artists, designers, engineers and sculptors, may require models with high-resolution geometry in order to produce solid 3D models via 3D printing methods.
Many systems are known for direct modeling of the 3D geometry of scenes and objects. However, generating models with high-resolution geometry is a difficult and time-consuming task. It is often very hard to recreate the complexity and variety of geometric texture that occurs in nature.
High-resolution range scanners, such as Cyberware 3030, provide means for capturing existing geometry, but they are expensive and difficult to transport. In addition, their spatial resolution is limited. Hand-held range scanners are more portable, but they too are expensive for the casual user, and sacrifice both spatial and depth resolution for portability.
In contrast, digital cameras are portable, inexpensive, have a high spatial resolution, and are easy to use. In addition, 2D photograph editing systems such as Photoshop are dramatically simpler to use than 3D modeling systems. However, digital cameras do not provide explicit depth information.
Methods for generating the geometry for 3D models from 2D images have a significant connection to the field of computer vision. Many methods are known in the prior art for extracting shape from shading, shape from focus, and shape from stereo pairs. Szeliski, in “Determining Geometry from Images”, SIGGRAPH 1999 Course Notes #39, Image-Based Modeling, Rendering, and Lighting, 1999, presents a bibliography and an overview of the various approaches.
Prior work has primarily focused on developing automatic techniques for acquiring an accurate global shape description of objects or scenes. In contrast, it is desired here to capture the spirit of the geometry in a scene using interactive methods by capturing fine geometric detail from a 2D image. Then, a user actively involved in the process can modify and enhance a global shape description of objects or scenes. Thus, the goal of the present invention is quite different from the goal of methods in computer vision.
Although texture synthesis methods, such as described by Efros et al., “Image Quilting for Texture Synthesis and Transfer,” SIGGRAPH Proceedings, pp. 341–346, 2001, can be extended to generate synthetic range images, those techniques lack “directability.” Directability is a term often used in the animation industry for processes that provide precise control over every detail.
The basic prior art approach known for constructing 3D models from range data is shown in FIG. 1. A range scanner 110 acquires range data 102 of a scene or object 101. Hereinafter, the term “scene” 101 means a natural outdoor scene, an indoor scene, or a scene that contains one or more objects, or combinations thereof. Of particular interest are highly textured scenes, for example, a rocky surface, leaves, grass, and the like, and objects with uneven and complex surface structures. The range data 102 can be processed 120 to form range images 103 and range 3D surfaces 104. A method for reconstructing the geometry 130 is used to generate a 3D model 105 from the range images 103 and range surfaces 104.
There are many reconstruction methods in the prior art. A review of these methods is given by Curless, “From range scans to 3D models”, Computer Graphics, Volume 33, No. 4, 1999. Some methods first determine an implicit representation of the surface, usually in the form of a sampled distance field, and then reconstruct the 3D model as a 3D iso-surface of the implicit representation. Some methods are designed to be very general, e.g., they can accept range data in the form of an unorganized cloud of surface points. Other methods use range data that are available in the form of range images, where range measurements are acquired in a regularly sampled 2D grid.
There are several methods for reconstructing 3D models from range data that make use of distance fields. Some of these methods make the general assumption that data are available only as an unorganized set of surface points. Hoppe et al., in “Surface Reconstruction from Unorganized Points,” Proceedings SIGGRAPH'92, pp. 71–78, 1992, generate a regularly sampled signed distance volume by defining local tangent planes from neighborhoods of scanned surface points and computing signed distances to these planes. Marching Cubes, described by Lorensen et al., in “Marching Cubes: a High Resolution 3D Surface Reconstruction Algorithm,” Proceedings SIGGRAPH'87, pp. 163–169, 1987, is then used to generate a surface model from the volume representation.
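The signed-distance step of a tangent-plane approach can be illustrated with a minimal sketch. It is not the implementation of Hoppe et al.: it assumes that each plane is already given by a center point and an oriented unit normal (the cited method estimates both from point neighborhoods), and the function name and the brute-force nearest-neighbor search are hypothetical choices made for brevity.

```python
import math

def signed_distance_to_tangent_plane(query, centers, normals):
    """Signed distance from `query` to the tangent plane of the nearest sample.

    Assumption: `normals` are unit length and consistently oriented outward.
    """
    # Brute-force nearest plane center; a spatial index would replace this in practice.
    nearest = min(range(len(centers)), key=lambda i: math.dist(query, centers[i]))
    center, normal = centers[nearest], normals[nearest]
    # Project the offset from the plane center onto the normal:
    # positive values lie on the outward side, negative values on the inward side.
    return sum((q - c) * n for q, c, n in zip(query, center, normal))

# Six samples on a unit sphere; for a sphere, the outward normal equals the position.
centers = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
           (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
normals = centers

outside = signed_distance_to_tangent_plane((2.0, 0.0, 0.0), centers, normals)
inside = signed_distance_to_tangent_plane((0.5, 0.0, 0.0), centers, normals)
```

Evaluating such a function at every vertex of a regular grid would yield the kind of sampled signed distance volume from which an iso-surface can then be extracted.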
Bajaj et al. in “Automatic Reconstruction of Surfaces and Scalar Fields from 3D Scans,” Proceedings SIGGRAPH'95, pp. 109–118, 1995, and Boissonnat et al., in “Smooth Surface Reconstruction via Natural Neighbor Interpolation of Distance Functions,” in Proceedings of the 16th Annual ACM Symposium on Computational Geometry, pp. 223–232, 2000, build Voronoi diagrams from scanned surface points. Then, they use the Voronoi diagram to efficiently evaluate closest distances to the surface and to define surface patches for the model.
Carr et al., in “Reconstruction and Representation of 3D Objects with Radial Basis Functions”, Proceedings SIGGRAPH 2001, pp. 67–76, 2001, fit a radial basis function to a set of on-surface and off-surface points derived from scanned surface points. The on-surface points are assigned a value of zero, while off-surface points constructed from the on-surface points are assigned a value equal to their signed distance from the surface.
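The construction of the interpolation constraints can be sketched as follows. This is only an illustration under stated assumptions: unit surface normals are assumed to be available, each off-surface point is placed a fixed `offset` along the normal on either side of the surface, and the function name is hypothetical. Fitting the radial basis function to these constraints is not shown.

```python
def rbf_constraints(points, normals, offset):
    """Return (position, value) pairs for an implicit-surface fit.

    Assumptions: `normals` are unit length and point outward; `offset` is small
    enough that the displaced points do not cross another part of the surface.
    """
    constraints = []
    for p, n in zip(points, normals):
        # On-surface point: the implicit function should be zero here.
        constraints.append((tuple(p), 0.0))
        # Off-surface points: displaced along the normal, valued by signed distance.
        outside = tuple(pi + offset * ni for pi, ni in zip(p, n))
        inside = tuple(pi - offset * ni for pi, ni in zip(p, n))
        constraints.append((outside, offset))
        constraints.append((inside, -offset))
    return constraints

constraints = rbf_constraints([(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], 0.1)
```

Without the off-surface constraints, the trivial function that is zero everywhere would satisfy all of the on-surface constraints, which is why the nonzero values are needed.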
All of these methods are quite general because they can be applied to a set of unorganized points. However, when range data are available in the form of range images, it is desired to determine a distance field directly from the range images.
Curless et al., in “A Volumetric Method for Building Complex Models from Range Images,” Proceedings SIGGRAPH'96, pp. 303–312, 1996, Hilton et al., in “Reliable Surface Reconstruction from Multiple Range Images,” Proceedings of the 4th Eurographics Conference on Computer Vision, pp. 117–126, 1996, and Wheeler et al., in “Consensus surfaces for Modeling 3D Objects from Multiple Range Images,” Proceedings of the International Conference of Computer Vision, 1998, present methods that generate a volumetric representation of the distance field from range surfaces, which are generated by connecting nearest neighbors in the range image with triangular facets.
Those methods avoid triangulation over possible occlusions in the model surface by not connecting neighbors with significant differences in range values. That approach is conservative and avoids building surfaces over unobserved regions. However, that approach can lead to holes in the model that must be addressed separately, as described by Curless et al. Those three methods all use a weighted averaging scheme to combine distance values from multiple scans. As in the method of Hoppe et al., those methods use Marching Cubes to generate a triangle model from the volume representation.
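The conservative triangulation described above can be sketched as follows. The sketch assumes a range image stored as a list of rows of range values and a single threshold `max_jump` on the range difference within a facet; both the threshold test and the names are illustrative assumptions rather than the exact criteria of the cited methods.

```python
def triangulate_range_image(depth, max_jump):
    """Connect neighboring range samples with triangles, skipping discontinuities.

    Each triangle is a tuple of three (row, col) pixel indices. A triangle is
    dropped when its range values differ by more than `max_jump`, which is
    taken here as a sign of a possible occlusion boundary.
    """
    rows, cols = len(depth), len(depth[0])

    def is_continuous(tri):
        values = [depth[r][c] for r, c in tri]
        return max(values) - min(values) <= max_jump

    triangles = []
    for r in range(rows - 1):
        for c in range(cols - 1):
            p00, p01 = (r, c), (r, c + 1)
            p10, p11 = (r + 1, c), (r + 1, c + 1)
            # Split each grid cell into two triangles along one diagonal.
            for tri in ((p00, p10, p01), (p01, p10, p11)):
                if is_continuous(tri):
                    triangles.append(tri)
    return triangles

# A 2x3 range image with a depth jump between the second and third columns.
depth = [[1.0, 1.0, 5.0],
         [1.0, 1.0, 5.0]]
facets = triangulate_range_image(depth, max_jump=0.5)
```

In this example only the left cell is triangulated; the cell spanning the depth jump is left open, which is the source of the holes mentioned above.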
Curless et al. use line-of-sight distances and only compute distances in a limited shell surrounding the surface. The distance volume is run-length-encoded to reduce storage and processing times. Hilton et al. determine Euclidean distances from range surfaces in a limited shell surrounding the surface, and store the results in a regularly sampled volume. Wheeler et al. also determine Euclidean distances from range surfaces, but limit distance evaluations to the vertices of a three-color octree.
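As a rough illustration of line-of-sight distances restricted to a shell and combined by weighted averaging, consider a single voxel observed by several scans. The sketch assumes an orthographic scanner, so that the line-of-sight distance reduces to a difference of depths along the viewing ray, and it takes distances to be positive when the voxel lies between the sensor and the surface. The function names, the sign convention, and the per-scan weights are assumptions made for the example.

```python
def line_of_sight_distance(surface_depth, voxel_depth, shell):
    """Signed distance along the viewing ray, or None outside the shell."""
    d = surface_depth - voxel_depth  # positive in front of the observed surface
    return d if abs(d) <= shell else None

def fuse_scans(surface_depths, weights, voxel_depth, shell):
    """Weighted average of the in-shell distances seen by several scans."""
    numerator = 0.0
    denominator = 0.0
    for depth, weight in zip(surface_depths, weights):
        d = line_of_sight_distance(depth, voxel_depth, shell)
        if d is not None:  # scans that see the voxel outside the shell are ignored
            numerator += weight * d
            denominator += weight
    return numerator / denominator if denominator > 0.0 else None

# Three scans observe the surface at depths 1.0, 1.2, and 9.0 along the same ray.
fused = fuse_scans([1.0, 1.2, 9.0], [1.0, 3.0, 1.0], voxel_depth=1.0, shell=0.5)
```

Voxels for which no scan falls inside the shell carry no value, which is what makes compact encodings of the volume effective.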
Whitaker, in “A Level-Set Approach to 3D Reconstruction from Range Data,” the International Journal of Computer Vision, pp. 203–231, 1998, determines line-of-sight distances directly from range images and combines distance values from multiple scans using a windowed, weighted average. Then, he uses level set methods to reduce scanner noise by evolving a surface subject to forces that attract the surface to the zero-valued iso-surface of the distance field, and satisfy a shape prior such as surface smoothness. Zhao et al., in “Fast Surface Reconstruction using the Level Set Method,” Proceedings 1st IEEE Workshop on Variational and Level Set Methods, pp. 194–202, 1998, use a method similar to Whitaker, but initialize the distance field used to attract the evolving surface from a set of unorganized points.
Recently, Perry et al., in “Kizamu: A System for Sculpting Digital Characters,” Proceedings SIGGRAPH 2001, pp. 47–56, 2001, and Sagawa et al., in “Robust and Adaptive Integration of Multiple Range Images with Photometric Attributes,” Proceedings IEEE Computer Society Conference on Computer Vision and Pattern Recognition, volume 2, pp. 172–179, 2001, describe methods similar to the method of Wheeler et al., but use adaptively sampled distance fields (ADFs) instead of a three-color octree to reduce the number of distance evaluations required.
ADFs adaptively sample a distance field of a scene or object and store the sample values in a spatial hierarchy, e.g., an octree, for fast processing; see Frisken et al., “Adaptively sampled distance fields: a general representation of shape for computer graphics,” Proceedings SIGGRAPH 2000, pp. 249–254, 2000. ADFs are memory efficient and detail directed, thus permitting very complex objects to be manipulated on desktop machines. In addition, ADFs are a volumetric representation that can be used to build upon volumetric approaches for reconstructing geometry from range data.
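The idea of detail-directed sampling can be sketched in two dimensions with a quadtree in place of an octree. The subdivision test used here, comparing the interpolated corner values with the true distance at the cell center, is an assumed simplification for illustration, and the function names, tolerance, and example field are hypothetical rather than taken from the cited work.

```python
import math

def circle_distance(x, y, radius=0.5):
    """Signed distance to a circle centered at the origin (negative inside)."""
    return math.hypot(x, y) - radius

def build_adf(dist, x0, y0, size, tol, max_depth, depth=0):
    """Recursively sample `dist` over a square cell, refining only where needed."""
    corners = [dist(x0, y0), dist(x0 + size, y0),
               dist(x0, y0 + size), dist(x0 + size, y0 + size)]
    half = size / 2.0
    # Bilinear interpolation at the cell center is the mean of the four corners.
    predicted = sum(corners) / 4.0
    error = abs(predicted - dist(x0 + half, y0 + half))
    node = {"origin": (x0, y0), "size": size, "corners": corners, "children": None}
    if error > tol and depth < max_depth:
        # The corner samples do not predict the field well enough: subdivide.
        node["children"] = [
            build_adf(dist, x0 + dx * half, y0 + dy * half, half,
                      tol, max_depth, depth + 1)
            for dy in (0, 1) for dx in (0, 1)
        ]
    return node

def count_leaves(node):
    """Number of leaf cells in the hierarchy."""
    if node["children"] is None:
        return 1
    return sum(count_leaves(child) for child in node["children"])

adf = build_adf(circle_distance, -1.0, -1.0, 2.0, tol=0.01, max_depth=5)
leaves = count_leaves(adf)
uniform_cells = 4 ** 5  # cells in a regular grid at the same maximum resolution
```

Cells in which the field is well approximated by interpolation remain coarse, so the hierarchy holds fewer cells than a regular grid at the finest level.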
ADFs are described in detail in U.S. patent application Ser. No. 09/370,091, “Detail directed hierarchical distance fields,” filed by Frisken et al. on Aug. 6, 1999, incorporated herein by reference. ADF models generated using the present invention can be incorporated into an existing ADF sculpting system that provides an intuitive interface for manually editing the generated ADF, see U.S. patent application Ser. No. 09/810,261, “System and method for sculpting digital models,” filed by Perry et al., on Mar. 16, 2001, incorporated herein by reference, and for creating level-of-detail (LOD) triangle models from the ADF, see U.S. patent application Ser. No. 09/810,830, “Conversion of adaptively sampled distance fields to triangles,” filed by Frisken et al., on Mar. 16, 2001, incorporated herein by reference.
There also exist several methods for generating 3D models from height fields or elevation maps that are related to the reconstruction of geometry from a single range image; see H. Hoppe, “Smooth View-Dependent Level-of-Detail Control and its Application to Terrain Rendering,” IEEE Visualization, pp. 35–42, October, 1998. Those methods are focused on providing efficient rendering and effective visualization, but not on subsequent editing, as desired here.
Therefore, it is desired to combine the advantages of inexpensive digital cameras and 2D editing systems to provide a simple, fast, and cost-effective method for generating the geometry and detailed texture for 3D models directly from 2D images.