There is continued interest in the efficient and speedy creation by computer of accurate 3D models of objects and other 3D surfaces (hereinafter “objects”). Computer-generated 3D models of objects have useful application in many fields, such as digital imaging, computer animation, special effects in film, prototype imaging in marketing and product development, topography, reconstructive and plastic surgery, dentistry, architecture, industrial design, anthropology, milling and object production, biology and internal medicine.
In addition, with the explosion of usage of the Internet and the World Wide Web, there is a real demand for computer-generated 3D models in the display and marketing of products on Web sites. For such Web sites, 3D object modeling systems facilitate the construction of complex, interactive and animated displays, such as those created by simulators and other user-choice-based programs. Although 2D image generation systems currently predominate in the display and manipulation of graphic images on the World Wide Web, the use of 3D object models is perceived by some as a more efficient way to present graphic information for interactive graphics, animated special effects and other applications. The use of such 3D object modeling systems is growing in Web-based and other applications.
A 3D object modeling system typically constructs an object model from 3D spatial data and then associates color or other data (called “texture data”) with the specific areas of the model (such texture data is used to render displays or images of the object). Spatial data includes the 3D X, Y, Z coordinates that describe the physical dimensions, contours and features of the object. Existing systems that collect 3D spatial and texture data include both scanning systems and photographic “silhouette” capturing systems. A scanning system uses a light source (such as a laser) to scan a real-world object and a data registration device (such as a video camera) to collect images of the scanning light as it reflects from the object.
A silhouette capturing system typically places an object against a background and then, using a data registration device (such as a digital camera), captures images of the object from different view points. The silhouette capturing system later processes each captured image to obtain a set of “silhouettes” which describe the contours of the object. Each digitized image from the camera contains a set of pixel assignments which describe the captured image. The silhouette capturing system attempts to identify those pixels within each captured image which make up the contours of the object.
For example, a silhouette capturing system typically uses those pixels within each image which form a boundary or outside edge (the “boundary points”) for creating a silhouette contour of the object. The boundary point-based silhouette contours made from one image can be combined with the boundary point-based silhouette contours found in other images to determine a set of 3D X, Y, Z coordinates which describe the spatial dimensions of the object's surface. One typical approach begins with a cube of, for example, 1000×1000×1000 pixels. Using this approach, the shape of the object is “carved” from the cube using silhouette outlines that are obtained from each silhouette image. Silhouette capturing systems can gather enough raw data from the silhouette contours to generate several hundred thousand 3D X, Y, Z coordinates for a full wraparound view of an object.
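The volumetric "carving" approach described above can be illustrated with a minimal sketch. This is not the method of any particular system: the grid size, function names, and projection interface below are illustrative assumptions, and a real system would use calibrated camera projections and a much finer grid.

```python
import numpy as np

def carve(silhouettes, projections, n=64):
    """Carve an object's shape from a solid voxel cube using silhouettes.

    silhouettes: list of 2D boolean arrays (True where the object appears).
    projections: list of functions mapping (x, y, z) arrays of world
                 coordinates to integer (row, col) image coordinates.
    Returns an n x n x n boolean occupancy grid.
    """
    # Start with a fully solid cube of voxels spanning [-1, 1]^3.
    axis = np.linspace(-1.0, 1.0, n)
    xs, ys, zs = np.meshgrid(axis, axis, axis, indexing="ij")
    solid = np.ones((n, n, n), dtype=bool)

    for mask, project in zip(silhouettes, projections):
        rows, cols = project(xs, ys, zs)
        h, w = mask.shape
        # A voxel that projects outside the image, or outside the
        # silhouette in any view, cannot belong to the object.
        inside = (rows >= 0) & (rows < h) & (cols >= 0) & (cols < w)
        keep = np.zeros_like(solid)
        keep[inside] = mask[rows[inside], cols[inside]]
        solid &= keep
    return solid
```

Because every view can only remove material, the result is the largest shape consistent with all silhouettes, and its resolution is capped by the chosen grid size n.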
A typical 3D object modeling system uses the generated 3D X, Y, Z coordinates to create a “wire-frame” model that describes the surface of the object and represents it as a series of interconnected planar shapes (sometimes called “geometric primitives” or “faces”), such as a mesh of triangles, quadrangles or more complex polygons. Typical 3D object modeling systems use the 3D X, Y, Z coordinates either indirectly, in gridded mesh models, or directly, in irregular mesh models.
Gridded mesh models superimpose a grid structure as the basic framework for the model surface. The computer connects the grid points to form even-sized geometric shapes that fit within the overall grid structure. While gridded models provide regular, predictable structures, they are not well-suited for mesh constructions based on an irregular set of data points, such as those generated through laser scanning or silhouette capture. The need to interpolate an irregular set of data points into a regular grid structure increases computation time and decreases the overall accuracy of the model.
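The interpolation burden that gridded models impose on irregular data can be seen in a simplified height-field sketch. The nearest-cell assignment and function name below are illustrative assumptions, not a description of any particular gridding algorithm; the point is that scattered samples must be forced into cells, losing precision and leaving gaps.

```python
import numpy as np

def grid_heights(points, n=32):
    """Snap irregular (x, y, z) samples onto a regular n x n height grid.

    Each scattered point is assigned to its nearest grid cell; a cell
    stores the mean height of the points that fall in it. Cells with no
    sample stay NaN, which is one reason irregular data fits a grid poorly.
    """
    pts = np.asarray(points, dtype=float)
    # Map x, y into [0, n-1] cell indices over the data's bounding box.
    lo = pts[:, :2].min(axis=0)
    hi = pts[:, :2].max(axis=0)
    span = np.maximum(hi - lo, 1e-12)
    idx = ((pts[:, :2] - lo) / span * (n - 1)).round().astype(int)
    total = np.zeros((n, n))
    count = np.zeros((n, n))
    np.add.at(total, (idx[:, 0], idx[:, 1]), pts[:, 2])
    np.add.at(count, (idx[:, 0], idx[:, 1]), 1)
    return np.where(count > 0, total / np.maximum(count, 1), np.nan)
```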
Hence, some 3D modeling systems for real-world objects create an irregular mesh model, such as an irregular triangulated mesh, to represent the real-world object. An irregular mesh model imposes no grid structure upon the model. Instead, the 3D X, Y, Z data points are used directly as the vertices in each planar shape or “face” of the mesh.
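The direct use of measured points as mesh vertices can be sketched as a minimal data structure; the names and the example tetrahedron below are illustrative, not taken from any particular modeling system.

```python
# A minimal irregular triangulated mesh: the captured 3D X, Y, Z points
# are used directly as vertices, and each face stores three vertex
# indices rather than interpolated grid positions.

vertices = [           # raw 3D X, Y, Z coordinates (illustrative values)
    (0.0, 0.0, 0.0),
    (1.0, 0.0, 0.0),
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
faces = [              # each triangular face references vertices by index
    (0, 1, 2),
    (0, 1, 3),
    (0, 2, 3),
    (1, 2, 3),
]

def face_points(face):
    """Return the actual coordinates of one triangular face."""
    return [vertices[i] for i in face]
```

Because the faces only index into the vertex list, no resolution is lost to a grid: every measured coordinate appears in the model exactly as captured.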
In addition to using spatial data, 3D object modeling systems also include texture data as a part of the object model. Texture data is color and pattern information that replicates an object's surface features. Some 3D object modeling systems maintain texture data separately from the "wire-frame" mesh data and apply the texture data to the mesh only when rendering the surface features. Such object modeling systems typically use two distinct and separate processes: first, in a mesh building phase, the system constructs a "wire frame" mesh to represent the object's spatial structure using only 3D X, Y, Z values (and other related spatial information); and second, during a "texture map" building phase, the system assigns texture data to each of the faces of the mesh model so that when the model is later rendered, the displaying device can overlay texture data on the geometric faces of the model. The rough face of a brick, the smooth and reflective surface of a mirror and the details of a product label can all be overlaid onto a mesh wire-frame model using texture mapping principles.
For models of real-world objects, texture data typically comes from 2D photographic images. The 3D spatial coordinate values of a mesh model face can be related and linked to specific points (i.e. two-dimensional x, y pixel locations) in the digitized versions of the collected photo images. Commercially available digital cameras output image frames, each of which includes a 2D matrix of pixels (e.g. 640×480 pixels in dimension). Each pixel in the matrix has, for example, a three-byte (24 bit) red, green and blue (R, G, B) color assignment. Such a 3D object modeling system will then store each photographic image for later use (such as in TIFF format). The 3D object modeling system links each mesh face in the generated 3D mesh model to a specific area in a selected image that contains the appropriate texture data. When showing a view of the 3D model, a displaying device clips relevant areas of the appropriate photo image file and overlays the clip on the associated mesh face.
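The linkage between a mesh face and a region of a stored photo image can be sketched as follows. The helper name and the axis-aligned clipping rule are illustrative assumptions; real renderers map face vertices to exact pixel positions and sample within the projected triangle rather than its bounding box.

```python
import numpy as np

def clip_texture(image, uv_coords):
    """Clip the bounding box of a face's linked pixel locations.

    image: H x W x 3 array of 8-bit R, G, B values, as produced by a
           digital camera frame (e.g., 640 x 480 pixels).
    uv_coords: list of (x, y) pixel positions, one per face vertex.
    Returns the sub-image a displaying device would overlay on the face.
    """
    xs = [int(x) for x, _ in uv_coords]
    ys = [int(y) for _, y in uv_coords]
    # Rows index y, columns index x; the +1 keeps the far edge inclusive.
    return image[min(ys):max(ys) + 1, min(xs):max(xs) + 1]
```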
The current effort in computer graphics to incorporate more images of real-life objects into applications has fostered a search for improvements in collecting and processing 3D spatial data and texture data. As scanning systems typically require the use of specialized lighting equipment (such as a laser), some have perceived the systems based on silhouette capture as being more convenient to use and more readily adaptable to the current practices of the model designers and other professionals who currently produce and use 3D object models. Thus, there is interest in improving those 3D modeling systems which use silhouette capture as their means of acquiring 3D spatial data.
In general, some currently available 3D modeling systems which use silhouette capture place an object in a specially colored environment, such as an all green background, and then collect a series of images of the object's shape and texture by either moving the camera around the object or moving the object (e.g., in a 360-degree circular path) in front of a stationary camera. In each image, the system attempts to determine those pixels which form the boundary contours of the object's silhouette and to create from the multiple silhouette images a 3D mesh model of the object. Such systems capture all needed data (both spatial mesh construction data and texture data) in a single series of images. The brightly-colored background used in this silhouette capturing approach enables the system to differentiate (with some accuracy) those pixels which describe the boundaries of the object (the pixels which are used to generate the 3D X, Y, Z coordinates of the 3D model). In addition, the distinctive background also allows such a system to extract (with some accuracy) information about the texture data (used in later processing) from the same photographic image.
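The background-differentiation step can be sketched as a simple chroma-key test. The threshold value and function name below are illustrative assumptions, and real systems use more robust color models; the sketch only shows the principle of separating object pixels from an all green backdrop.

```python
import numpy as np

def silhouette_mask(image, green_thresh=60):
    """Separate an object from an all green backdrop by chroma keying.

    image: H x W x 3 uint8 R, G, B array. A pixel is treated as
    background when its green channel exceeds both red and blue by more
    than green_thresh (an illustrative value, not from any real system).
    Returns a boolean mask that is True on object pixels.
    """
    img = image.astype(int)
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    background = (g - np.maximum(r, b)) > green_thresh
    return ~background
```

Note that this simple test fails exactly where the prior art struggles: an object patch that is (or reflects) the background green is misclassified as background, which motivates the error discussion that follows.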
However, the use of brightly or specifically colored backgrounds (such as an all green background) also introduces a level of error into the modeling procedure. When an object is placed in front of such a distinctive background, the background color radiates onto the object, and the object will typically reflect some of that color. Thus, the 3D model created from the processing of such object modeling systems will sometimes be inaccurate, because those models will have some of the background color residue included in their texture. For example, when a green background color or light is used, some of the greenish hue can appear on the model's texture.
Further, in such an instance the reflection of such a green light color against the object can also hamper the system's efforts to collect the spatial data concerning the object. Such systems will sometimes have additional difficulty determining if a point within an image belongs to the object or the background—especially if the object has (or is radiating) a patch of color that is the same as the background. It would be an improvement in the art if a system could be developed that would improve the accuracy of the capturing of spatial and texture data and eliminate the need for the use of the brightly colored backgrounds.
In addition, and as noted above, some currently available silhouette capture-based 3D modeling systems use a volumetric approach to calculate the spatial 3D X, Y, Z points of the object model. The use of a volumetric approach can also cause difficulties in processing and inaccuracies in the final object model. As stated above, one typical volumetric approach begins with a cube of pixels (e.g., 1000×1000×1000 pixels), and the shape of the object is "carved" from the cube using silhouette outlines. This approach has the limitation that it must start with a fixed-size grid, and the use of the grid limits the resolution of the final 3D object model to the resolution of the grid. It would be an improvement in the art if a system and method could be devised to determine the shape of the object without the use of the static cube structure; if, for example, the object could be determined analytically by using the silhouettes themselves.
Such an improvement would be found if the generated 3D X, Y, Z coordinates of the silhouette contours could be used directly to construct the 3D model. Such direct use of the 3D X, Y, Z coordinates would improve the accuracy of the model. One of the challenges in creating a silhouette capture-based 3D modeling system that builds a 3D model with actual (non-gridded) 3D X, Y, Z coordinates is to perform this more accurate construction of the model with efficiency and speed of operation.