Commercial products for three-dimensional digitization are available for a variety of purposes. Examples include medical applications, entertainment industry applications (e.g., three-dimensional gaming, filming, and animation), fashion design (e.g., three-dimensional garment design, apparel fitting, and plastic surgery), archaeological restoration and/or preservation, forensic applications (e.g., crime scene investigation), and online commodity exhibition (e.g., online museums and online stores).
There are, in general, two categories of three-dimensional digitizing techniques: active sensing and passive sensing. Techniques in the first category, active sensing, usually emit energy (e.g., light and/or sound) toward the scene to be measured or observed, receive the reflected energy or observe the reflected pattern, and use the physical laws of optics or acoustics to derive the distance from the sensor to the object in the scene. Active sensing usually needs a complex and sophisticated optical design for the lighting components, and it usually needs controlled ambient lighting to assist in the three-dimensional capture. Sensors in this category are usually limited to sensing static scenes or objects because the scanning procedure takes a certain amount of time, typically because certain components of the scanning system must be physically moved (e.g., laser-emitting components must be moved to scan different lines of the object). Laser scanning, moire fringe contouring, time of flight, and structured lighting are among the active three-dimensional sensing techniques.
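As an illustration of how an active sensor can derive distance from reflected energy, the following minimal Python sketch applies the round-trip relation used in time-of-flight measurement: the emitted energy travels to the surface and back, so the distance is half the propagation speed multiplied by the elapsed time. The function name, the constants, and the choice of units are assumptions made for this illustration and are not taken from any particular product.

```python
# Illustrative sketch only: names and constants are assumptions for this example.
SPEED_OF_LIGHT_M_S = 299_792_458.0  # speed of light in vacuum, metres per second
SPEED_OF_SOUND_M_S = 343.0          # approximate speed of sound in dry air at 20 C (assumed)


def round_trip_distance(round_trip_time_s, propagation_speed_m_s):
    """Return the sensor-to-surface distance in metres.

    The emitted energy travels to the surface and back, so the one-way
    distance is half of the total path covered in the measured time.
    """
    if round_trip_time_s < 0:
        raise ValueError("round-trip time must be non-negative")
    return propagation_speed_m_s * round_trip_time_s / 2.0


if __name__ == "__main__":
    # An optical pulse that returns after 10 nanoseconds.
    print(round_trip_distance(10e-9, SPEED_OF_LIGHT_M_S))
    # An acoustic pulse that returns after 2 seconds.
    print(round_trip_distance(2.0, SPEED_OF_SOUND_M_S))
```

The sketch covers only the distance computation for a single measurement; it does not model the scanning motion or the controlled lighting discussed above.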
The techniques in the second category, passive sensing, in contrast usually do not emit energy toward the scene. Instead, these techniques capture signals that are already available in the scene, such as intensity and/or color, and, by analyzing these signals together with sensor configuration information, obtain three-dimensional information about the scene. Stereovision (two or more cameras) is a typical example of passive three-dimensional sensing.
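To illustrate how captured signals and sensor configuration information can be combined, the following minimal Python sketch computes depth from disparity for the simplest stereovision arrangement. It assumes two identical pinhole cameras with parallel optical axes and rectified images, in which case depth equals focal length times baseline divided by disparity. The function and parameter names are hypothetical, and practical systems generally require calibration and rectification steps that are not shown.

```python
# Illustrative sketch only: assumes a rectified, parallel-axis pinhole stereo pair.
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Return the depth in metres of a point seen by both cameras.

    disparity_px    -- horizontal shift of the point between the two views, in pixels
    focal_length_px -- camera focal length expressed in pixels (sensor configuration)
    baseline_m      -- distance between the two camera centres, in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point at finite depth")
    return focal_length_px * baseline_m / disparity_px


if __name__ == "__main__":
    # A point shifted by 50 pixels, seen with a 1000-pixel focal length and a 10 cm baseline.
    print(depth_from_disparity(50.0, 1000.0, 0.1))
```

Because depth is inversely proportional to disparity in this arrangement, halving the disparity doubles the computed depth, which is why distant points are measured less precisely than near ones.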
Passive sensing usually does not need a complex optical design. For example, a stereovision system usually takes a snapshot of the scene/object and recovers the three-dimensional information with simple devices. Some systems also integrate additional cameras to capture both three-dimensional information and color texture information from the scene/object. Systems with sufficient processing speed may also handle dynamic scenes. To ensure that the stereo cue has sufficient features for matching the two views, stereovision-based systems usually need to introduce additional features onto the scene/object. Projectors (e.g., a slide projector or an LCD projector) are often used to project patterns carrying such features onto the surface. In such systems, the pattern is switched on and off in order to capture both (1) the image with the superimposed features and (2) the color texture image of the scene/object without the superimposed features. This generally requires a mechanism for turning the pattern on and off. In addition, when the object of interest is a human being, projecting patterns onto the face may cause discomfort to the eyes.
Known stereo systems establish a correspondence between the two stereo views. In general, there are two main types of methods for computing the correspondence, or matching. The first is the feature-based method, which usually generates matches for those positions in the images that carry abundant information about the scene, such as corners, edges, and line segments. The second is the area-based method, which matches the two views based on pixel similarity in local image regions. The feature-based method uses surface texture feature information and generates matches for only a limited number of pixels. The area-based method is typically more computationally expensive, but is typically able to generate dense matches. For three-dimensional digitizing, the higher the resolution at which the three-dimensional surface is sampled, the better the surface is usually captured. The feature-based stereo matching method typically does not provide enough matched points for this purpose. The area-based stereo matching method can typically generate a sufficient number of three-dimensional samples on the surface; however, this method may have a long computation time, especially for high-resolution capture.
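The area-based approach described above can be sketched in a few lines of Python. The example below is a minimal illustration, not the method of any particular system: it assumes rectified grayscale images stored as lists of rows, uses the sum of absolute differences (SAD) as the pixel-similarity measure, and searches along the same row of the other view. The function names, the window size, and the disparity search range are hypothetical choices made for this sketch.

```python
# Illustrative sketch only: SAD block matching on rectified grayscale images.
def window_cost(left, right, row, col, disparity, half_window):
    """Sum of absolute differences between two local windows.

    The left window is centred at (row, col); the right window is centred
    at (row, col - disparity), i.e. shifted along the same image row.
    """
    total = 0
    for dr in range(-half_window, half_window + 1):
        for dc in range(-half_window, half_window + 1):
            left_value = left[row + dr][col + dc]
            right_value = right[row + dr][col + dc - disparity]
            total += abs(left_value - right_value)
    return total


def match_pixel(left, right, row, col, max_disparity, half_window=1):
    """Return the disparity with the lowest window cost for one left pixel."""
    best_disparity, best_cost = 0, None
    for disparity in range(max_disparity + 1):
        if col - disparity - half_window < 0:
            break  # the right window would fall outside the image
        cost = window_cost(left, right, row, col, disparity, half_window)
        if best_cost is None or cost < best_cost:
            best_disparity, best_cost = disparity, cost
    return best_disparity


def dense_disparity(left, right, max_disparity, half_window=1):
    """Compute a disparity for every pixel whose window fits in the image."""
    height, width = len(left), len(left[0])
    result = [[0] * width for _ in range(height)]
    for row in range(half_window, height - half_window):
        for col in range(half_window, width - half_window):
            result[row][col] = match_pixel(
                left, right, row, col, max_disparity, half_window
            )
    return result
```

The nested loops make the cost visible: the work grows with the number of pixels, the number of candidate disparities, and the window area, which is consistent with the long computation times noted above for high-resolution capture.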