This invention relates to the generation of three-dimensional (3-D) models from multiple two-dimensional (2-D) images of a scene. More specifically, it can be described as an interactive 3-D reconstruction method in which a reconstruction engine receives model constraints supplied manually by a user.
Image-based 3-D reconstruction of real world objects can be considered as a reverse engineering problem, in which the 3-D geometry of the object (or the scene) has to be found using its 2-D pictures. It forms the basic research problem in the technical fields of computer vision and photogrammetry. A person of ordinary skill in the art would appreciate the problems of this technical field as described below.
Constructing geometric models of real-world objects or places in a computing environment has many fruitful application areas such as production of 3-D animations in the entertainment business (e.g. computer applications, games, advertisements, and movies), making computer-aided design (CAD) and computer-aided manufacturing (CAM) models for production lines, and restoration of buildings. Considerable effort has been invested in the design and development of 3-D modeling software for the fast construction of accurate models.
One approach to modeling the real world is the use of laser-scanner-based and image-based reconstruction systems, which are still being developed through intensive research in computer vision. Laser-scanner-based systems suffer especially from high production costs, which hinder their widespread use. Image-based alternatives such as the PhotoModeler Scanner (by EOS Systems Inc., Vancouver, BC, Canada) suffer from a lack of robustness in the highly complex problem of image matching. Moreover, both approaches are incapable of reconstructing semitransparent and specular surfaces, making them unsuitable for a wide range of operational environments. Due to the deficiencies of fully automatic scanner-like products, inexpensive semiautomatic interactive modeling systems have emerged in the market. These systems rely heavily on human-supplied constraints to alleviate or completely avoid the complexity of the image matching problem.
Given multiple images of a scene, image-based interactive 3-D modeling systems collect some matching point features and angular relations to estimate parameters such as the 3-D location, orientation, focal length, and radial distortion of the cameras that took the images. This process is usually referred to as camera calibration. Apart from calibration data, the user supplies the model to be reconstructed in the form of 2-D drawings, overlaid on the image, of the projections of the 3-D primitives making up the model. The projection refers to a perspective projection transformation based on a pinhole camera model. In practice there exists a nonlinear deviation from the pinhole model due to optical lens distortion, often represented by a radial transformation on the 2-D image plane. Face boundaries, lines, curves, and vertices of predetermined solid shapes are drawn on the images by marking the corners or tracing the edges in the images. When projections of primitives on multiple cameras are given, or a single projection together with some angular constraints such as perpendicularity or parallelism, it becomes possible to reconstruct the model, which is defined to be the 3-D geometry that gives the minimum projection error when projected on the input images. A projection error is the distance on the image plane between the projection of the primitive and the corresponding feature supplied by the user's drawings.
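The projection error described above can be illustrated with a minimal sketch. It is not taken from any of the systems discussed; it assumes a camera placed at the origin and looking along the positive z axis (so camera location and orientation are omitted), a single radial distortion coefficient, and hypothetical names such as project_point, projection_error, focal_length, and k1.

```python
import math


def project_point(point_3d, focal_length, k1=0.0):
    """Project a 3-D point given in camera coordinates onto the image plane.

    Assumption: pinhole camera at the origin looking along +z, with an
    illustrative one-coefficient radial distortion model.
    """
    x, y, z = point_3d
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    # Ideal pinhole projection in normalized image coordinates.
    xn, yn = x / z, y / z
    # Radial deviation from the pinhole model: scale by (1 + k1 * r^2).
    r2 = xn * xn + yn * yn
    scale = 1.0 + k1 * r2
    return (focal_length * scale * xn, focal_length * scale * yn)


def projection_error(point_3d, drawn_2d, focal_length, k1=0.0):
    """Image-plane distance between a projected point and the user-drawn feature."""
    u, v = project_point(point_3d, focal_length, k1)
    return math.hypot(u - drawn_2d[0], v - drawn_2d[1])


def total_squared_error(points_3d, drawn_points, focal_length, k1=0.0):
    """Sum of squared projection errors, the quantity a reconstruction would minimize."""
    return sum(
        projection_error(p, d, focal_length, k1) ** 2
        for p, d in zip(points_3d, drawn_points)
    )
```

In this sketch, a reconstruction would search for the 3-D points (and, during calibration, the camera parameters) that make total_squared_error as small as possible over all input images. The choice of a squared-distance sum is an assumption made for illustration.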
Another interactive modeling system uses panoramic images instead of narrow-angle regular images to enable accurate estimation of camera rotation. The user specifies sets of parallel lines and the 3-D scene coordinates of some points for camera calibration. Similar to the previously mentioned systems, this system relies on the availability of such features in the scene, which is expected to be a man-made environment (e.g., buildings). The 3-D model is assumed to be made up of connected planes, whose vertices, orientations, or any combination of the two, relative to other planes, are specified by the user. Finally, a system of equations employing all geometric constraints is solved to reconstruct the scene.
Another approach avoids complex constraint-based reconstruction and simply lets the user draw directly in 3-D while tracking the similarity of the drawing's projection to the features in the image. First, the cameras are calibrated by matching and scaling vanishing points determined by user-specified sets of parallel lines, and then the user is allowed to draw overlaid on the images. The interface guides the user to follow directional constraints perpendicular or parallel to the ground plane or to some other parts of the model. When the world coordinate axes are aligned with the ground plane, buildings and similar rectangular structures made up of vertical and parallel planes can be constructed efficiently because of the directional constraints. The interface is not well suited, however, to the reconstruction of other types of objects, such as those made of planes at arbitrary angles to each other. The accuracy of calibration and the usefulness of the constraint directions provided by the interface play an important role in the ease of use of this type of interface. Inaccurate calibration and a lack of constraints that directly help decrease the projection error lead to a difficult-to-use interface in which accuracy is sacrificed to a great extent.
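As a minimal sketch of how a vanishing point can be obtained from a user-specified pair of parallel scene lines, the image lines can be intersected in homogeneous coordinates. This illustrates only the standard geometric step, not the calibration procedure of the approach described above; the function names and the tolerance eps are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous image line through two 2-D points p and q."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(segment_a, segment_b, eps=1e-12):
    """Intersect two image segments that depict parallel scene lines.

    Returns the 2-D vanishing point, or None when the segments are also
    parallel in the image, in which case the vanishing point is at infinity.
    """
    v = cross(line_through(*segment_a), line_through(*segment_b))
    if abs(v[2]) < eps:
        return None
    return (v[0] / v[2], v[1] / v[2])
```

With more than two user-drawn lines per direction, a least-squares intersection would typically be used instead of a single pairwise intersection; that extension is not shown here.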
A deficiency associated with the interactive reconstruction techniques mentioned so far is the presence of restrictive assumptions on the contents of the images and the shapes of the objects (or the scenes) to be reconstructed. These assumptions decrease the generality of the systems, where generality means the ability to reconstruct an arbitrary object (or scene). For example, an assumption that the scene is made up of planes that are oriented parallel or perpendicular to each other allows easy and fast reconstruction of buildings or similar structures, but reconstruction of shapes such as cylinders becomes cumbersome. New interaction methods must be devised to extend the set of objects that can be reconstructed, and it is usually not possible to extend the set in a principled way.
Therefore, there is a need for improved 3-D modeling techniques, especially a technique having a camera-based drawing constraint that enables the user to draw directly in 3-D while naturally maintaining very low projection error on the available images, without restrictive assumptions on the scene content. The technique can be applied to the reconstruction of all types of geometric shapes, including free-form surfaces, which cannot be reconstructed by previous interactive reconstruction methods.