The creation of photo-realistic and non-photo-realistic three-dimensional (3D) calibrated models of observed scenes and objects has been an active research topic for many years, and many commercial systems exist. Such 3D models can be used for visualization, virtual presence, operations planning and rehearsal, training and measurements. They are useful in many applications, including planetary rover exploration, autonomous vehicle guidance, navigation and operation, industrial automation and robotics, forensics, mining, geology, archaeology, real estate, virtual reality and computer games.
Existing systems use sensors and techniques such as rangefinders (scanning and non-scanning) and stereo and monocular camera images to obtain 3D data. A data set obtained from one sensor location does not show the complete surface of the object or environment, owing to the limited field of view, depth of field or resolution of the sensor and/or to limited visibility. It is therefore necessary to move the sensor to another location to acquire another 3D view.
Multiple 3D data sets obtained from different sensor positions may be registered together to form one complete model, either by using external position measuring systems or by selecting and matching features observed in multiple views. External position measuring systems such as 3D tracking devices, Global Positioning Systems, telemetry of manipulators or other positioning devices, and translation and orientation sensors are often used. The observed features may already exist in the scene or on the object, or may be placed there. The preferred case is when only existing features are used; however, in the prior art this is not as reliable and accurate as using artificial features (markers, beacons). Feature selection and matching of observed objects is often performed manually, which is labour intensive and inaccurate. Automatic feature selection and matching algorithms exist but are less accurate and reliable.
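The first registration approach described above, in which an external position measuring system supplies the sensor pose for each view, can be illustrated with a minimal sketch. It assumes, purely for illustration, that each pose is a rotation about the vertical axis plus a translation; the function names `pose_matrix`, `to_world` and `register_views` are hypothetical and do not come from any cited system.

```python
import math

def pose_matrix(yaw_deg, tx, ty, tz):
    """Build a 3x4 sensor-to-world transform [R | t].

    Assumption for this sketch: the sensor rotates only about the
    vertical (z) axis; a real tracker would report a full 3D rotation.
    """
    c = math.cos(math.radians(yaw_deg))
    s = math.sin(math.radians(yaw_deg))
    return [[c, -s, 0.0, tx],
            [s,  c, 0.0, ty],
            [0.0, 0.0, 1.0, tz]]

def to_world(points, pose):
    """Map (x, y, z) points from the sensor frame into the world frame."""
    return [tuple(r[0] * x + r[1] * y + r[2] * z + r[3] for r in pose)
            for (x, y, z) in points]

def register_views(views):
    """Merge several (points, pose) views into one world-frame point list."""
    merged = []
    for points, pose in views:
        merged.extend(to_world(points, pose))
    return merged

# Two views of the same physical point, taken from different sensor poses.
views = [
    ([(1.0, 0.0, 0.0)], pose_matrix(0.0, 0.0, 0.0, 0.0)),
    ([(0.0, 1.0, 0.0)], pose_matrix(90.0, 2.0, 0.0, 0.0)),
]
merged = register_views(views)
```

With accurate poses, both observations map to approximately the same world coordinates. Any error in the externally measured pose carries over directly into misalignment of the merged model.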
Creating 3D models of an environment often requires fusing data from different sensors. One sensor (especially one with fixed optics and at one stand-off distance) cannot provide the resolution and depth of field required for the whole range of operations, e.g., room model and blood spatter analysis. Data from multi-modal sensors has to be fused together, e.g., room model and close-up images of fingerprints. At present, this problem is dealt with in three ways: manual data registration using existing features visible in images from multiple cameras; installation of unique targets that make manual or automatic registration easier; and a GPS-like system that tracks the position and orientation of cameras and sensors, such as magnetic trackers (e.g., Polhemus), LED-based trackers (e.g., Optotrak) or optical trackers (e.g., 3rdTech).
In the case of underground mine mapping in particular, after 3D mine models are generated using stereo cameras, it is difficult to register the models accurately with the mine map. It would be highly desirable to have one device that automates the capture of geological, geotechnical, survey and other management information, so that only one individual needs to collect data for use by everyone. In the existing art, 3D modelling systems (both laser-based and camera-based) cannot register themselves to the mine map accurately and require additional equipment. Total stations, on the other hand, can locate themselves accurately but provide only very sparse 3D point data without photo-realism.
U.S. Pat. No. 6,009,359 issued to El-Hakim et al. discloses a mobile 3D imaging system which includes a movable platform and several image cameras mounted on the movable platform for capturing intensity images of the region being imaged. The system includes a range imaging device coupled to the movable platform in a known relationship to the cameras. A 3D model is obtained by correlating the intensity images and the range images using knowledge of the predetermined locations of the cameras and the range imaging device, and generating a model in dependence upon the correlation. This system uses a scanning rangefinder to capture range information, and separate cameras to capture images and to determine the location of the mobile platform. Because the scanning rangefinder collects the range data sequentially, the mobile platform must remain stationary during the acquisition. Scanning rangefinders are larger, more expensive and more susceptible to shock and vibration than the stereo cameras proposed in this invention. Additionally, stereo cameras can capture images in a much shorter time (on the order of microseconds or less) than scanning rangefinders (seconds to minutes), allowing operation from a mobile platform without stopping for data acquisition. The proposed solution uses the same cameras to capture the images used for both localization and 3D computation. These factors reduce the size, weight and cost, and increase the robustness, of camera-based 3D modeling systems as compared with systems that use scanning rangefinders.
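The 3D computation from stereo images referred to above can be illustrated with the standard relation for a rectified stereo pair, in which depth equals focal length times baseline divided by disparity. The sketch below is a generic textbook formulation under stated assumptions (rectified pinhole cameras, focal length and principal point in pixels, baseline in metres), not the specific computation of any cited system; the function name and all parameter values are hypothetical.

```python
def triangulate(u_left, v, disparity_px, focal_px, baseline_m, cx, cy):
    """Recover a 3D point (in metres, left-camera frame) from one stereo match.

    Assumes a rectified pair, so the matching pixel in the right image
    lies on the same row, shifted left by `disparity_px` pixels.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite depth")
    z = focal_px * baseline_m / disparity_px   # depth along the optical axis
    x = (u_left - cx) * z / focal_px           # lateral offset
    y = (v - cy) * z / focal_px                # vertical offset
    return (x, y, z)

# Hypothetical camera: 800 px focal length, 0.1 m baseline, 640x480 image.
point = triangulate(u_left=420.0, v=240.0, disparity_px=20.0,
                    focal_px=800.0, baseline_m=0.1, cx=320.0, cy=240.0)
```

Because both images of a stereo pair are exposed at the same instant, every pixel match yields a 3D point for that instant, which is why no stationary acquisition period is needed. Depth resolution degrades as disparity shrinks, so distant points are measured less precisely than near ones.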
U.S. Pat. No. 6,781,618 issued to Beardsley discloses a method for constructing a 3D model of a scene using two cameras having a physical relationship to each other. The first camera acquires images of an unknown scene, from which a model is created, and the second camera acquires images of a special registration pattern or a rigid structure. The limitation of this method is that the registration pattern must be placed in the modeled environment and must always be visible in the second camera's images.
U.S. Pat. No. 6,711,293 issued to Lowe discloses a method and apparatus for identifying scale invariant features in an image and the use of same for locating an object in an image. Lowe detects scale invariant features in training images of objects and stores them in a database. Objects are recognized in new images by detecting features in those images and matching them with the features previously detected and stored in the database. The features are two-dimensional only, as Lowe uses a monocular camera, and he does not match the features temporally to recover the camera motion.
U.S. Pat. No. 4,991,095 issued to Swanson is directed to a method of mathematical modeling of underground geological volumes for mapping layers of sedimentary deposits which models geologic volumes having critical bounding surfaces and inclined, stacked layers of sedimentary deposits. The method involves composing a model volume analogous to the actual volume wherein the model volume includes layers of cells arranged in vertical columns of cells, which are inclined and stacked analogous to the layers of deposits in the actual volume.
Therefore, it would be very advantageous to provide a method for creating three-dimensional (3D) computer models which avoids the above-mentioned drawbacks.