The task of capturing the 3D information in a scene consists of first acquiring a set of range measurements from the measurement device(s) to each point in the scene, then converting these device-centric range measurements into a set of point locations in a single common coordinate system, often referred to as “world coordinates”. Methods to acquire the range measurements may rely heavily on hardware, such as 2D time-of-flight laser rangefinder systems, which directly measure the ranges to an array of points within the measurement field of view. Other systems rely heavily on computing power to determine ranges from a sequence of images captured as a camera is moved around the object or scene of interest. These latter systems are commonly called Structure from Motion (SFM) systems. Hardware-intensive solutions have the disadvantages of being bulky and expensive. SFM systems have the disadvantage of requiring extensive computing resources or extended processing times in order to create the 3D representation, which makes them unsuitable for small mobile consumer devices such as smartphones.
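The conversion from device-centric range measurements to world coordinates described above can be illustrated with a minimal sketch. It assumes, for illustration only, that each measurement consists of a range plus azimuth and elevation beam angles, and that the device pose is known as a rotation matrix and a translation vector. The function names `range_to_device_point` and `device_to_world` are hypothetical and do not appear in the original text.

```python
import math

def range_to_device_point(rng, azimuth, elevation):
    """Convert a range and two beam angles (radians) to (x, y, z) in the device frame.

    Assumed convention: x points along the optical axis, azimuth rotates
    about z, and elevation is measured up from the x-y plane.
    """
    x = rng * math.cos(elevation) * math.cos(azimuth)
    y = rng * math.cos(elevation) * math.sin(azimuth)
    z = rng * math.sin(elevation)
    return (x, y, z)

def device_to_world(point, rotation, translation):
    """Apply p_world = R * p_device + t, with R a 3x3 row-major rotation matrix."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

# Example pose (assumed values): device yawed 90 degrees about z, located at (1, 2, 0).
yaw = math.pi / 2
R = [
    [math.cos(yaw), -math.sin(yaw), 0.0],
    [math.sin(yaw),  math.cos(yaw), 0.0],
    [0.0,            0.0,           1.0],
]
t = (1.0, 2.0, 0.0)

# A point 2 units straight ahead of the device.
p_device = range_to_device_point(2.0, 0.0, 0.0)
p_world = device_to_world(p_device, R, t)
```

In this example the point straight ahead of the yawed device lands at approximately (1, 4, 0) in world coordinates. The same transform applies to every measured point once the device pose for that measurement is known.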
Existing SFM systems involve two computation paths: one to track the pose (orientation and position) of the camera as it captures a sequence of 2D images, and the other to create a 3D map of the object or environment through or around which the camera is moving. These two paths are interdependent. It is difficult to track the motion (pose) of the camera without some knowledge of the 3D environment through which it is moving, and it is difficult to create a map of the environment from a series of moving camera images without some knowledge of the motion (pose) of the camera.
This invention introduces a method and system for capturing 3D objects and environments that is based on the SFM methodology but adds a simplified method for tracking the pose of the camera. By tracking the camera's motion directly (pose detection), the invention removes a substantial portion of the computing load needed to build the 3D model from a sequence of images. The resulting reduction in computational burden provides a 3D acquisition solution that is compatible with mobile devices having low computing power.
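To illustrate why a known camera pose lightens the mapping computation, the following is a minimal sketch of midpoint triangulation, a standard closed-form technique that is not necessarily the method of this invention. It assumes that two camera positions and the world-frame viewing rays toward the same scene point are already available from the tracked poses. The function names `dot` and `triangulate_midpoint` are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))

def triangulate_midpoint(c1, d1, c2, d2, eps=1e-12):
    """Estimate a 3D point from two viewing rays with known origins and directions.

    c1, c2: camera centers in world coordinates (from the tracked poses).
    d1, d2: ray directions in world coordinates toward the same scene point.
    Returns the midpoint of the shortest segment joining the two rays.
    """
    w = tuple(p - q for p, q in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < eps:
        raise ValueError("rays are (nearly) parallel; depth is not observable")
    s = (b * e - c * d) / denom  # parameter of the closest point along ray 1
    t = (a * e - b * d) / denom  # parameter of the closest point along ray 2
    p1 = tuple(o + s * k for o, k in zip(c1, d1))
    p2 = tuple(o + t * k for o, k in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))

# Example (assumed values): two camera positions one unit apart, both viewing
# a scene point located at (0.5, 0, 2).
point = triangulate_midpoint(
    (0.0, 0.0, 0.0), (0.5, 0.0, 2.0),
    (1.0, 0.0, 0.0), (-0.5, 0.0, 2.0),
)
```

With the poses supplied, each scene point requires only a handful of multiplications and one division. Without them, camera poses and scene points would have to be estimated jointly, which is the interdependence that makes conventional SFM computationally demanding.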