Three-dimensional (3D) modeling of objects has become prevalent in many applications, including robotics, navigation, gaming, virtual reality, and 3D printing. 3D modeling may include capturing information about an object, such as information about the shape and surface of the object, and generating a 3D model of the object. The 3D model represents the object using a collection of points in 3D space.
3D reconstruction may be used to build a 3D model of an object. 3D reconstruction includes the creation of a 3D model from multiple images. The images may be two-dimensional images capturing a scene. If multiple images of the scene are captured, depth may be determined through triangulation; alternatively, a depth camera may be used to measure depth directly. In either case, the depth is used to determine the location in 3D space of objects in the scene. The data captured for each image is then fused to determine a 3D mesh of the scene, from which the 3D model is constructed.
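As an illustration of depth by triangulation, the following minimal sketch assumes the simplest case: two rectified views from identical pinhole cameras separated by a known horizontal baseline, so that depth follows from Z = f * B / d. The function names, parameters, and the rectified-stereo setup are assumptions made for this example and are not taken from the text above.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Depth in metres of a point seen in two rectified views: Z = f * B / d.

    focal_length_px: focal length in pixels (assumed equal for both views)
    baseline_m:      distance between the two camera centres, in metres
    disparity_px:    horizontal pixel shift of the point between the views
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of both cameras")
    return focal_length_px * baseline_m / disparity_px


def back_project(u, v, depth_m, focal_length_px, cx, cy):
    """Map pixel (u, v) with known depth to (X, Y, Z) in the camera frame.

    Uses the pinhole model with principal point (cx, cy) and square pixels.
    """
    x = (u - cx) * depth_m / focal_length_px
    y = (v - cy) * depth_m / focal_length_px
    return (x, y, depth_m)
```

For instance, with a 700-pixel focal length and a 0.1 m baseline, a disparity of 35 pixels corresponds to a depth of about 2 m. A depth camera would supply the depth value directly, in which case only the back-projection step is needed.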
To fuse the data captured from different viewpoints into a 3D model of the scene, motion estimation (ME) is performed to determine the pose (e.g., pan, tilt, roll, X, Y, Z) of the camera for each video frame of the captured scene. For some applications, ME may be performed in real time at the full frame rate, e.g., 30 frames per second (fps). For example, in interactive 3D reconstruction, users see the reconstructed model in real time as video of the scene is captured and can adjust the camera based on the visual representation of the model.
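To show why a per-frame pose is needed for fusion, the sketch below uses a six-parameter pose to move a point from one frame's camera coordinates into a shared world frame. The rotation convention (pan about the Y axis, tilt about the X axis, roll about the Z axis, composed as Ry * Rx * Rz) and all function names are assumptions chosen for illustration; the text above does not specify a convention.

```python
import math


def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_from_pan_tilt_roll(pan, tilt, roll):
    """Rotation matrix R = Ry(pan) * Rx(tilt) * Rz(roll), angles in radians.

    The axis assignment and multiplication order are an assumed convention.
    """
    cp, sp = math.cos(pan), math.sin(pan)
    ct, st = math.cos(tilt), math.sin(tilt)
    cr, sr = math.cos(roll), math.sin(roll)
    ry = [[cp, 0.0, sp], [0.0, 1.0, 0.0], [-sp, 0.0, cp]]
    rx = [[1.0, 0.0, 0.0], [0.0, ct, -st], [0.0, st, ct]]
    rz = [[cr, -sr, 0.0], [sr, cr, 0.0], [0.0, 0.0, 1.0]]
    return matmul3(matmul3(ry, rx), rz)


def camera_to_world(point, pose):
    """Transform a camera-frame point into the world frame.

    pose is (pan, tilt, roll, x, y, z): rotate the point, then add the
    camera position (x, y, z).
    """
    pan, tilt, roll, x, y, z = pose
    r = rotation_from_pan_tilt_roll(pan, tilt, roll)
    rotated = [sum(r[i][k] * point[k] for k in range(3)) for i in range(3)]
    return (rotated[0] + x, rotated[1] + y, rotated[2] + z)
```

Applying `camera_to_world` to the points recovered from every frame, each with that frame's estimated pose, places them all in one coordinate system, which is the precondition for fusing them into a single mesh. At 30 fps, the pose for each frame would have to be estimated in roughly 33 ms to keep up with the video.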