In computer vision and computer graphics, 3D reconstruction is the process of capturing the shape and appearance of real objects based on recorded images. Typically, 3D reconstruction requires a plurality of 2D images in order to extrapolate the 3D appearance of the object, wherein a registration process is utilized to determine how a new image should be aggregated with the 3D reconstruction.
One of the challenges of 3D reconstruction is making it computationally practical while keeping it robust enough to accurately represent 3D scenes. Three prominent classes of reconstruction algorithms are most widely utilized today, but each suffers from drawbacks associated with robustness, computational efficiency, or both. For example, the Kinect Fusion algorithm is configured to run on graphics processing units (GPUs) to increase computational efficiency. It utilizes an iterative closest point (ICP) algorithm wherein a structured point cloud is registered with a merged model of all the previous point clouds. The registration process relies on a point-to-plane formulation of the ICP algorithm, in which each iteration should provide a better match between the structured point cloud and the merged model of previous point clouds. However, the Kinect Fusion algorithm is sensitive to outliers in the registration process and to reconstruction drift, and therefore suffers from a lack of robustness. Another reconstruction algorithm is the Simultaneous Localization and Mapping (SLAM) algorithm, which estimates the relative motion between two successive camera acquisitions in real time in order to determine the orientation of the camera. However, a drawback of most SLAM methods (there are several variants) is the accumulation of registration error over time, which leads to drift in the determined orientation.
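For illustration only, the point-to-plane registration described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the Kinect Fusion implementation: it runs on the CPU with NumPy, finds closest points by brute force, assumes the merged model already carries a unit surface normal per point, and applies no outlier rejection. The function names (rotation_from_axis_angle, point_to_plane_step, icp_point_to_plane) are hypothetical and chosen for this example.

```python
import numpy as np


def rotation_from_axis_angle(r):
    """Rotation matrix for axis-angle vector r (Rodrigues' formula)."""
    theta = np.linalg.norm(r)
    if theta < 1e-12:
        return np.eye(3)
    k = r / theta
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)


def point_to_plane_step(src, dst, normals):
    """One linearized point-to-plane update for matched points.

    Minimizes sum(((R p_i + t - q_i) . n_i)^2) using the small-angle
    approximation R ~ I + [r]x, which makes the residual linear in (r, t):
        (p - q) . n + r . (p x n) + t . n
    """
    A = np.hstack([np.cross(src, normals), normals])      # shape (N, 6)
    b = -np.einsum("ij,ij->i", src - dst, normals)        # shape (N,)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return rotation_from_axis_angle(x[:3]), x[3:]


def icp_point_to_plane(src, model, model_normals, iterations=10):
    """Register src (N, 3) to model (M, 3); returns R, t with R @ p + t."""
    R, t = np.eye(3), np.zeros(3)
    for _ in range(iterations):
        cur = src @ R.T + t
        # Brute-force closest model point for every source point (assumption:
        # small clouds; a real system would use a spatial index or projection).
        d2 = ((cur[:, None, :] - model[None, :, :]) ** 2).sum(axis=-1)
        idx = d2.argmin(axis=1)
        dR, dt = point_to_plane_step(cur, model[idx], model_normals[idx])
        # Compose the incremental update with the running estimate.
        R, t = dR @ R, dR @ t + dt
    return R, t
```

Because every matched pair contributes equally to the least-squares system in this sketch, a few wrong matches can pull the estimate away from the correct pose, which is consistent with the outlier sensitivity noted above.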
It would be beneficial to develop a system that could provide robust results, while reducing the computational expense of current reconstruction techniques.