Simultaneous localization and mapping (SLAM) is used in augmented reality systems and robot navigation to build a target from an environment or scene. Visual SLAM (VSLAM) uses images from a camera or other visual sensor as input to build a target or model of the environment. When VSLAM is used in conjunction with an Augmented Reality (AR) system, virtual objects can be inserted into a user's view of the real world and displayed on a user device.
A tracking system utilizing VSLAM with a single camera may initialize a 3D target from two separate reference images captured by that camera. Traditional techniques for VSLAM initialization of a 3D target from two reference images may require users to perform a specific sequence of unintuitive camera motions between the two reference images while simultaneously maintaining adequate scene overlap between the images. 3D reconstruction methods use this sequence of motions to find a real plane in the environment and initialize the 3D target from that plane.
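One common way such two-view reconstruction methods relate the reference images is through the homography induced by a planar region: points on the plane map between the two images via a single 3x3 matrix, which can then be decomposed into relative camera motion and plane parameters. As a minimal, illustrative sketch (not the specific method described above), the homography can be estimated from point correspondences with the direct linear transform (DLT); the synthetic data and function names below are assumptions for demonstration:

```python
import numpy as np

def estimate_homography(pts1, pts2):
    """Estimate the plane-induced homography H (pts2 ~ H @ pts1, in
    homogeneous coordinates) from >= 4 point correspondences via the
    direct linear transform (DLT)."""
    A = []
    for (x, y), (u, v) in zip(pts1, pts2):
        # Each correspondence contributes two linear constraints on
        # the 9 entries of H (stacked row-wise as a vector h).
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector for the smallest
    # singular value (the null space of A for exact correspondences).
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # fix the overall scale (and sign) ambiguity

# Synthetic check: points on a plane seen from two camera poses are
# related by a known homography; DLT should recover it exactly.
H_true = np.array([[1.1, 0.02, 5.0],
                   [-0.01, 0.95, -3.0],
                   [1e-4, 2e-4, 1.0]])
pts1 = np.array([[10, 20], [200, 40], [50, 180], [220, 210], [120, 100]],
                dtype=float)
ones = np.ones((len(pts1), 1))
proj = np.hstack([pts1, ones]) @ H_true.T
pts2 = proj[:, :2] / proj[:, 2:3]

H_est = estimate_homography(pts1, pts2)
print(np.allclose(H_est, H_true, atol=1e-6))
```

In a real system the correspondences come from feature matching between the two reference images and contain outliers, so the estimate would typically be wrapped in a robust scheme such as RANSAC before the homography is decomposed into rotation, translation, and the plane normal.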
While the creation of accurate and high-quality SLAM maps relies on a robust initialization process, the usability of SLAM initialization procedures for end users has often been disregarded. Therefore, there is a need for systems, methods, and interfaces to improve the user experience of VSLAM initialization.