Augmented reality (AR) relates to providing an augmented real-world environment where the perception of a real-world environment (or data representing a real-world environment) is augmented or modified with computer-generated virtual data. For example, data representing a real-world environment may be captured in real-time using sensory input devices such as a camera or microphone and augmented with computer-generated virtual data including virtual images and virtual sounds. The virtual data may also include information related to the real-world environment such as a text description associated with a real-world object in the real-world environment. An AR environment may be used to enhance numerous applications including video game, mapping, navigation, and mobile device applications.
Some AR environments enable the perception of real-time interaction between real objects (i.e., objects existing in a particular real-world environment) and virtual objects (i.e., objects that do not exist in the particular real-world environment). To integrate the virtual objects realistically into an AR environment, an AR system typically performs several steps, including mapping and localization. Mapping is the process of generating a map of the real-world environment. Localization is the process of locating a particular point of view or pose relative to that map. A fundamental requirement of many AR systems is the ability to localize the pose of a mobile device moving within a real-world environment in order to determine the particular view associated with the mobile device that needs to be augmented.
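As an illustration of the localization step only, the following minimal sketch estimates a 2D pose from landmarks that already exist in a map and that the device currently observes in its own frame. It assumes known correspondences between map points and observations and uses a closed-form least-squares rigid fit; the function name `localize_2d` and the 2D simplification are assumptions made for this example, not a method described above.

```python
import math


def localize_2d(map_points, observed_points):
    """Estimate a 2D pose (theta, tx, ty) relative to a map.

    Assumption of this sketch: observed_points[i] is the device-frame
    measurement of map_points[i]. The result satisfies, in the
    least-squares sense, map_point = R(theta) * observed_point + t.
    """
    if len(map_points) != len(observed_points) or len(map_points) < 2:
        raise ValueError("need at least two matched point pairs")

    n = len(map_points)
    # Centroids of the map points and of the observations.
    mcx = sum(p[0] for p in map_points) / n
    mcy = sum(p[1] for p in map_points) / n
    ocx = sum(p[0] for p in observed_points) / n
    ocy = sum(p[1] for p in observed_points) / n

    # Accumulate dot and cross terms of the centered point pairs.
    s_dot = 0.0
    s_cross = 0.0
    for (mx, my), (ox, oy) in zip(map_points, observed_points):
        ax, ay = ox - ocx, oy - ocy
        bx, by = mx - mcx, my - mcy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx

    # Rotation that best aligns the observations with the map.
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)

    # Translation that maps the rotated observation centroid onto the
    # map centroid; (tx, ty) is the device position in map coordinates.
    tx = mcx - (c * ocx - s * ocy)
    ty = mcy - (s * ocx + c * ocy)
    return theta, tx, ty
```

In this sketch the returned translation is the device position in map coordinates and the returned angle is its heading. A practical AR system would typically estimate a full 3D pose and would have to establish the correspondences itself.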
In robotics, traditional methods employing simultaneous localization and mapping (SLAM) techniques have been used by robots and autonomous vehicles to build a map of an unknown environment (or to update a map within a known environment) while simultaneously tracking their current location for navigation purposes. Most SLAM approaches are incremental, meaning that they iteratively update the map and then update the estimated camera pose in the same process. An extension of SLAM is parallel tracking and mapping (PTAM), which separates the mapping and localization steps into parallel computation threads. Both SLAM and PTAM techniques produce sparse point clouds as maps. Sparse point clouds may be sufficient for enabling camera localization, but may not be sufficient for enabling complex augmented reality applications, such as those that must handle collisions and occlusions arising from the interaction of real objects and virtual objects. Both SLAM and PTAM techniques also utilize a common sensing source for both the mapping and localization steps.
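The incremental character of SLAM can be sketched with a deliberately simplified one-dimensional example in which a single loop alternates between a pose update and a map update. The function `slam_step`, the fixed `gain`, the simple averaging of landmark estimates, and the order of the two updates within a step are all assumptions of this illustration; real SLAM systems use probabilistic filters or optimization rather than this toy update.

```python
def slam_step(pose, landmark_map, odometry, observations, gain=0.5):
    """One iteration of a toy 1D incremental SLAM loop.

    pose          -- current position estimate (float)
    landmark_map  -- dict of landmark id -> estimated position
    odometry      -- measured displacement since the previous step
    observations  -- dict of landmark id -> measured offset
                     (landmark position minus device position)
    gain          -- hypothetical blending weight for the correction
    """
    # Localization: predict the pose from odometry, then correct it
    # using landmarks that are already present in the map.
    predicted = pose + odometry
    implied = [
        landmark_map[lid] - offset
        for lid, offset in observations.items()
        if lid in landmark_map
    ]
    if implied:
        measured = sum(implied) / len(implied)
        pose = predicted + gain * (measured - predicted)
    else:
        pose = predicted

    # Mapping: add newly seen landmarks and refine known ones by
    # averaging the old estimate with the new one.
    updated_map = dict(landmark_map)
    for lid, offset in observations.items():
        estimate = pose + offset
        if lid in updated_map:
            updated_map[lid] = 0.5 * (updated_map[lid] + estimate)
        else:
            updated_map[lid] = estimate
    return pose, updated_map
```

Calling `slam_step` once per sensor reading, feeding each returned pose and map into the next call, gives the iterative behavior described above. Note that the same observations drive both the pose correction and the map update, which mirrors the common sensing source noted for SLAM and PTAM, and that the resulting map is only a sparse set of landmark positions.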