1. Background Field
Embodiments of the subject matter described herein are related to pose determination, and more particularly, to the use of vision-based techniques for pose determination.
2. Relevant Background
In Augmented Reality (AR) type applications, the pose (translation and attitude) of the camera with respect to the imaged environment is determined and tracked. In a vision-only pose approach, the pose of the camera with respect to a feature-rich target in the environment is determined and tracked using captured images, e.g., frames of video. The vision-only pose is estimated, e.g., at every frame, and statistical models are used to predict the pose at the next frame, providing an initialization point for the pose refinement algorithm.
Modern devices, such as cellular telephones, are typically equipped with inertial sensors that are capable of measuring the rate of change in the pose of the device relative to the inertial frame; such an arrangement is known as an Inertial Navigation System (INS). The information provided by INS can be used to improve vision-only pose estimates of the camera relative to the target because the absolute pose, i.e., the pose of the device with respect to the inertial frame, and the relative pose, i.e., the pose of the camera with respect to a target, differ by a constant transformation. The combination of vision-only pose and INS is typically referred to as Vision aided INS (VINS).
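By way of illustration only, the constant transformation between the absolute pose and the relative pose may be sketched as follows. The sketch assumes a planar (2D) rigid-body model, a target that is stationary in the inertial frame, and a camera that is rigidly mounted on the device; the function names and numeric values are hypothetical and are not taken from any particular implementation.

```python
import math

# Convention (an assumption of this sketch): T_a_b is the pose of frame b
# expressed in frame a, stored as a 3x3 homogeneous matrix for planar motion.

def se2(theta, tx, ty):
    """Build a planar rigid transform from a rotation angle and a translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, tx], [s, c, ty], [0.0, 0.0, 1.0]]

def compose(a, b):
    """Matrix product a * b of two 3x3 transforms."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def invert(t):
    """Inverse of a rigid transform: rotation R^T, translation -R^T p."""
    c, s, tx, ty = t[0][0], t[1][0], t[0][2], t[1][2]
    return [[c, s, -(c * tx + s * ty)],
            [-s, c, s * tx - c * ty],
            [0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    """Largest element-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Hypothetical constants: where the target sits in the inertial frame, and
# how the camera is mounted on the device. Neither changes as the device moves.
T_INERTIAL_TARGET = se2(0.3, 2.0, 1.0)
T_DEVICE_CAMERA = se2(-0.1, 0.05, 0.0)

def relative_from_absolute(t_inertial_device):
    """Map the absolute (INS) pose to the relative (camera-to-target) pose."""
    return compose(compose(invert(T_INERTIAL_TARGET), t_inertial_device),
                   T_DEVICE_CAMERA)

def absolute_from_relative(t_target_camera):
    """Map the relative (vision) pose back to the absolute device pose."""
    return compose(compose(T_INERTIAL_TARGET, t_target_camera),
                   invert(T_DEVICE_CAMERA))

# Example: one device pose reported by the INS, converted both ways.
device_pose = se2(0.8, 1.5, -0.4)
camera_pose = relative_from_absolute(device_pose)
recovered = absolute_from_relative(camera_pose)
```

Because the two constant transforms do not depend on the motion of the device, a pose expressed in either form may be converted to the other, which is why the INS measurements can, under these assumptions, inform the vision-based estimate.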
The VINS approach uses more information than either vision-only pose or INS separately, and thus, in general, VINS performs better than either method alone. Nevertheless, under certain circumstances the VINS approach performs poorly compared to the vision-only approach. Moreover, the performance of the VINS approach may degrade based on conditions external to the mobile device, and thus, the degradation may be unpredictable.