1. Field of the Invention
Embodiments of the present invention generally relate to three-dimensional navigation systems and, more particularly, to a method and apparatus for generating three-dimensional pose using a monocular visual sensor and an inertial measurement unit.
2. Description of the Related Art
Traditional approaches for computing pose estimations for devices using monocular visual sensors (e.g., single lens cameras) and an inertial measurement unit relied on computing a pose estimate from visually tracked features of a scene and building a measurement model, based on the pose estimate. This traditional method led to scale ambiguity problems, as actual scale could not be determined properly from the features as well as problems with uncertainty propagation due to the highly non-linear nature of pose estimation from monocular feature correspondences over several frames. Typically, pose covariance estimation is obtained via back propagation of the covariance method, where the goal is to deduce the uncertainty in the pose estimate from the covariance of the feature correspondences. However, in such a framework, measurement uncertainty is severely underestimated due to non-linearities. Outlier feature rejection becomes problematic, since, in order to reject bad pose measurements, one needs a mechanism to compare the predicted pose against the measurement, and the measurement suffers from a poor uncertainty model.
Real-time tracking by fusing information available from visual and inertial sensors (e.g., an inertial measurement unit (IMU)) has been studied for many years with numerous applications in robotics, vehicle navigation and augmented reality. However, it is still unclear how to best combine the information from these complementary sensors. Since inertial sensors are suited for handling situations where vision is lost due to fast motion or occlusion, many researchers use inertial data as backup or take only partial information (gyroscopes) from an IMU to support vision-based tracking systems.
To better exploit inertial data, several researchers use an extended Kalman filter to fuse all measurements uniformly to a pose estimate. These systems combine the filter with vision-tracking techniques based on artificial markers, feature points, or lines. Results from these Kalman filter-based systems indicate that using vision measurements effectively reduce the errors accumulated from IMU. However, these systems have not eliminated the problem of measurement uncertainty and scale ambiguity.
Therefore, there is a need in the art for improved pose computation using a method and apparatus for generating three-dimensional pose using a monocular visual sensor and an inertial measurement unit (IMU).