Accurate estimation of the ego-motion of a mobile agent, for example, a vehicle, a human, a robot and other mobile agents, relative to a surface path on which the mobile agent is traveling may be a key component for autonomous driving and computer-vision-based driving assistance. The use of one or more cameras, as opposed to other sensors, for computing ego-motion may allow for the integration of ego-motion data into other vision-based algorithms, for example, obstacle detection and/or avoidance, pedestrian detection and/or avoidance, object detection and/or avoidance and other vision-based algorithms, without necessitating calibration between sensors. This may reduce maintenance requirements and cost. The process of estimating the ego-motion of a mobile agent using only input from one or more cameras attached to the mobile agent is referred to as Visual Odometry (VO).
In VO, the pose of a mobile agent is incrementally estimated via examination of changes that the motion of the mobile agent induces on the images obtained by the one or more onboard cameras. For VO to work effectively, sufficient illumination in the environment and a static scene with sufficient texture to allow apparent motion to be extracted may be required. Additionally, temporally consecutive frames should be captured to ensure sufficient scene overlap.
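The incremental nature of the pose estimate described above may be illustrated with a minimal sketch. The sketch assumes that a frame-to-frame relative motion has already been recovered from each pair of consecutive images, and it restricts the pose to a plane (x, y, heading) for brevity. The function names `compose` and `integrate` are hypothetical and are not taken from any particular VO system.

```python
import math

def compose(pose, delta):
    """Apply a relative motion, expressed in the agent's own frame, to a planar pose.

    pose  = (x, y, theta): position and heading in the world frame.
    delta = (dx, dy, dtheta): motion between two consecutive frames,
            assumed here to be supplied by an image-based estimator.
    """
    x, y, theta = pose
    dx, dy, dtheta = delta
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the local displacement into the world frame, then add it.
    return (x + c * dx - s * dy, y + s * dx + c * dy, theta + dtheta)

def integrate(deltas, start=(0.0, 0.0, 0.0)):
    """Chain per-frame relative motions into a trajectory of world-frame poses."""
    pose = start
    trajectory = [pose]
    for delta in deltas:
        pose = compose(pose, delta)
        trajectory.append(pose)
    return trajectory

# Hypothetical input: four steps of 1 unit forward, each followed by a 90-degree left turn.
square = [(1.0, 0.0, math.pi / 2)] * 4
path = integrate(square)
```

Because each pose is built on the previous one, any error in a single frame-to-frame estimate is carried into all subsequent poses, which is one reason sufficient texture and scene overlap between consecutive frames may be required.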
One advantage of VO for providing a motion estimate is that VO is not affected by wheel slip in uneven terrain and other adverse conditions. Furthermore, VO may provide important supplemental information to other motion-estimation processes and systems, for example, Global Positioning System (GPS), Inertial Measurement Units (IMUs), laser odometry and other systems providing motion estimation. Additionally, VO may have increased importance in GPS-denied environments, for example, underwater, aerial and other environments wherein GPS may be denied, and in environments wherein GPS information is not reliable, for example, due to multipath, poor satellite coverage and other reliability factors.
Many motion-estimation algorithms for estimating motion using exclusively video input assume static scenes. Additionally, many such algorithms cannot cope with dynamic and/or cluttered environments or with large occlusions generated by passing vehicles and other objects. Furthermore, feature matching and outlier removal in motion estimation may not be robust and may subsequently fail. Many motion-estimation schemes require a significant number of key points and may fail when only a limited number of key points are available in scenes lacking structure.
Real-time VO methods and systems that do not rely on the above-listed assumptions and overcome the above-listed limitations may be desirable.