The present invention relates to monocular structure from motion (SFM) and moving object localization.
Stereo-based SFM systems now routinely achieve real-time performance in both indoor and outdoor environments. Several monocular systems have also demonstrated good performance in smaller desktop or indoor environments. Successful large-scale monocular systems for autonomous navigation are less common, primarily because of the challenge of scale drift. A large-scale monocular system handles scale drift with loop closure. While desirable for map building, delayed scale correction from loop closure is not an option for autonomous driving. Parallel monocular architectures like PTAM are elegant solutions for small workspaces. However, PTAM uses the existing distribution of points to restrict the epipolar search range, which is not desirable for fast-moving vehicles. It also uses free time in the mapping thread, when exploring known regions, for data association refinement and bundle adjustment; however, scene revisits are not feasible in autonomous driving. Other systems compute relative pose between consecutive frames. However, two-view estimation leads to high translational errors for narrow-baseline forward motion.
Monocular SFM and scene understanding are attractive due to lower cost and calibration requirements. However, the lack of a fixed stereo baseline leads to scale drift, which is the primary bottleneck that prevents monocular SFM from attaining accuracy comparable to stereo. To counter scale drift, prior knowledge must be used; a popular choice is the known height of the camera above the ground plane. Thus, robust and accurate estimation of the ground plane is crucial for good performance in monocular scene understanding. However, in real-world autonomous driving, the ground plane corresponds to a rapidly moving, low-textured road surface, which makes its estimation from image data challenging.
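By way of illustration only, the use of a known camera height to counter scale drift can be sketched as follows. The sketch assumes that the ground plane has already been estimated in the camera frame as n·X = d, with the camera at the origin, and that the true mounting height of the camera is known in meters. The function names and the plane convention are hypothetical and are not taken from any particular system.

```python
import math


def camera_height_from_plane(normal, offset):
    """Distance from the camera center (origin) to the plane n.X = d.

    The plane convention is an assumption made for this sketch.
    """
    norm = math.sqrt(sum(c * c for c in normal))
    if norm == 0.0:
        raise ValueError("plane normal must be non-zero")
    return abs(offset) / norm


def scale_from_camera_height(known_height_m, estimated_height):
    """Factor that maps the arbitrary SFM scale to metric scale."""
    if estimated_height <= 0.0:
        raise ValueError("estimated height must be positive")
    return known_height_m / estimated_height


def rescale_translation(translation, scale):
    """Apply the scale correction to an estimated camera translation."""
    return [scale * t for t in translation]


# Hypothetical values: plane estimated in SFM units, camera mounted at 1.7 m.
estimated_height = camera_height_from_plane([0.0, 2.0, 0.0], 1.7)
scale = scale_from_camera_height(1.7, estimated_height)
metric_translation = rescale_translation([0.1, 0.0, 0.5], scale)
```

Because the correction is a single ratio, any error in the estimated ground plane propagates directly into the recovered scale, which is why the accuracy of the ground plane estimate matters.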