Estimating a two-dimensional (2D) homography (or projective transformation) from a pair of images is a fundamental task in computer vision. The homography is an essential part of monocular simultaneous localization and mapping (SLAM) systems in scenarios that include rotation-only movements, planar scenes, and/or scenes in which objects are very far from the viewer. It is well known that the transformation relating two images taken by a camera that rotates about its center is a homography, which is why homographies are essential for creating panoramas. To deal with planar and mostly planar scenes, the popular SLAM algorithm ORB-SLAM uses a combination of homography estimation and fundamental matrix estimation. Additional applications of homographies include augmented reality and camera calibration.
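The rotation-only case can be checked numerically. The following is a minimal sketch, assuming a pinhole camera with intrinsic matrix K located at the world origin; under that assumption the induced homography takes the standard form H = K R K^-1. The intrinsics, the rotation angle, the random 3D points, and the function names are hypothetical values chosen only for illustration.

```python
import numpy as np

def rotation_homography(K, R):
    """Homography induced by a pure rotation R of a camera with intrinsics K."""
    return K @ R @ np.linalg.inv(K)

def project(K, R, points_3d):
    """Project Nx3 world points with a camera at the origin rotated by R."""
    cam = (R @ points_3d.T).T          # points expressed in the rotated camera frame
    pix = (K @ cam.T).T                # homogeneous pixel coordinates
    return pix[:, :2] / pix[:, 2:3]

def apply_homography(H, pts):
    """Apply a 3x3 homography to Nx2 pixel coordinates."""
    homog = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = (H @ homog.T).T
    return mapped[:, :2] / mapped[:, 2:3]

# Hypothetical intrinsics and a 10-degree rotation about the y-axis.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])

# Random points in front of the camera at varying depths (not on a plane).
rng = np.random.default_rng(0)
points_3d = rng.uniform([-1, -1, 4], [1, 1, 8], size=(20, 3))

pix_before = project(K, np.eye(3), points_3d)
pix_after = project(K, R, points_3d)
H = rotation_homography(K, R)

# Largest pixel discrepancy between H applied to the first view and the second view.
max_err = np.abs(apply_homography(H, pix_before) - pix_after).max()
```

Because the points lie at different depths, the agreement between the two views does not depend on scene planarity; it follows from the absence of camera translation.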
A conventional approach to homography estimation includes two stages: corner detection and robust homography estimation. Robustness is introduced into the corner detection stage by returning a large and over-complete set of points, while robustness in the homography estimation stage comes from heavy use of RANSAC or from robustification of the squared loss function. Since corners are not as reliable as man-made linear structures, the research community has put considerable effort into adding line features and more complicated geometries to the feature detection step. There is a need in the art for a single robust algorithm that, given a pair of images, returns the homography relating the pair.
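The second stage of the conventional approach can be sketched as follows. This is an illustrative NumPy implementation, not the method of any particular system: it assumes point correspondences are already available from a corner detector and matcher, fits each candidate with an unnormalized direct linear transform (DLT), and selects the consensus set with RANSAC. The function names, the 2-pixel inlier threshold, the iteration count, and the synthetic data are all hypothetical.

```python
import numpy as np

def fit_homography_dlt(src, dst):
    """Direct linear transform: least-squares H from N >= 4 point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)           # right singular vector of smallest singular value
    return H / np.linalg.norm(H)       # fix the arbitrary scale

def transfer_error(H, src, dst):
    """Pixel distance between H applied to src and the observed dst."""
    homog = np.hstack([src, np.ones((len(src), 1))])
    mapped = (H @ homog.T).T
    with np.errstate(divide="ignore", invalid="ignore"):
        proj = mapped[:, :2] / mapped[:, 2:3]
    err = np.linalg.norm(proj - dst, axis=1)
    return np.nan_to_num(err, nan=np.inf)

def ransac_homography(src, dst, n_iters=500, threshold=2.0, seed=0):
    """Return (H, inlier_mask) using 4-point samples and a pixel threshold."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        sample = rng.choice(len(src), size=4, replace=False)
        H = fit_homography_dlt(src[sample], dst[sample])
        inliers = transfer_error(H, src, dst) < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 4:
        raise ValueError("no consensus set found")
    # Refit on the full consensus set.
    return fit_homography_dlt(src[best_inliers], dst[best_inliers]), best_inliers

# Synthetic, noise-free correspondences with 30% gross outliers.
H_true = np.array([[1.02, 0.03, 12.0],
                   [-0.02, 0.98, -7.0],
                   [1e-5, 2e-5, 1.0]])
rng = np.random.default_rng(1)
src = rng.uniform(0, 640, size=(100, 2))
homog = np.hstack([src, np.ones((100, 1))])
mapped = (H_true @ homog.T).T
dst = mapped[:, :2] / mapped[:, 2:3]
dst[:30] = rng.uniform(0, 640, size=(30, 2))   # corrupt the first 30 matches

H_est, inliers = ransac_homography(src, dst)
H_est = H_est / H_est[2, 2]                    # rescale for comparison with H_true
```

The sketch omits coordinate normalization before the DLT and any robust refinement of the loss function, both of which are commonly used with noisy real-world correspondences.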