Various image processing techniques are applied to pairs of approximately similar images. For example, two images of essentially the same scene taken at different times can be compared to detect scene changes. Another example is in the domain of stereoscopic vision, where the depth of points and objects in a given scene may be recovered from multiple images captured by respective cameras.
Such image processing techniques usually involve defining certain image points in the images, for example corners and edges, and resolving point correspondences between the two images. In one common approach for defining image points, Harris corners are determined using a Harris corner interest operator, as described in the reference “A Combined Corner and Edge Detector”, C. Harris and M. Stephens, Proceedings of the 4th Alvey Vision Conference, 1988, pages 147-151.
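As a rough illustration of the kind of interest operator referred to above, the following Python sketch computes a Harris-style corner response for a small greyscale image. It is a simplified sketch under stated assumptions, not the operator exactly as published: the function name `harris_response` is hypothetical, gradients are taken by central differences, the structure tensor is summed over a plain 3x3 box window rather than a Gaussian window, and `k = 0.04` is an assumed, commonly used value for the sensitivity constant.

```python
def harris_response(img, k=0.04):
    """Return a Harris-style response map for a 2-D list of grey values.

    Assumptions (not from the source text): central-difference gradients,
    a 3x3 box window instead of a Gaussian, and k = 0.04.
    """
    h, w = len(img), len(img[0])
    ix = [[0.0] * w for _ in range(h)]
    iy = [[0.0] * w for _ in range(h)]
    # Image gradients; border pixels are left at zero.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            ix[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            iy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0

    resp = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Accumulate the structure tensor M over the 3x3 neighbourhood.
            sxx = sxy = syy = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    gx = ix[y + dy][x + dx]
                    gy = iy[y + dy][x + dx]
                    sxx += gx * gx
                    sxy += gx * gy
                    syy += gy * gy
            # R = det(M) - k * trace(M)^2
            resp[y][x] = sxx * syy - sxy * sxy - k * (sxx + syy) ** 2
    return resp


# Synthetic test image: a bright quadrant whose corner lies near (4, 4).
image = [[1.0 if (y >= 5 and x >= 5) else 0.0 for x in range(10)]
         for y in range(10)]
response = harris_response(image)
```

With this response, large positive values indicate corner-like points, negative values indicate edges, and values near zero indicate flat regions; candidate image points would typically be taken at local maxima of the response.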
A homography represents a transformation from one plane to another. Homography matrices are commonly created from “known” correspondences between image sets. These correspondences are points which should match between the image views, and are commonly chosen based on criteria such as how salient the image region is, or how much like a corner the feature represented in the image is. These points of interest are chosen because they have been shown to be robustly repeatable across image sets. Once these points of interest have been located and corresponding points have been found (usually using a matching criterion based on the similarity of the raw pixel values, which represents how similar the areas of the image “look”), a further filtering technique is used to discard false matches. False matches are correspondences between image features which do not in fact match. These are common in real world images, and can cause considerable error in the final homography matrix if allowed to remain undiscovered in the sample set. The sample set in this case is the set of possible correspondences between the image sets.
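To make the role of the homography matrix concrete, the following Python sketch applies a 3x3 homography to a point in homogeneous coordinates and measures how far a candidate correspondence lies from the mapped position. The function names `apply_homography` and `transfer_error`, and the example matrix, are hypothetical and chosen only for illustration.

```python
import math


def apply_homography(H, point):
    """Map a 2-D point through a 3x3 homography H (list of three rows)."""
    x, y = point
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Divide by the homogeneous coordinate to return to image coordinates.
    return (xh / w, yh / w)


def transfer_error(H, match):
    """Distance between the mapped first point and the matched second point."""
    src, dst = match
    px, py = apply_homography(H, src)
    return math.hypot(px - dst[0], py - dst[1])


# Illustrative homography: scale by 2, then shift by (1, -1).
H_example = [[2.0, 0.0, 1.0],
             [0.0, 2.0, -1.0],
             [0.0, 0.0, 1.0]]

good_match = ((3.0, 4.0), (7.0, 7.0))     # consistent with H_example
false_match = ((3.0, 4.0), (10.0, 11.0))  # not consistent with H_example

print(transfer_error(H_example, good_match))
print(transfer_error(H_example, false_match))
```

A correspondence that agrees with the homography gives an error near zero, while a false match gives a large error. This is the quantity that a filtering technique can use to separate the two.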
Various techniques are known for the filtering of point correspondences. Two of the most popular techniques are Least Median of Squares (LMedS) estimation and Random Sample Consensus (RANSAC). Of the two, RANSAC is the more popular technique, as it can cope with a considerable number of outliers (in this case, outliers are false correspondences), whereas LMedS can cope with at most 50% of the sample set being outliers. Under RANSAC, combinations or sets of points are chosen at random, and the mapping they define, i.e. a model, is used to calculate an error for all the points relative to that mapping. If the error is the lowest (i.e. the best score) yet determined, the mapping is stored as the best solution. After an escape condition, i.e. an end-of-processing criterion, is reached (e.g. a given amount of processing time or real time, or a given number of combinations), the currently stored solution is used as the best solution. RANSAC is further described in “Random Sample Consensus: A Paradigm for Model Fitting with Applications to Image Analysis and Automated Cartography”, M. A. Fischler and R. C. Bolles, Communications of the Association for Computing Machinery, vol. 24, pages 381-395, 1981.
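The RANSAC loop described above can be sketched in Python as follows. To keep the example short, the model here is a pure 2-D translation estimated from a single randomly chosen correspondence, which is a deliberate simplification of the homography case. The function names, the capped squared-error score, the `cap` threshold, the fixed iteration count used as the escape condition, and the sample data are all assumptions made for illustration.

```python
import random


def ransac_translation(matches, iterations=200, cap=4.0, seed=0):
    """Keep the translation (dx, dy) with the lowest capped error so far.

    matches: list of ((x, y), (u, v)) candidate correspondences.
    cap: squared-error ceiling, so one outlier cannot dominate the score
         (an assumed scoring rule, not taken from the source text).
    """
    rng = random.Random(seed)
    best_model, best_score = None, float("inf")
    for _ in range(iterations):          # escape condition: iteration count
        (x, y), (u, v) = rng.choice(matches)
        dx, dy = u - x, v - y            # model from the random sample
        score = 0.0
        for (px, py), (qx, qy) in matches:
            err = (px + dx - qx) ** 2 + (py + dy - qy) ** 2
            score += min(err, cap)
        if score < best_score:           # lowest error yet: store the model
            best_model, best_score = (dx, dy), score
    return best_model, best_score


def select_inliers(matches, model, cap=4.0):
    """Return the matches whose squared error under the model is below cap."""
    dx, dy = model
    return [m for m in matches
            if (m[0][0] + dx - m[1][0]) ** 2
            + (m[0][1] + dy - m[1][1]) ** 2 < cap]


# Six correspondences that share the translation (5, -3), plus four false ones.
sample_matches = [
    ((0, 0), (5, -3)), ((10, 2), (15, -1)), ((3, 7), (8, 4)),
    ((6, 1), (11, -2)), ((9, 9), (14, 6)), ((2, 5), (7, 2)),
    ((1, 1), (20, 20)), ((4, 8), (-10, 3)), ((7, 3), (7, 30)),
    ((8, 6), (30, -12)),
]
model, score = ransac_translation(sample_matches)
inliers = select_inliers(sample_matches, model)
```

Replacing the capped sum with the median of the squared errors would give an LMedS-style score instead, which is one way to see why LMedS breaks down once more than half of the sample set consists of outliers.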
Once the outliers have been culled to the best of the ability of the filtering technique used, the remaining “good” points, called inliers, are used to compute a homography estimate. This process is often performed using a least-squares approximation technique, such as the Direct Linear Transform described in “Multiple View Geometry in Computer Vision”, R. Hartley and A. Zisserman, Cambridge University Press, 2003, ISBN 978-0-521-54051-3.
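One way to picture this least-squares step is the following Python sketch, which estimates a homography from four or more inlier correspondences. It is not the SVD-based Direct Linear Transform itself: to stay within the standard library, it swaps in the inhomogeneous least-squares variant, which fixes the bottom-right matrix entry to 1 and solves the resulting normal equations by Gaussian elimination. That variant assumes the true bottom-right entry is not zero. The function names `solve_linear` and `estimate_homography` are hypothetical.

```python
def solve_linear(M, rhs):
    """Solve M x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x


def estimate_homography(matches):
    """Least-squares homography from ((x, y), (u, v)) inlier matches.

    Assumption: the bottom-right entry of the homography is fixed to 1.
    """
    if len(matches) < 4:
        raise ValueError("at least four correspondences are required")
    rows, b = [], []
    for (x, y), (u, v) in matches:
        # Two linear equations per correspondence in the eight unknowns.
        rows.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        rows.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    # Normal equations: (A^T A) h = A^T b.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(8)]
           for i in range(8)]
    atb = [sum(r[i] * bi for r, bi in zip(rows, b)) for i in range(8)]
    h = solve_linear(ata, atb)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]
```

Forming the normal equations squares the condition number of the problem, so in practice point coordinates are usually normalised before estimation and an SVD-based solver from a numerical library is preferred. The sketch is intended only to show how inliers become a single matrix estimate.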
The above approaches for the filtering of point correspondences (LMedS, RANSAC etc.) have proved successful over the years. However, they all tend to leave a certain number of outliers present, and this is exacerbated if the amount of processing allowed is relatively low.