In many applications, it is necessary to determine the location or, more generally, the pose (multi-degree-of-freedom position) of an object in a scene using machine vision, and various techniques exist for performing such a search. Such techniques generally involve systematically visiting a plurality of locations in an image and, at each location visited, performing a computation using a model and a match-quality metric. The match-quality metric represents the quality of the match between the model at that location (pose) in the image and the pattern in the image. For example, one commonly used match-quality metric is normalized (cross) correlation, which computes a score between equal-sized regions of two images using a particular formula, such as the formula on page 653 of Pratt, Digital Image Processing, 2nd Ed., Wiley-Interscience. A simple high-level algorithm might compute the normalized correlation metric at all translations of the model image where the model image would fit entirely within the image of the scene, and report as the location the translation at which the score was greatest.
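The simple high-level algorithm described above can be sketched as follows. This is a minimal illustration only: it assumes the zero-mean form of normalized correlation, which may differ in detail from the formula in Pratt, and the function names and the tiny test image are hypothetical.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length pixel sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # Assumed convention: a constant region yields a score of 0.
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def window(image, top, left, height, width):
    """Flatten the height x width region whose top-left corner is (top, left)."""
    return [image[top + r][left + c] for r in range(height) for c in range(width)]

def exhaustive_search(image, model):
    """Score every translation at which the model fits entirely in the image."""
    mh, mw = len(model), len(model[0])
    ih, iw = len(image), len(image[0])
    flat_model = [p for row in model for p in row]
    best_score, best_pos = -2.0, None  # scores lie in [-1, 1]
    for top in range(ih - mh + 1):
        for left in range(iw - mw + 1):
            score = normalized_correlation(
                flat_model, window(image, top, left, mh, mw))
            if score > best_score:
                best_score, best_pos = score, (top, left)
    return best_pos, best_score

# Hypothetical data: a noisy copy of the model sits at row 1, column 1.
model = [[10, 200],
         [200, 10]]
image = [[50, 50, 50, 50],
         [50, 12, 190, 50],
         [50, 205, 9, 50],
         [50, 50, 50, 50]]
best_pos, best_score = exhaustive_search(image, model)
print(best_pos, round(best_score, 3))
```

Note that the loop bounds only admit translations at which the model lies wholly inside the image, which is exactly the restriction discussed in the paragraphs that follow.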
A limitation that is typical of many pattern location algorithms, and of normalized-correlation-based pattern location algorithms in particular, is an inability to find the correct location if the object is not entirely contained within the field of view, and thus only part of the pattern occurs in the image.
A fairly simple way to work around this limitation is to extend the original image beyond its boundaries, where the amount of extension determines the range of additional candidate locations that can be compared. The image might be extended in a number of ways, e.g., by filling the extra region with pixels of a constant value or with pixels representing a periodic repetition of the image. However, these techniques suffer from the fundamental limitation that the match-quality metric will eventually be applied to pixel data that are not truly meaningful, thus leading to misleading match-quality scores at these locations, and consequently to a possibly incorrect location found by the overall search algorithm.
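The two extension strategies mentioned above might be sketched as follows. The function names, the uniform margin on all four sides, and the sample image are assumptions made for illustration; the text does not prescribe a particular implementation.

```python
def extend_constant(image, margin, value):
    """Surround the image with `margin` pixels of a constant value."""
    h, w = len(image), len(image[0])
    out = [[value] * (w + 2 * margin) for _ in range(h + 2 * margin)]
    for r in range(h):
        for c in range(w):
            out[r + margin][c + margin] = image[r][c]
    return out

def extend_periodic(image, margin):
    """Surround the image with a periodic repetition of itself."""
    h, w = len(image), len(image[0])
    return [[image[(r - margin) % h][(c - margin) % w]
             for c in range(w + 2 * margin)]
            for r in range(h + 2 * margin)]

image = [[1, 2],
         [3, 4]]
padded_constant = extend_constant(image, margin=1, value=0)
padded_periodic = extend_periodic(image, margin=1)
for row in padded_constant:
    print(row)
for row in padded_periodic:
    print(row)
```

In both cases every pixel outside the original 2 x 2 region is synthetic, so any score computed over it reflects the padding rule rather than the scene.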
For example, consider the technique of extending the image by appending pixels of a constant value, and using normalized correlation as the metric. Even for a location where the model correctly matches the portion of a pattern that falls within the image, the normalized correlation match-quality metric will return a reduced value, because the region of the image that includes the pixels of a constant value usually does not match the portion of the model that extends beyond the image. Conversely, the metric may report an abnormally high value if the model happens to have a constant region at its border.
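A small numeric illustration of the reduced score, using one-dimensional pixel rows for brevity. The pixel values, the padding value of 0, and the zero-mean form of normalized correlation are all assumptions chosen for this sketch.

```python
import math

def normalized_correlation(a, b):
    """Zero-mean normalized correlation of two equal-length pixel sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0  # assumed convention for constant regions
    return cov / math.sqrt(var_a * var_b)

# Hypothetical six-pixel model.
model = [10, 200, 30, 180, 60, 220]

# Case 1: the whole pattern is inside the field of view.
fully_visible = [10, 200, 30, 180, 60, 220]

# Case 2: the last two pixels of the pattern fall outside the image,
# and the image has been extended with a constant value of 0.
clipped_and_padded = [10, 200, 30, 180, 0, 0]

score_visible = normalized_correlation(model, fully_visible)
score_padded = normalized_correlation(model, clipped_and_padded)
print(round(score_visible, 3), round(score_padded, 3))
```

The four pixels that remain in view match the model exactly in both cases, yet the padded score is substantially lower because the two padding pixels are compared against model pixels they have no relation to.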
In general, the metric is applied to the model at a given location (pose) in the image, where the pose may consist of just a translation, or may include additional degrees of freedom, such as rotation, scale, aspect, and shear. In this general case, the search algorithm usually generates a set of possible poses of the model, and applies the metric at each pose. The usual limitation is that only poses of the model wherein the entire model overlaps the image can be considered. In the typical normalized correlation approach, the set of poses considered would be all translations in integer pixel increments such that the model still falls within the image. Note that even when using normalized correlation as the metric, it is possible to consider more general poses, typically by digitally transforming the model (or image) pixels according to the pose (e.g., by rotating, scaling, and/or shearing), in addition to translations of the model.
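The restriction to poses at which the entire model overlaps the image can be sketched as a filter over a candidate pose set. The pose parameterization used here (translation, rotation angle in degrees, uniform scale), the function names, and the sampling grid are hypothetical; aspect and shear are omitted to keep the sketch short.

```python
import math

def model_corners(model_w, model_h, pose):
    """Map the four corners of the model rectangle into image coordinates.

    pose = (tx, ty, angle_deg, scale) is an assumed parameterization.
    """
    tx, ty, angle_deg, scale = pose
    c = scale * math.cos(math.radians(angle_deg))
    s = scale * math.sin(math.radians(angle_deg))
    corners = [(0, 0), (model_w, 0), (model_w, model_h), (0, model_h)]
    return [(tx + c * x - s * y, ty + s * x + c * y) for x, y in corners]

def fits_entirely(model_w, model_h, pose, image_w, image_h):
    """True if the transformed model lies wholly inside the image.

    Both regions are convex, so checking the four corners is sufficient.
    """
    return all(0 <= x <= image_w and 0 <= y <= image_h
               for x, y in model_corners(model_w, model_h, pose))

# Candidate poses: integer translations combined with a few rotations.
candidate_poses = [(tx, ty, angle, 1.0)
                   for tx in range(0, 10)
                   for ty in range(0, 10)
                   for angle in (0, 15, 30)]

# Keep only poses at which a 4 x 4 model fits inside a 10 x 10 image.
admissible = [p for p in candidate_poses if fits_entirely(4, 4, p, 10, 10)]
print(len(candidate_poses), len(admissible))
```

Every pose rejected by this filter is one at which a conventional search would not evaluate the metric at all, including poses where most of the pattern is still visible.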
However, known approaches degrade or fail when the model extends beyond the boundary of the image, providing incorrect locations of the model.