This invention relates to template matching within a data domain, and more particularly to a method for locating a given data template within a data domain.
Template matching in the context of an image search is a process of locating the position of a subimage within an image of the same, or more typically, a larger size. The subimage is referred to as the template and the larger image is referred to as the search area. The template matching process involves shifting the template over the search area and computing a similarity between the template and the window of the search area over which the template lies. Another step involves determining a single or a set of matched positions in which there is a good similarity measure between the template and the search area window.
A common technique for measuring similarity in template matching and image registration is cross-correlation. A correlation measure is determined between the template and respective windows of the search area to find the template position which has maximum correlation. For a two-dimensional search area the correlation function generally is computed for all translations of the template within the search area. A statistical correlation measure is a common approach in which window areas are spatially convolved with the template using spatial filter functions. Because this approach is extremely expensive in terms of computation time, a more common computer implementation is to use a sum of absolute differences.
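By way of illustration only, the two similarity measures mentioned above may be sketched as follows. The sketch assumes grayscale images held as lists of rows; the names window, sad, corr_coef and exhaustive_search are hypothetical and are not taken from the cited techniques.

```python
from math import sqrt

def window(image, top, left, h, w):
    """Return the h-by-w window of image whose top-left corner is (top, left)."""
    return [row[left:left + w] for row in image[top:top + h]]

def sad(a, b):
    """Sum of absolute differences between two equal-size 2-D lists."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def corr_coef(a, b):
    """Correlation coefficient between two equal-size 2-D lists."""
    xs = [x for row in a for x in row]
    ys = [y for row in b for y in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # assumption: a flat window scores 0

def exhaustive_search(image, template, score, pick=max):
    """Score every translation of the template and return the best (top, left)."""
    h, w = len(template), len(template[0])
    positions = [(top, left)
                 for top in range(len(image) - h + 1)
                 for left in range(len(image[0]) - w + 1)]
    return pick(positions,
                key=lambda p: score(window(image, p[0], p[1], h, w), template))

image = [
    [0, 0, 0, 0, 0],
    [0, 1, 2, 0, 0],
    [0, 3, 4, 0, 0],
    [0, 0, 0, 0, 0],
]
template = [[1, 2], [3, 4]]

by_corr = exhaustive_search(image, template, corr_coef)     # maximize correlation
by_sad = exhaustive_search(image, template, sad, pick=min)  # minimize SAD
```

Both calls visit every translation of the template, which is the cost the remainder of this description seeks to reduce.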
Rosenfeld et al., in "Coarse-Fine Template Matching," IEEE Transactions on Systems, Man and Cybernetics (February 1977, pp. 104-107), describe an approach where a 'reduced-resolution' template is used during a first, coarse evaluation stage. The template is divided into blocks of equal size (e.g., 'm' pixels per block). The average of each block is computed. For each pixel of the search area an average also is calculated over a neighborhood of the same size as the reduced-resolution template (e.g., m pixels). The average absolute difference between each template block average and the picture neighborhood average then is computed for each pixel of the search area. If the average absolute difference for any pixel of the search area is below a threshold value, then a possible match has been identified. Next, the full resolution template is compared to a window of the search area about each pixel point where the average absolute difference in the prior coarse evaluation step was below the threshold value. This fine evaluation step identifies if there actually is a good correlation.
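A minimal sketch of such a two-stage evaluation is given below, under stated assumptions: blocks are square with side b, the template dimensions are divisible by b, and the fine stage uses a mean absolute difference rather than any particular measure from the cited paper. The function names and the two thresholds are hypothetical.

```python
def window(image, top, left, h, w):
    """Return the h-by-w window of image whose top-left corner is (top, left)."""
    return [row[left:left + w] for row in image[top:top + h]]

def block_means(patch, b):
    """Average each non-overlapping b-by-b block of patch."""
    return [
        [sum(patch[r + i][c + j] for i in range(b) for j in range(b)) / (b * b)
         for c in range(0, len(patch[0]), b)]
        for r in range(0, len(patch), b)
    ]

def mean_abs_diff(a, b):
    """Average absolute difference between two equal-size 2-D lists."""
    diffs = [abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return sum(diffs) / len(diffs)

def coarse_fine_match(image, template, b, coarse_thresh, fine_thresh):
    h, w = len(template), len(template[0])
    coarse_template = block_means(template, b)
    matches = []
    for top in range(len(image) - h + 1):
        for left in range(len(image[0]) - w + 1):
            win = window(image, top, left, h, w)
            # Coarse stage: compare block averages only.
            if mean_abs_diff(block_means(win, b), coarse_template) > coarse_thresh:
                continue
            # Fine stage: compare at full resolution where the coarse stage passed.
            if mean_abs_diff(win, template) <= fine_thresh:
                matches.append((top, left))
    return matches

# Example: a 4x4 template with values 1..16 embedded at (2, 3) in an 8x8 image.
template = [[4 * r + c + 1 for c in range(4)] for r in range(4)]
image = [[0] * 8 for _ in range(8)]
for r in range(4):
    image[2 + r][3:7] = template[r]

matches = coarse_fine_match(image, template, b=2, coarse_thresh=0.5, fine_thresh=0.0)
```

Windows that pass the coarse stage but fail the fine stage correspond to the false alarms discussed in connection with these methods.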
Goshtasby et al., in "A Two-Stage Correlation Approach to Template Matching," IEEE Transactions on Pattern Analysis and Machine Intelligence (Vol. PAMI-6, No. 3, May 1984), note the need for an accurate threshold value for the first stage evaluation. They describe a method for deriving the threshold value based upon sub-template size and false dismissal probability.
The coarse-fine or two-stage methods subsample the template to match with the image. Subsampling the template is not a trivial task and contributes significant processing cost. In addition, false alarms result in wasted, or ineffective use of, processing time. Accordingly, there is a need for a more efficient method of template matching.
In the area of motion estimation for digital video and multimedia communications a three stage correlation strategy is used. In a first step, a search step size of 4 is used. Once a maximum point is found, the step size is reduced to 2 to evaluate the neighborhood of the previously determined point to choose the next search point. The third step is to search all neighboring points to find the best match. This approach speeds up the search process, but also has a high probability of mismatches or suboptimal matches. It also has difficulty handling cases in which multiple match points occur. Thus, there is a need for a more reliable, fast search method for correlating a template to windows of a search area.
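The three stage strategy may be sketched as follows. This is an illustration under assumptions rather than any standardized implementation: the sum of absolute differences is used as the matching cost, each stage evaluates the current point and its eight neighbors at the current step size, and the names sad_at and three_step_search are hypothetical.

```python
def sad_at(image, template, top, left):
    """Sum of absolute differences with the template placed at (top, left)."""
    return sum(
        abs(image[top + r][left + c] - template[r][c])
        for r in range(len(template))
        for c in range(len(template[0]))
    )

def three_step_search(image, template, start, steps=(4, 2, 1)):
    """Refine the position estimate using step sizes of 4, then 2, then 1."""
    h, w = len(template), len(template[0])
    max_top, max_left = len(image) - h, len(image[0]) - w
    best = start
    for step in steps:
        candidates = [
            (best[0] + dr * step, best[1] + dc * step)
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
        ]
        # Discard candidates whose window would fall outside the image.
        candidates = [(t, l) for t, l in candidates
                      if 0 <= t <= max_top and 0 <= l <= max_left]
        best = min(candidates, key=lambda p: sad_at(image, template, p[0], p[1]))
    return best

# Example: an intensity ramp, with the template cut from position (6, 9).
image = [[32 * r + c for c in range(16)] for r in range(16)]
template = [row[9:12] for row in image[6:9]]

found = three_step_search(image, template, start=(2, 5))
missed = three_step_search(image, template, start=(7, 7))
```

In this particular example the search started at (2, 5) reaches the true position, whereas the search started at (7, 7) settles elsewhere, which is consistent with the mismatch risk noted above.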
According to the invention, a correlation auto-predictive search method is used to compare a template to windows of a search area. The location or locations where the template has the highest correlation coefficient with the underlying window are selected as a match for the template. Local maximum criteria or other criteria then are used to select one or more match points within the search area. The principle of a correlation auto-predictive search as conceived by the inventors is (1) to extract statistical information from the template itself to determine the search step size, and (2) to perform fast searching based on this extracted information.
According to one aspect of the invention, during a first analytical step, autocorrelation is performed on the template to generate desired statistics. To use autocorrelation, the original template is padded with additional pixels to increase the template size. In one approach, where the search area is assumed to be periodic, circular padding is used. In such approach the padded template is an array of copies of the original template. This increases the template size to the search area (image) size.
In another approach linear padding is used in which pixels are added around the original template to increase the size of the template to the search area (image) size. According to an aspect of this invention, a mean pixel value of the original template is used as a padding constant (i.e., pixel value for the added pixels). Alternatively, a value of zero or another fixed value may be used as the padding constant for the padded pixels.
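The two padding approaches may be sketched as follows. The sketch assumes that the original template is placed at the center of the padded array; the names pad_circular and pad_linear and the parameter fill are hypothetical.

```python
def pad_circular(template, out_h, out_w):
    """Tile copies of the template so that one whole copy sits at the center."""
    h, w = len(template), len(template[0])
    top, left = (out_h - h) // 2, (out_w - w) // 2
    return [
        [template[(r - top) % h][(c - left) % w] for c in range(out_w)]
        for r in range(out_h)
    ]

def pad_linear(template, out_h, out_w, fill=None):
    """Surround the template with a constant; default is the template mean."""
    h, w = len(template), len(template[0])
    if fill is None:
        fill = sum(sum(row) for row in template) / (h * w)
    top, left = (out_h - h) // 2, (out_w - w) // 2
    padded = [[fill] * out_w for _ in range(out_h)]
    for r in range(h):
        padded[top + r][left:left + w] = template[r]
    return padded

template = [[1, 2], [3, 4]]
circular = pad_circular(template, 6, 6)
linear_mean = pad_linear(template, 6, 6)          # padding constant 2.5
linear_zero = pad_linear(template, 6, 6, fill=0)  # alternative fixed constant
```

Passing fill=0 or another fixed value corresponds to the alternative padding constants mentioned above.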
After generating the padded template, cross-correlation is performed between the padded template and the original template. The autocorrelation is highest at the center of the padded template as this area is formed by the original template. This corresponds to a peak in a graph of the autocorrelation of the padded template to original template. The width of the peak, either along a horizontal direction of the padded template, or along a vertical direction of the padded template, may be measured. The height of the maximum peak is 1.0. The horizontal width is taken as the distance along the horizontal axis between autocorrelation values of 0.5 to each side of the maximum peak. Similarly, the vertical width is taken as the distance along the vertical axis between autocorrelation values of 0.5 to each side of the maximum peak. Such value, 0.5, is referred to herein as the cut value. A different cut value may be used.
According to another aspect of the invention, the autocorrelation between the padded template and the original template is not calculated for every point of the padded template. At the center of the padded template, the correlation is known to be 1.0 because the original template is located at such center of the padded template. The correlation then is derived about the center of the padded template along both horizontal and vertical axes. As the correlations are derived during this stepping along the axes, there comes a point where the correlation decreases to the cut value. Along the horizontal axis, there is a cut value reached to either direction of center. The horizontal distance between these two locations where the correlation has decreased to the cut value is the horizontal width. Further correlations along such axis need not be derived. Along the vertical axis, there also is a cut value reached to either direction of center. The vertical distance between these two locations where the correlation has decreased to the cut value is the vertical width. Further correlations along such vertical axis need not be derived. Thus, correlation coefficients are derived only for the steps along the axes away from center, and only to the step where the cut value is reached.
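The stepping procedure described above may be sketched as follows. The sketch assumes a linearly padded template with the original template at row 5, column 4, and treats the edge of the padded template as a stopping point if the cut value is not reached first. The names steps_to_cut and peak_widths are hypothetical.

```python
from math import sqrt

def corr_coef(a, b):
    """Correlation coefficient between two equal-size 2-D lists."""
    xs = [x for row in a for x in row]
    ys = [y for row in b for y in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # assumption: a flat window scores 0

def steps_to_cut(padded, template, top, left, dr, dc, cut=0.5):
    """Count steps from center along (dr, dc) until correlation falls to cut."""
    h, w = len(template), len(template[0])
    k = 0
    while True:
        k += 1
        t, l = top + k * dr, left + k * dc
        if not (0 <= t <= len(padded) - h and 0 <= l <= len(padded[0]) - w):
            return k  # assumption: stop at the edge of the padded template
        win = [row[l:l + w] for row in padded[t:t + h]]
        if corr_coef(win, template) <= cut:
            return k  # no further correlations are derived along this direction

def peak_widths(padded, template, top, left, cut=0.5):
    """Return (horizontal width, vertical width) of the autocorrelation peak."""
    horizontal = (steps_to_cut(padded, template, top, left, 0, 1, cut)
                  + steps_to_cut(padded, template, top, left, 0, -1, cut))
    vertical = (steps_to_cut(padded, template, top, left, 1, 0, cut)
                + steps_to_cut(padded, template, top, left, -1, 0, cut))
    return horizontal, vertical

# Example: a 5x4 template containing a vertical edge, padded with its mean (2.0).
template = [[0, 0, 4, 4] for _ in range(5)]
padded = [[2.0] * 12 for _ in range(15)]
top, left = 5, 4
for r, row in enumerate(template):
    padded[top + r][left:left + 4] = row

widths = peak_widths(padded, template, top, left)
```

Because the example template varies only in the horizontal direction, its autocorrelation peak is narrow horizontally and wide vertically, and only the correlations along the two axes out to the cut value are computed.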
Next, horizontal step size and vertical step size are derived from the horizontal width and vertical width, respectively. In one embodiment the horizontal step size is 0.5 times the horizontal width. Similarly, a vertical step size is 0.5 times the vertical width. These step sizes are the correlative auto-predictive search (CAPS) step sizes. No additional correlation values need be derived between the padded template and the original template. The CAPS step sizes then are used for template matching between the original template and the search area.
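As a short illustration, the step sizes may be computed from the widths as follows. Truncation to a whole number of pixels and the lower bound of one pixel are assumptions of this sketch, as is the name caps_step_sizes.

```python
def caps_step_sizes(horizontal_width, vertical_width, factor=0.5):
    """Derive (horizontal step, vertical step) as a fraction of the peak widths."""
    # Assumption: truncate to whole pixels and never step by less than one pixel.
    step_x = max(1, int(horizontal_width * factor))
    step_y = max(1, int(vertical_width * factor))
    return step_x, step_y

steps = caps_step_sizes(6, 8)  # widths of 6 and 8 pixels give steps of 3 and 4
```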
According to another aspect of this invention, a fast search then is performed between the template and the search area using the derived step sizes. Then, for correlations having a correlation coefficient exceeding a specific value, a full search is performed locally in each area where the fast search resulted in a correlation coefficient exceeding the specific value.
According to another aspect of the invention, the fast search is performed as a set of correlations between the original template and the search area. Specifically, a correlation is performed between the template and a window area within the search area. The set of correlations is selected by choosing window areas based upon the step size. For example, one window is the center of the search area. A positive or negative step then is taken along an axis using the corresponding horizontal width or vertical width to derive a correlation for another window. Any correlation which results in a correlation coefficient exceeding a specific value is considered a local match point.
According to another aspect of this invention, the specific value used to identify a local match during the fast search is the cut value times a threshold value. The cut value is the same cut value used during the first analytical step described above, to derive statistics from the template. The threshold value is assigned based upon image characteristics. Typical threshold values are between 0.8 and 0.9.
One or more locations are identified as local match points based upon whether the correlation coefficient between the template and that location exceeds the specific value (e.g., cut value times threshold value).
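The fast search may be sketched as follows. The sketch assumes a grid of windows anchored on the center window of the search area and spaced by the step sizes, and it uses a cut value of 0.5 and a threshold value of 0.8 for illustration. The name fast_search is hypothetical.

```python
from math import sqrt

def corr_coef(a, b):
    """Correlation coefficient between two equal-size 2-D lists."""
    xs = [x for row in a for x in row]
    ys = [y for row in b for y in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # assumption: a flat window scores 0

def fast_search(image, template, step_x, step_y, cut=0.5, threshold=0.8):
    """Correlate only at grid windows; return (top, left, coefficient) hits."""
    h, w = len(template), len(template[0])
    max_top, max_left = len(image) - h, len(image[0]) - w
    limit = cut * threshold  # the specific value a local match must exceed
    # Grid positions reachable from the center window in whole steps.
    tops = range((max_top // 2) % step_y, max_top + 1, step_y)
    lefts = range((max_left // 2) % step_x, max_left + 1, step_x)
    hits = []
    for top in tops:
        for left in lefts:
            win = [row[left:left + w] for row in image[top:top + h]]
            c = corr_coef(win, template)
            if c > limit:
                hits.append((top, left, c))
    return hits

# Example: a 5x4 template embedded at (5, 4) in a uniform 15x12 search area.
template = [[0, 0, 4, 4] for _ in range(5)]
image = [[2] * 12 for _ in range(15)]
for r, row in enumerate(template):
    image[5 + r][4:8] = row

hits = fast_search(image, template, step_x=1, step_y=4)
best = max(hits, key=lambda hit: hit[2])
```

With a vertical step of 4, only three rows of windows are evaluated instead of eleven, and each hit marks a vicinity for the subsequent local full search.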
According to another aspect of the invention, a full search then is performed in the vicinity of any location which is a local match. A full search of such vicinity encompasses performing a correlation between the template and every potential search area window between the local match location window and the windows at the prior and next step in each of the horizontal and vertical axes. For example, if the horizontal step size is 3 pixels and the vertical step size is 4 pixels, then correlations are performed for windows ±1 pixel and ±2 pixels along the horizontal axis and ±1 pixel, ±2 pixels and ±3 pixels along the vertical axis. In addition correlations are performed for windows off the axes within the area delineated by the step sizes. Thus, the full search of the vicinity of the local match for this example includes 34 correlations between the template and the search area. Any locations among the local match locations and the locations tested during the full search of the vicinity which exceed the threshold value are considered template matches. In some embodiments, only the location having the highest correlation is considered a match. In other embodiments there may be multiple matches. Thus, the top matches or all matches above the threshold are selected as resultant matches.
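The local full search may be sketched as follows. The sketch assumes that windows falling outside the search area are skipped and that the same specific value (0.4 here) is used to accept a window. The names neighborhood and local_full_search are hypothetical.

```python
from math import sqrt

def corr_coef(a, b):
    """Correlation coefficient between two equal-size 2-D lists."""
    xs = [x for row in a for x in row]
    ys = [y for row in b for y in row]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0  # assumption: a flat window scores 0

def neighborhood(top, left, step_x, step_y):
    """Window positions strictly between the prior and next step on each axis."""
    return [(top + dy, left + dx)
            for dy in range(-(step_y - 1), step_y)
            for dx in range(-(step_x - 1), step_x)
            if (dy, dx) != (0, 0)]

def local_full_search(image, template, hit, step_x, step_y, limit):
    """Correlate at every window in the vicinity of a local match."""
    h, w = len(template), len(template[0])
    results = []
    for top, left in neighborhood(hit[0], hit[1], step_x, step_y):
        # Assumption: skip windows that would extend past the search area.
        if 0 <= top <= len(image) - h and 0 <= left <= len(image[0]) - w:
            win = [row[left:left + w] for row in image[top:top + h]]
            c = corr_coef(win, template)
            if c > limit:
                results.append((top, left, c))
    return results

# Step sizes of 3 and 4 pixels give 5 * 7 - 1 = 34 windows around a local match.
count = len(neighborhood(10, 10, step_x=3, step_y=4))

# Example: refine around a local match at (1, 4) with steps of 1 and 4.
template = [[0, 0, 4, 4] for _ in range(5)]
image = [[2] * 12 for _ in range(15)]
for r, row in enumerate(template):
    image[5 + r][4:8] = row

refined = local_full_search(image, template, (1, 4), 1, 4, limit=0.4)
best = max(refined, key=lambda result: result[2])
```

In this example the vicinity of (1, 4) stops one row short of the embedded template at (5, 4), which lies on the next grid point and is therefore evaluated by the fast search itself.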
One advantage of the invention is that template matches are found more quickly and with greater reliability than prior correlation search methods. In particular, this search methodology is more tolerant of noise and offsets of the template as demonstrated empirically by forming a search area from copies of templates altered by low pass filtering or Gaussian noise. These and other aspects and advantages of the invention will be better understood by reference to the following detailed description taken in conjunction with the accompanying drawings.