Machine vision has used numerous algorithms over the years to locate an object of interest in the Field Of View (FOV). Typically, the user captures a reference image and identifies the object of interest by drawing a Region Of Interest (ROI) around it using a Graphical User Interface (GUI). The user also draws a search ROI that bounds the maximum expected movement of the object of interest. Either the entire grayscale ROI or a set of unique characteristics of the object of interest is saved in memory as a reference template. During an inspection, the vision algorithm searches for the object of interest within the search ROI. If an object is found and the match is within acceptable limits, the algorithm can report the match level together with the location and angular rotation of the object within the FOV. Key parameters are accuracy of match, accuracy of location, accuracy of rotation, and execution time of the algorithm.
A vision algorithm that the machine vision industry has traditionally used is the Normalized Grayscale Correlation (NGC) algorithm. This algorithm performs a cross-correlation between a “reference image” and an “inspection image” using normalized values. Cross-correlation is the mathematical process of sliding the reference template across the search ROI and creating a 2D array of correlation results. Each individual correlation result indicates the degree of match between the reference pattern and that particular location in the search ROI.
Normalization is a modification to the correlation algorithm that reduces the effects of changes in illumination on the correlation results.
Correlation is sensitive to even minor rotational changes of the object and will fail if the inspection image is rotated more than approximately ±5 degrees from the reference template. Therefore, the vision algorithm handles rotational variations of the inspection image by performing correlation with rotated versions of the reference template.
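One form of the NGC formula is:

$$r = \frac{\text{Numerator}}{\text{Denominator}}$$

where

$$\text{Numerator} = N \sum (I \cdot R) - \left(\sum I\right)\left(\sum R\right)$$

$$\text{Denominator} = \sqrt{\left(N \sum I^{2} - \left(\sum I\right)^{2}\right)\left(N \sum R^{2} - \left(\sum R\right)^{2}\right)}$$

N = number of pixels in reference template
I = inspection image
R = reference template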
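As an illustration of the formula above, the following minimal Python sketch evaluates r for one placement of the template. The function name `ngc` and the choice to return 0.0 for a featureless (zero-variance) patch are assumptions made for this sketch and are not taken from the text.

```python
import math


def ngc(inspection, reference):
    """Return the NGC coefficient r for two equally sized flat pixel lists.

    inspection: pixels of the inspection image under the template (I)
    reference:  pixels of the reference template (R)
    """
    if len(inspection) != len(reference):
        raise ValueError("inspection patch and reference must be the same size")
    n = len(reference)                                        # N
    sum_i = sum(inspection)                                   # sum of I
    sum_r = sum(reference)                                    # sum of R
    sum_ii = sum(p * p for p in inspection)                   # sum of I^2
    sum_rr = sum(p * p for p in reference)                    # sum of R^2
    sum_ir = sum(a * b for a, b in zip(inspection, reference))  # sum of I*R

    numerator = n * sum_ir - sum_i * sum_r
    denom_sq = (n * sum_ii - sum_i ** 2) * (n * sum_rr - sum_r ** 2)
    if denom_sq <= 0:
        # Assumption of this sketch: a flat patch or flat template has no
        # defined correlation, so report "no match" instead of dividing by 0.
        return 0.0
    return numerator / math.sqrt(denom_sq)
```

Because both sums are normalized, an inspection patch that differs from the reference only by a positive gain and an offset (for example, a uniform brightening) still yields r = 1, which is the illumination tolerance that normalization is meant to provide.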
From the above formula it is clear that calculating a single correlation value requires numerous multiplications, additions, subtractions, divisions, and square roots. This calculation must be repeated at every position as the template is slid across the entire search ROI. The number of correlation results is (m2−m1+1)*(n2−n1+1), where the search ROI is m2×n2 and the object ROI is m1×n1. As an example, assume that m2=250 and n2=200 (a 250×200 search ROI) and that m1=100 and n1=100 (a 100×100 object ROI); the total number of correlation values is then 151×101=15251. This operation requires:
    305106257 multiplies
    457550000 adds
    45753 subtracts
    15251 divides
    15251 square roots
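The sliding search described above can be sketched as follows. This is a minimal, unoptimized illustration in plain Python, assuming images are stored as lists of rows; the names `ngc` and `correlation_map` are hypothetical, and returning 0.0 for a featureless patch is an assumption of the sketch.

```python
import math


def ngc(inspection, reference):
    """NGC coefficient for two equally sized flat pixel lists."""
    n = len(reference)
    sum_i, sum_r = sum(inspection), sum(reference)
    sum_ii = sum(p * p for p in inspection)
    sum_rr = sum(p * p for p in reference)
    sum_ir = sum(a * b for a, b in zip(inspection, reference))
    denom_sq = (n * sum_ii - sum_i ** 2) * (n * sum_rr - sum_r ** 2)
    if denom_sq <= 0:
        return 0.0  # assumption: featureless patch counts as "no match"
    return (n * sum_ir - sum_i * sum_r) / math.sqrt(denom_sq)


def correlation_map(search, template):
    """Slide the template over the search ROI and return a 2D list of r values.

    Both arguments are lists of rows. The result has one entry per template
    placement: (search rows - template rows + 1) rows by
    (search cols - template cols + 1) columns.
    """
    search_h, search_w = len(search), len(search[0])
    tmpl_h, tmpl_w = len(template), len(template[0])
    reference = [p for row in template for p in row]
    result = []
    for y in range(search_h - tmpl_h + 1):
        out_row = []
        for x in range(search_w - tmpl_w + 1):
            # Pixels of the search ROI currently covered by the template.
            patch = [search[y + dy][x + dx]
                     for dy in range(tmpl_h) for dx in range(tmpl_w)]
            out_row.append(ngc(patch, reference))
        result.append(out_row)
    return result
```

For the 250×200 search ROI and 100×100 object ROI of the example, this map would contain 151×101 = 15251 entries, each requiring a full pass over the 10000 template pixels, which is why the operation counts grow so quickly.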
Machine vision applications require the vision algorithm to provide results in real time. This requirement places constraints on the implementation of the NGC algorithm. To reduce the computational time of correlating the reference image with the inspection image, lower resolution versions are created of both the reference image ROI and the inspection image search ROI. The creation of these lower resolution images shall be referred to as decimation.
One approach to creating the decimated image is to take the average of each 2×2 array of pixels to form one pixel value at the next level up. This filters out the higher spatial frequency content, leaving only the lower frequency characteristics of the image. The result is 2 pyramids, one of the reference image ROI and the other of the inspection image search ROI. Each level of the pyramid reduces the size of the reference image ROI and the inspection image search ROI by a factor of 2 in width and 2 in height. The spatial frequency bandwidth is reduced by a factor of 2 at each successive level. The highest resolution images are at the bottom of the pyramid. A coarse correlation search is then performed at the top of the pyramid. Table 1 shows the reduction in the number of arithmetic operations required at the top of the pyramid for 3 levels of decimation.
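The 2×2 averaging approach to decimation can be sketched as below. This is an illustrative Python sketch, assuming images are lists of rows; the names `decimate` and `build_pyramid` are hypothetical, and dropping a trailing odd row or column is an assumption chosen because it matches the level sizes listed in Table 1 (for example 125 becoming 62).

```python
def decimate(image):
    """Halve width and height by averaging each non-overlapping 2x2 block.

    image is a list of rows. If a dimension is odd, the trailing row or
    column is dropped (assumption of this sketch).
    """
    out_h = len(image) // 2
    out_w = len(image[0]) // 2
    return [
        [
            (image[2 * y][2 * x] + image[2 * y][2 * x + 1]
             + image[2 * y + 1][2 * x] + image[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(out_w)
        ]
        for y in range(out_h)
    ]


def build_pyramid(image, levels):
    """Return [full resolution, 2x2, 4x4, ...] with `levels` decimated images."""
    pyramid = [image]
    for _ in range(levels):
        pyramid.append(decimate(pyramid[-1]))
    return pyramid
```

Calling `build_pyramid` once on the reference image ROI and once on the inspection image search ROI yields the two pyramids described above, with index 0 as the bottom (highest resolution) and the last entry as the top.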
TABLE 1

Level   m1    n1    m2    n2    Multiplies   Adds        Subtracts   Divides   Square Roots
1 × 1   100   100   250   200   305106257    457550000   45753       15251     15251
2 × 2   50    50    125   100   19401882     29075000    11628       3876      3876
4 × 4   25    25    62    50    1240567      1853750     2964        988       988
8 × 8   12    12    31    25    82186        121248      840         280       280
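The counts in Table 1 can be reproduced with the short Python sketch below. The accounting it uses is inferred from the table values and is not stated in the text: it assumes the sums over the reference template are computed once, the sums over the inspection patch are computed at every template position, and each position additionally costs five scalar multiplies, three subtracts, one divide, and one square root. The function name `ngc_operation_counts` is hypothetical.

```python
def ngc_operation_counts(m1, n1, m2, n2):
    """Estimate arithmetic operations for an NGC search.

    m1 x n1 is the object (template) ROI, m2 x n2 is the search ROI.
    The breakdown is an inferred accounting, not one given in the text.
    """
    positions = (m2 - m1 + 1) * (n2 - n1 + 1)  # number of correlation values
    n = m1 * n1                                # pixels in the template
    return {
        "positions": positions,
        # Per position: I*I and I*R over n pixels, plus 5 scalar multiplies.
        # Once: R*R over n pixels, plus 2 scalar multiplies.
        "multiplies": positions * (2 * n + 5) + n + 2,
        # Per position: sums of I, I*I, I*R. Once: sums of R and R*R.
        "adds": positions * 3 * n + 2 * n,
        "subtracts": 3 * positions,
        "divides": positions,
        "square_roots": positions,
    }
```

For example, `ngc_operation_counts(100, 100, 250, 200)` gives 15251 positions and 305106257 multiplies, matching the 1 × 1 row, while `ngc_operation_counts(12, 12, 31, 25)` gives 280 positions and 82186 multiplies, matching the 8 × 8 row.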
Points of interest, that is, locations with large correlation values, are then identified from the results of the coarse correlation search. Once points of interest have been identified at the top level of the pyramid, an approach is required to follow the point(s) of interest down the pyramid to the highest level of resolution without following the wrong path. This reduces the total number of correlation values required to find the object in the inspection image, resulting in a faster inspection time.
A shortcoming of the decimation approach is that blurring (loss of spatial frequency) of the image increases at each successive level of the pyramid. If decimation is carried far enough it would result in a uniformly smooth image with no features to be used for identification. Therefore, to increase performance it is desirable to have several levels to the pyramid, but to stop before being unable to find the object of interest due to a loss in the spatial frequency content. The amount of blurring introduced by decimation is also dependent upon the density of the spatial frequency content of the reference image ROI. Higher levels of decimation can be achieved, without significant loss of image content, with images containing lower spatial frequency. A means is therefore needed to determine the maximum number of decimation levels a reference image can tolerate without the loss of the required spatial frequency content necessary to identify the object of interest.
To accurately locate the object of interest the reference image must possess a minimum amount of both translational and rotational spatial frequency content. This requires that the reference image ROI have sufficient high spatial frequency content to produce a sharp correlation peak in the search ROI. Images with only low spatial frequency content have a flatter correlation peak causing minor pattern variations or noise to influence the location of the object of interest in the inspection image. A means is therefore required to determine the amount of both translational and rotational spatial frequency content of the reference image ROI.
Finally, the reference image ROI must not be distorted due to pixel saturation. Saturated pixels cause a loss of detail and may not move as the object of interest moves within the FOV. A means is therefore required to ensure that saturated pixels are minimized within the reference image ROI. The present invention disclosed herein overcomes the above limitations of the prior art machine-based vision systems.