Object recognition is part of many computer vision applications. It is particularly useful for industrial inspection tasks, where an image of an object must often be aligned with a model of the object. The transformation (pose) obtained by the object recognition process can be used for various tasks, e.g., robot control, pick and place operations, quality control, or inspection tasks. In most cases, the model of the object is generated from an image of the object. In addition, the model generation process can often be influenced by a set of parameters that must be specified by the user. In order to increase the degree of automation and to improve the ease of use of the recognition system, it is highly desirable to determine these parameters automatically.
The present invention provides methods for automatic parameter determination in machine vision in general, and in object recognition in particular. Many machine vision systems use algorithms that require the user to specify one or more parameters in order to adapt the behavior of the algorithm to the current application (see Lisa Gottesfeld Brown. A survey of image registration techniques. ACM Computing Surveys, 24(4): 325-376, December 1992, William J. Rucklidge. Efficiently locating objects using the Hausdorff distance. International Journal of Computer Vision, 24(3): 251-270, 1997, U.S. Pat. No. 6,005,978, EP-A-1 193 642, and Markus Ulrich, Carsten Steger, and Albert Baumgartner. Real-time object recognition using a modified generalized Hough transform, Pattern Recognition, 36(11): 2557-2570, 2003, for example). This is undesirable for several reasons. First, the user has to know details about the functionality of the algorithm to be able to choose reasonable parameter values. However, in many cases the complexity of the algorithm should be hidden from the user to ensure that the system can be easily operated even by non-experts. Second, as the number of input parameters increases, it often becomes difficult even for experts to find the optimum values for the set of parameters. This is because some of the parameters may interact, or because the influence of some parameters on the result cannot be predicted well. Consequently, the user has to try different combinations to find the optimum values, which is not feasible when dealing with systems that require the user to specify more than one or two input parameters. Another reason for automatically determining the parameter values of an algorithm is to improve its flexibility. For example, conditions often change in an industrial production process, requiring the parameters to be adapted to the new conditions.
Thus, a time-consuming manual adaptation should be avoided to prevent an interruption of the production process. The present invention provides methods to automatically determine the most frequently used parameters in machine vision solely based on the input image itself. The method is explained in detail using an object recognition system (e.g., EP-A-1 193 642, Ulrich et al. (2003)) as an example. In particular, the model generation process based on a model image of the object is explained. However, other systems, for example those that use edge extraction algorithms, can also benefit from the present invention. Consequently, the following description is only illustrative and should not be construed to limit the scope of the invention.
The methods according to the various aspects of the present invention involve the determination of the following parameters:
The contrast of an object in an image. In many object recognition systems the object is described by its edges (e.g., Gunilla Borgefors. Hierarchical chamfer matching: A parametric edge matching algorithm. IEEE Transactions on Pattern Analysis and Machine Intelligence, 10(6): 849-865, November 1988, Rucklidge (1997), U.S. Pat. No. 6,005,978, EP-A-1 193 642, Ulrich et al. (2003)). The determination of the contrast parameter corresponds to finding the optimum threshold for the edge amplitude in the model image. The optimum value is found if all important characteristic details of the object exceed the threshold while noise and less important details fall below the threshold and hence are not included in the model. Sometimes, a more sophisticated thresholding operation is applied to the edge amplitude, which requires a lower and a higher threshold parameter to be specified. This operation is called hysteresis thresholding (see J. Canny, Finding Edges and Lines in Images: Report, AI-TR-720, M.I.T. Artificial Intelligence Lab., Cambridge, Mass., 1983). All points having an edge amplitude that exceeds the higher threshold are immediately accepted (“secure” points). Conversely, all points with an edge amplitude less than the lower threshold are immediately rejected. “Potential” points with edge amplitudes between both thresholds are accepted if they are connected to “secure” points by a path of “potential” points. The present invention provides a method for automatically determining one threshold value if the conventional thresholding operation is to be used and a method for automatically determining two threshold values if the hysteresis thresholding operation is to be used.
The minimum size of object parts. In order to increase the robustness of the recognition process, it is often useful to eliminate small object parts from the model. Small object parts are more susceptible to image noise, and therefore make a stable recognition more difficult. The present invention provides a method for automatically determining the minimum size of object parts that are included in the model.
The model point reduction. In most recognition approaches, the speed of the recognition process depends on the number of points that are stored in the object model. Thus, to speed up the recognition process, the number of model points should be reduced when dealing with large objects that would lead to a high number of model points. The degree of the point reduction is automatically computed by the method according to the present invention.
The minimum contrast of image structures. Image structures having an edge amplitude below the minimum contrast should be interpreted as image noise and should neither be included in the model nor influence the recognition process. Typically, the minimum contrast is significantly smaller than the contrast of the object. The present invention provides a method for automatically determining the noise in the model image and deriving the minimum contrast based on the estimated image noise.
The discretization step length. Object recognition approaches often discretize the pose space and transform the model in accordance with each discrete pose. A similarity measure can be used to compare the discrete poses of the model with the run-time image, in which the object should be recognized. The object is found at a given pose if the similarity for this pose exceeds a threshold. The dimensionality of the pose space depends on the transformations the object may undergo in the run-time image. In the case of 2D translations the pose space has two dimensions, in the case of rigid transformations it has three dimensions (two translations plus one rotation), in the case of similarity transformations it has four dimensions (plus one scaling), etc. The discretization step length of the translations can simply be set to 1 pixel in accordance with the pixel grid. Unfortunately, for the remaining dimensions (e.g., rotation, scaling) a comparable natural discretization step length is not available. The present invention describes a method that can be used to automatically determine the optimum discretization step lengths for the rotation, the scaling, and further transformations of the object model.
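To illustrate why the discretization step lengths matter, the following minimal Python sketch enumerates a discretized pose space and reports a pose as a match if its similarity exceeds a threshold. It is only an illustration under stated assumptions: the function names, the angle step, the scale list, and the similarity callable are hypothetical and supplied by the caller. The sketch does not show how the present invention determines the step lengths.

```python
import itertools


def discrete_poses(width, height, angle_step_deg=None, scales=None):
    """Enumerate discrete poses (x, y, angle_deg, scale).

    Translations use a step of 1 pixel, following the pixel grid.
    angle_step_deg and scales are assumed inputs chosen by the caller.
    """
    if angle_step_deg is None:
        angles = [0.0]  # translation only: two-dimensional pose space
    else:
        n_angles = max(1, int(round(360.0 / angle_step_deg)))
        angles = [i * 360.0 / n_angles for i in range(n_angles)]
    if scales is None:
        scales = [1.0]  # no scaling dimension
    return itertools.product(range(width), range(height), angles, scales)


def find_matches(poses, similarity, threshold):
    """Keep every pose whose similarity exceeds the threshold.

    similarity is a hypothetical callable mapping a pose to a score.
    """
    return [pose for pose in poses if similarity(pose) > threshold]


if __name__ == "__main__":
    # Each added dimension multiplies the number of poses to evaluate.
    print(len(list(discrete_poses(4, 3))))
    print(len(list(discrete_poses(4, 3, angle_step_deg=90.0))))
    print(len(list(discrete_poses(4, 3, 90.0, scales=[0.9, 1.0, 1.1]))))
```

A smaller angle step or a denser scale list increases the number of poses proportionally, which is one reason a well-chosen step length is desirable.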
All these parameters can be automatically determined by the methods according to the present invention solely based on a single model image of the object.
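The hysteresis thresholding operation described above for the contrast parameter can be sketched in Python as follows. This is a minimal illustration, assuming that the edge amplitude is already available as a 2D NumPy array and that points are connected via their 8-neighborhood. The function name and the connectivity choice are assumptions, and the two thresholds are passed in by the caller rather than determined automatically.

```python
from collections import deque

import numpy as np


def hysteresis_threshold(amplitude, low, high):
    """Return a boolean mask of accepted edge points.

    Points above `high` are "secure" and accepted immediately.
    Points below `low` are rejected. Remaining "potential" points are
    accepted only if a path of potential points links them to a
    secure point (8-connectivity is an assumption of this sketch).
    """
    amplitude = np.asarray(amplitude, dtype=float)
    secure = amplitude > high
    candidate = amplitude >= low  # secure or potential points
    accepted = secure.copy()
    rows, cols = amplitude.shape

    # Breadth-first growth starting from all secure points.
    queue = deque(zip(*np.nonzero(secure)))
    while queue:
        r, c = queue.popleft()
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (0 <= nr < rows and 0 <= nc < cols
                        and candidate[nr, nc] and not accepted[nr, nc]):
                    accepted[nr, nc] = True
                    queue.append((nr, nc))
    return accepted


if __name__ == "__main__":
    amp = np.array([[0, 50, 0, 0, 0],
                    [0, 20, 0, 0, 20],
                    [0, 20, 0, 0, 0]])
    # The isolated amplitude of 20 in the last column is not connected
    # to a secure point and is therefore rejected.
    print(hysteresis_threshold(amp, low=10, high=40).astype(int))
```

With the conventional thresholding operation, the same result would reduce to a single comparison of the edge amplitude against one threshold.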