Machine vision generally relates to finding and/or locating patterns in images, where the patterns generally correspond to and/or represent real-world objects in the field of view of an imaging device, whether based on an image of the object or a simulated representation of the object, such as a CAD drawing. Pattern location methods and systems are of particular importance in industrial automation, where they are used, for example, to guide automation equipment and for quality control, and where the objects might include, for example, semiconductor wafers, automotive parts, pharmaceuticals, etc. Machine vision enables quicker, more accurate, and more repeatable results to be obtained in the production of both mass-produced and custom products. Basic machine vision systems include one or more cameras (typically having solid-state charge-coupled devices (CCDs) as imaging elements) directed at an area of interest, appropriate illumination on the area of interest, frame grabber/image processing elements that capture and/or transmit CCD images, and one or more computer processing units and/or displays for running the machine vision software application and manipulating or analyzing the captured images.
Typical machine vision systems include a training stage and a run-time stage. Training typically involves being provided or receiving a digital image of an example object (e.g., a training image). The objective of training is to learn an object's pattern in an image by generating a model that can be used to find similarly-appearing patterns on production objects or in run-time images at run-time. Run-time typically involves being provided or receiving a digital image of a production object (e.g., a run-time image). The objective of run-time processing is (1) to determine whether the pattern exists in the run-time image (called pattern recognition), and (2) if the pattern is found, to determine where the pattern is located, with respect to one or more degrees of freedom (DOF), within the run-time image. The pattern's location, as defined by the DOFs, can be called the object or pattern's pose in the image. One way to represent a pose is as a transformation matrix mapping between coordinates in the model and coordinates in the run-time image or vice versa. Determining whether a pattern is located in an image can establish the location of the production object so that, for example, it can be operated on by automation equipment.
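By way of illustration only, the representation of a pose as a transformation matrix can be sketched as follows. The sketch assumes a two-dimensional pose with translation, rotation, and uniform scale as the degrees of freedom; the function names make_pose, apply_pose, and invert_pose are hypothetical and do not correspond to any particular machine vision product.

```python
import math

def make_pose(tx, ty, angle_deg=0.0, scale=1.0):
    """Build a 3x3 homogeneous matrix mapping model coordinates to
    run-time image coordinates: rotate and scale, then translate."""
    a = math.radians(angle_deg)
    c, s = scale * math.cos(a), scale * math.sin(a)
    return [[c, -s, tx],
            [s,  c, ty],
            [0.0, 0.0, 1.0]]

def apply_pose(pose, point):
    """Map a single (x, y) point through the pose."""
    x, y = point
    return (pose[0][0] * x + pose[0][1] * y + pose[0][2],
            pose[1][0] * x + pose[1][1] * y + pose[1][2])

def invert_pose(pose):
    """Return the inverse mapping (run-time image -> model)."""
    a, b, tx = pose[0]
    c, d, ty = pose[1]
    det = a * d - b * c
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    return [[ia, ib, -(ia * tx + ib * ty)],
            [ic, id_, -(ic * tx + id_ * ty)],
            [0.0, 0.0, 1.0]]

# Hypothetical pose: pattern found at (10, 20), rotated 90 degrees.
pose = make_pose(10.0, 20.0, angle_deg=90.0)
image_point = apply_pose(pose, (1.0, 0.0))                 # model -> image
model_point = apply_pose(invert_pose(pose), image_point)   # image -> model
```

The inverse matrix corresponds to the "or vice versa" direction noted above, mapping run-time image coordinates back into the model's coordinate frame.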
Training is one of the more important and challenging aspects of any industrial pattern inspection/location system. In a typical production application, for example, a model can be used tens of thousands of times every hour, and any errors or imperfections in the model can potentially affect every single use. The challenge of training arises from several factors. Production objects can vary significantly in appearance from any given example object used in training, due to imperfections in the example object and ordinary manufacturing variations in the production objects and/or the lighting conditions. Nevertheless, the model should be such that the production objects can be found reliably and accurately, while at the same time rejecting objects that do not match the pattern.
In addition, models are typically trained by human operators (e.g., by drawing a box on a training image with a mouse) whose time is expensive and who are not generally experts in the underlying machine vision technology. Alternatively, some machine vision systems allow models to be defined synthetically (e.g., by using a CAD tool). Each of these training implementations suffers from drawbacks that can decrease the effectiveness of the generated models. For example, manually-selected and synthetically-generated models can result in degenerate models (e.g., straight lines) and other non-unique model features. Training machine vision systems based on object images is also typically time-consuming. This is especially problematic for manufacturing processes, where there may be a wide variety of products and/or objects that need to be inspected and/or localized using machine vision inspection. Furthermore, product designs may change frequently. Even a minor revision to an object (for example, to its shape) may require retraining.
FIGS. 1A-1C illustrate examples of models manually trained from training images and the resulting mis-detections during run-time using the manually-generated models. FIG. 1A illustrates a training image 110 and a user-selected region of interest 111 from which a model 112 is generated (e.g., the region of interest is passed through an edge detection unit to generate a model representing edges). However, model 112 contains only straight lines 112a and 112b in the same direction, which are degenerate and non-unique features, resulting in a secondary result 115 being detected in a run-time image 116. The detected result 115 matches the model features, but is translated and therefore does not represent what the user intended to find. Similarly, FIG. 1B illustrates the training image 110 with a different user-selected region of interest 117 from which a model 118 is generated. Model 118, while non-degenerate, can still result in a degenerate match 119 in the run-time image 120 even though a portion 120 of the model 118 does not appear in the run-time image 120. FIG. 1C illustrates a training image 130 and two user-selected regions of interest 131 from which a model 132 is generated. However, models such as model 132 that include a circle 132a and/or short straight lines 132b oriented in a single direction can be non-unique by rotation, resulting in a secondary result 135 being detected in run-time image 136 (where the highlight box 137 illustrates the rotation of the secondary result 135). In other examples, models can be non-unique due to the existence of background features.
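The translational degeneracy described with respect to FIG. 1A can be sketched with a toy example. The following is a simplified illustration under stated assumptions, not the matching method of any actual machine vision tool: edges are represented as sets of integer pixel coordinates, the only degrees of freedom are integer translations, and the score is simply the fraction of model points that land on an edge pixel. All names and coordinates are hypothetical.

```python
def match_score(model_points, image_edges, dx, dy):
    """Fraction of model points landing on an image edge pixel
    after shifting the model by (dx, dy)."""
    hits = sum((x + dx, y + dy) in image_edges for x, y in model_points)
    return hits / len(model_points)

def perfect_shifts(model_points, image_edges, dx_range, dy_range):
    """All translations in the search range that score a perfect match."""
    return [(dx, dy) for dx in dx_range for dy in dy_range
            if match_score(model_points, image_edges, dx, dy) == 1.0]

# Toy run-time image: two long horizontal edges joined by one vertical edge.
image_edges = {(x, y) for x in range(40) for y in (5, 9)}
image_edges |= {(20, y) for y in range(5, 10)}

# Degenerate model: only short segments of two parallel lines.
line_model = {(x, y) for x in range(10) for y in (0, 4)}

# Model that also contains the vertical segment, which pins down x.
corner_model = line_model | {(5, y) for y in range(5)}

search_dx, search_dy = range(0, 31), range(0, 10)
line_results = perfect_shifts(line_model, image_edges, search_dx, search_dy)
corner_results = perfect_shifts(corner_model, image_edges, search_dx, search_dy)
```

In this toy setting, the parallel-line model scores perfectly at every horizontal offset in the search range, whereas the model that includes a feature in a second direction scores perfectly at only one translation.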
One approach to determining the uniqueness of a model involves analyzing all of the results returned during run-time application of the model (e.g., determining how many misdetections occur). A drawback of this approach is that it fails to detect secondary results (e.g., results that are not the highest scoring or do not surpass a certain threshold score) in a robust manner, due to non-linearities in the machine vision tools used. Other approaches include simple alerts based on whether a model consists of a single straight line or a single circle. A drawback of these approaches is that they do not address the general issue of how unique a model is in a given search range.
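For illustration, a simple alert of the single-straight-line kind might resemble the following sketch, which assumes the model is available as a list of (x, y) edge points. The function name is_single_straight_line and the tolerance parameter are hypothetical. The example also shows the limitation noted above: a model made of two parallel lines passes the check even though it remains non-unique under translation along the line direction.

```python
def is_single_straight_line(points, tol=1e-9):
    """Return True if all model points are collinear (within tol)."""
    pts = list(points)
    if len(pts) < 3:
        return True
    x0, y0 = pts[0]
    # Find a second point distinct from the first to define a direction.
    ref = next(((x, y) for x, y in pts[1:] if (x, y) != (x0, y0)), None)
    if ref is None:
        return True
    dx, dy = ref[0] - x0, ref[1] - y0
    # A zero cross product means the point lies on the line through pts[0] and ref.
    return all(abs(dx * (y - y0) - dy * (x - x0)) <= tol for x, y in pts)

single_line = [(x, 0) for x in range(10)]
two_parallel_lines = [(x, y) for x in range(10) for y in (0, 4)]

alert_single = is_single_straight_line(single_line)           # alert raised
alert_parallel = is_single_straight_line(two_parallel_lines)  # no alert
```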