The ability to determine the position and/or attitude of an object is of considerable practical importance in many applications, including industrial manufacturing, robot guidance, intelligent transportation systems, and mail and parcel handling. Position can include up to three degrees of freedom (e.g., up-down, in-out, and side-ways), and attitude can include up to three more (e.g., pitch, yaw, and roll). Together, these six degrees of freedom describe what can be referred to as the “location” or “pose” of an object. Note that here, “location” or “pose” can mean more than just position in three spatial dimensions; it can also include information as to other non-translational degrees of freedom, such as skew, perspective, aspect ratio, and other more exotic non-translational degrees of freedom. In a two-dimensional plane, pose includes two translation degrees of freedom (e.g., up-down and side-ways), and at least one non-translation degree of freedom, such as “orientation” (a rotation degree of freedom), and possibly skew, aspect ratio, size, and others.
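As an illustration of how the two-dimensional degrees of freedom named above can be combined, the following minimal sketch composes translation, orientation, size, aspect ratio, and skew into a single affine map. The class name `Pose2D`, its field names, and the order of composition (rotation, then shear, then scale) are assumptions made for this example; other parameterizations are equally valid.

```python
import math
from dataclasses import dataclass


@dataclass
class Pose2D:
    """Hypothetical container for 2D pose degrees of freedom (illustrative only)."""
    tx: float = 0.0      # side-ways translation
    ty: float = 0.0      # up-down translation
    angle: float = 0.0   # orientation, in radians
    size: float = 1.0    # uniform scale factor
    aspect: float = 1.0  # y scale relative to x scale
    skew: float = 0.0    # shear angle, in radians

    def matrix(self):
        """Return the 2x2 linear part, composed as rotation * shear * scale."""
        c, s = math.cos(self.angle), math.sin(self.angle)
        k = math.tan(self.skew)
        sx, sy = self.size, self.size * self.aspect
        # [[c, -s], [s, c]] * [[1, k], [0, 1]] * diag(sx, sy)
        return ((c * sx, (c * k - s) * sy),
                (s * sx, (s * k + c) * sy))

    def apply(self, x, y):
        """Map a model point (x, y) into image coordinates."""
        (a, b), (d, e) = self.matrix()
        return (a * x + b * y + self.tx, d * x + e * y + self.ty)


# Example: rotate by 90 degrees, double the size, then translate by (3, 4).
pose = Pose2D(tx=3.0, ty=4.0, angle=math.pi / 2, size=2.0)
print(pose.apply(1.0, 0.0))
```

Setting only `tx` and `ty` gives a translation-only pose; each further field adds one non-translation degree of freedom.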
It is well-known to use machine vision to locate objects at a distance. A digital image of a scene containing an object to be located is formed by any suitable apparatus, for example consisting of visible light illumination, a CCD camera, and a video digitizer. The digital image is then analyzed by a suitable image analysis device, for example consisting of a digital signal processor or personal computer running software that implements a suitable method for identifying and locating image patterns that correspond to the object of interest. The analysis results in certain parameters that describe the pattern in the image that corresponds to the object, or a suitable portion of the object, in the scene. These parameters might include position, attitude, and size of the pattern in the image. These parameters are then used to compute the location (pose) of the object using well-known mathematical formulas.
There are many methods known in the art for analyzing digital images to determine one or more of the pattern parameters, including blob analysis, normalized correlation, Hough transforms, and geometric pattern matching. Numerous other methods have been used or proposed in commercial practice or in academic literature.
The process of determining object location (pose) by machine vision can be referred to in various ways, including “alignment”, “registration”, “pattern recognition”, and “pattern matching”. For present purposes herein, those terms are equivalent, and so herein the term “alignment” shall be used to refer to any such process. Any object, or portion of an object, that gives rise to the pattern in the image to be analyzed shall be called an “alignment target”, or simply a “target”.
Most machine vision alignment applications require locating targets having a shape determined by engineering considerations that are largely independent of the needs of automated visual alignment. In these cases, the objects contain no special markings or components specially adapted to aid the alignment method. Consequently, the alignment method must work with whatever object shape is given. There are many applications, however, where the alignment target can be engineered specifically for that purpose. Examples include fiducial marks on printed circuit boards, registration marks etched on silicon wafers, and “bull's eye” targets used by the United Parcel Service on package labels. A target that has been engineered to aid machine vision alignment can be called a “cooperative target.” In contexts where it is clear that a target has been engineered for alignment, the modifier “cooperative” is sometimes omitted, but “cooperative” is understood.
Although alignment methods have an extensive literature and commercial history, relatively little work has been done on understanding the effect of target shape on alignment performance. The work is almost entirely restricted to shapes composed of circles and polygons, to the effect of such shapes on binary image analysis methods, to translation-only (i.e., horizontal and vertical) alignment, and to accuracy criteria only.
Rotationally symmetric targets, primarily circles and “bull's-eye” patterns, have long been a favorite in the academic literature. In a 1974 paper entitled “Optimal Patterns for Alignment”, in Applied Optics, Vol. 13, No. 3, for example, W. Makous states that “a bull's-eye pattern of regularly alternating black and white rings would be optimal for visual alignment in two dimensions.” Twenty-four years later, in a 1998 paper entitled “Design of Shapes for Precise Image Registration”, in IEEE Trans. on Information Theory, Vol. 44, No. 7, Bruckstein, O'Gorman, and Orlitsky state that “Experimental tests and . . . theoretical developments . . . led to the conclusion that the ‘bull's-eye’ fiducial is indeed a very good, robust and practical location mark.”
Rotationally symmetric targets suffer from a number of limitations, however, that have not been anticipated in the prior art. First, such targets contain no information for measuring orientation. This has been considered an advantage, based on the assumption that alignment methods would fail under orientation misalignment unless the target is rotationally symmetric, but the recent advent of practical methods for orientation alignment has created a need for targets that convey substantial orientation information.
A second limitation of rotationally symmetric targets, such as the “bull's eye” pattern, is that circles and arcs of circles are extremely common in manufactured items, and one cannot guarantee that such shapes will not appear in the field of view containing the target. The appearance of such a shape in the same field of view as a target composed of circles or arcs of circles results in potential confusion for the alignment method, and this confusion usually leads to higher recognition error rates under variations in image quality typically encountered in an industrial environment.
A third limitation of rotationally symmetric targets is that they are often not good choices for measuring size, what might be called “size alignment”. While such a target does contain plenty of information for conveying size, the concentric circular boundaries match each other perfectly at many different sizes. At the correct size the overall target match will be higher than at any of the wrong sizes, but the matches at the wrong sizes are sometimes good enough to create confusion under realistic conditions of image degradation. This “self-confusion” can lead to higher recognition error rates. Furthermore, this self-confusion generally requires that any practical alignment method must examine the “size” degree of freedom more carefully to avoid error, which increases recognition time.
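The size self-confusion described above can be illustrated with a deliberately simplified toy model. It treats a bull's-eye as a list of equally spaced ring-boundary radii and reports the fraction of boundaries that, after scaling, still land on some boundary of the unscaled target. The function name, the tolerance, and the choice to ignore edge polarity and image noise are all assumptions of this sketch; it is not a model of any particular alignment method.

```python
def boundary_match_fraction(radii, scale, tol=0.05):
    """Fraction of scaled ring boundaries that coincide with an unscaled boundary.

    Toy model only: boundaries are compared by radius, ignoring edge
    polarity, sampling, and noise.
    """
    hits = 0
    for r in radii:
        scaled = r * scale
        # A boundary "matches" if any unscaled boundary lies within tol of it.
        if any(abs(scaled - q) <= tol for q in radii):
            hits += 1
    return hits / len(radii)


# Six ring boundaries at radii 1, 2, ..., 6 (arbitrary units).
bulls_eye = [float(i) for i in range(1, 7)]

for s in (0.5, 1.0, 1.3, 1.5, 2.0):
    print(s, round(boundary_match_fraction(bulls_eye, s), 2))
```

In this model the correct size matches every boundary, while half-size and double-size each still match half of them. That partial agreement at wrong sizes is the kind of secondary match an alignment method has to rule out.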
The academic literature has also considered using as alignment targets simple polygons such as squares and diamonds, as well as complex sequences of stripes that are optimal for 1D or 2D alignment in some theoretical sense, but are almost impossible to manufacture.
Known targets in commercial use include simple geometric shapes such as circles, bull's-eyes, squares, crosses, two squares touching at a corner, and patterns consisting of a cross embedded in a circle.
In the semiconductor industry, significant attention has been given to the engineering of targets used to achieve the extreme accuracy needed to register the many layers created during wafer processing. Early targets consisted of interleaved comb structures, which were used by human operators in manual alignment systems prior to the advent of machine vision alignment. More recently, manufacturers have used squares, concentric “box-in-box” shapes, crosses, circles, rings, bull's-eyes, and various other shapes comprised of rectilinear or circular features. Much of the prior art in semiconductors is concerned with process issues such as 3D structure, edge profiles, circuit design rules, and resist flow.
Prior art cooperative targets suffer from one or more of the following limitations:
    Target features are sometimes inadequate for providing sufficient information regarding non-translation degrees of freedom (e.g., orientation and size).
    Reduced information is available when straight-line features are aligned (accidentally or otherwise) with the pixel grid.
    Confusion and consequent reduced reliability result from use of circles, circular arcs, line segments, or right angles, which are common in manufactured objects, and therefore may be confused by the alignment method with other patterns in the scene.
    Reduced reliability results from use of fine target features that do not survive a manufacturing process.
    Confusion and consequent reduced reliability result from target shapes that are “self-confusing”, i.e., that match themselves too well when translated, rotated, or changed in size.
    Reduced alignment speed results from target shapes that cannot be identified unambiguously by their coarsest features.
For a number of reasons these limitations generally have not been serious for past use of machine vision. The alignment methods that have been available in the past, such as blob analysis and normalized correlation, had not been accurate enough (i.e., could extract only limited information from an image) to expose subtle limitations of the targets. Few practical methods existed for determining non-translation degrees of freedom, and those that were known were not widely used due to cost, reliability, or performance problems. Machine vision was often a new and challenging manufacturing technology, and so emphasis was placed on basic functionality and not on squeezing high performance from the equipment. In electronics and semiconductor applications, among the largest users of machine vision, the coarser device geometries of the past placed limited demands on machine vision alignment.
Recent developments have created a need for a new breed of cooperative targets:
    The commercial availability of practical, highly accurate alignment methods capable of aligning non-translation degrees of freedom, including orientation and size, has created a need for targets engineered to provide sufficient information in all such degrees of freedom.
    Expanding experience with machine vision alignment has shown that increased error rates often result from targets that can be confused with similar shapes in the field of view, or are self-confusing in one or more degrees of freedom. Thus, there is a need for targets not based on common features such as lines, right angles, and circles, and for targets specifically engineered to minimize or eliminate self-confusion in all degrees of freedom.
    Shrinking sizes and tighter tolerances in manufactured goods, particularly in semiconductors and electronics, are placing increasing demands on accuracy, speed, and robustness of machine vision alignment. Consequently, there is a need to reconsider the often-neglected role of target shape, and produce targets that cooperate with practical alignment methods to achieve best performance.
The need for new cooperative targets not subject to the limitations of the prior art leads to a need for new methods for engineering such targets. Prior art methods for engineering targets suffer from one or more of the following limitations:
    The known target engineering methods address only translation alignment, not other degrees of freedom such as orientation and size.
    The known target engineering methods are based on a theoretical analysis of absolute accuracy, which, to avoid intractable complexity, requires unrealistic simplifying assumptions about image quality, requires that target shape be restricted to simple shapes composed of circles and rectilinear edges, and requires the use of simple binary alignment methods, such as blob centroid.
    The known target engineering methods do not consider alignment speed.
    The known target engineering methods do not consider alignment reliability under practical conditions of image variation and process degradation.
    The known target engineering methods do not consider confusion with other patterns and self-confusion.
Consequently, there is a need for a new method for engineering alignment targets.
Another factor contributing to reduced performance of known cooperative alignment targets is insufficient precision in known methods for rendering such targets. Without the ability to render a cooperative alignment target at very high precision, an alignment target engineered to address the problems of the prior art would not perform optimally. Consequently, it is necessary to be able to render such targets in various forms, including bitmap images, at very high precision. In the prior art, methods for rendering shapes accurately on a discrete grid have been studied extensively for graphics applications, where they are generally referred to as anti-aliasing methods. These methods produce pleasing graphics for human observation, but achieving the extreme accuracy and flexibility needed for machine vision alignment applications is difficult.
In the machine vision prior art, several methods have been used to render binary shapes on discrete grids. In one method, pixels along the shape boundary are given a gray value corresponding to the fraction of the pixel's area that falls on either side of the boundary. This method suffers from several limitations:
    The computations are complex, resulting in very slow rendering that makes automated testing using thousands of synthetically generated images impractical.
    The method assumes an unrealistically ideal sensor model, resulting in loss of accuracy. Attempts to improve the sensor model by post-processing the rendered image are also limited in accuracy due to grid quantization.
    The method is impractical for complex shapes that are not composed of straight-line segments.
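The area-fraction idea behind this first rendering method can be sketched for the simplest case, a single straight edge crossing a unit pixel. The function below clips the pixel square against a half-plane and measures the remaining area. The function name, the half-plane parameterization, and the unit-square pixel model are assumptions of this example, which is not taken from any specific prior-art implementation. Note that even this single-edge case needs polygon clipping, and curved boundaries are not handled at all.

```python
def coverage(px, py, nx, ny, d):
    """Fraction of the unit pixel [px, px+1] x [py, py+1] where nx*x + ny*y <= d.

    Illustrative sketch: clips the pixel square against one half-plane
    and computes the clipped polygon's area with the shoelace formula.
    """
    corners = [(px, py), (px + 1, py), (px + 1, py + 1), (px, py + 1)]
    clipped = []
    for i in range(4):
        ax, ay = corners[i]
        bx, by = corners[(i + 1) % 4]
        fa = nx * ax + ny * ay - d  # signed value at edge start
        fb = nx * bx + ny * by - d  # signed value at edge end
        if fa <= 0:
            clipped.append((ax, ay))  # start corner lies inside the half-plane
        if (fa < 0 < fb) or (fb < 0 < fa):
            t = fa / (fa - fb)  # where the pixel edge crosses the boundary
            clipped.append((ax + t * (bx - ax), ay + t * (by - ay)))
    area = 0.0
    n = len(clipped)
    for i in range(n):
        x0, y0 = clipped[i]
        x1, y1 = clipped[(i + 1) % n]
        area += x0 * y1 - x1 * y0
    return abs(area) / 2.0


# Gray values for one row of pixels crossed by the vertical edge x <= 2.25,
# with foreground 1.0 and background 0.0.
row = [coverage(x, 0, 1.0, 0.0, 2.25) for x in range(4)]
print(row)
```

A boundary pixel's gray value is then the coverage-weighted blend of foreground and background. That blend corresponds to an ideal sensor that integrates uniformly over each pixel's square, which is the kind of idealized sensor model the second limitation above refers to.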
In another known rendering method, a binary image is rendered at much higher resolution than that needed for the target rendering. This high-resolution image is then filtered and sub-sampled to produce the final rendering. While such a method can be quite accurate in principle, computer time and memory limitations make truly high accuracy impractical.
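A minimal sketch of this super-sampling approach, applied to a disc, is given below. The function name, the choice of a box filter, and the disc shape are assumptions made for illustration; the source does not specify the filter. Each output pixel is the average of `factor * factor` binary sub-samples, which is equivalent to rendering a binary image at `factor` times the resolution, box filtering it, and sub-sampling.

```python
def render_disc_supersampled(width, height, cx, cy, radius, factor=8):
    """Render a disc by binary super-sampling followed by box-filter sub-sampling.

    Illustrative sketch: returns a list of rows of gray values in [0, 1].
    """
    r2 = radius * radius
    image = []
    for y in range(height):
        row = []
        for x in range(width):
            inside = 0
            for sy in range(factor):
                for sx in range(factor):
                    # Center of this sub-sample, in output-pixel coordinates.
                    u = x + (sx + 0.5) / factor
                    v = y + (sy + 0.5) / factor
                    if (u - cx) ** 2 + (v - cy) ** 2 <= r2:
                        inside += 1
            row.append(inside / (factor * factor))
        image.append(row)
    return image


disc = render_disc_supersampled(8, 8, cx=4.0, cy=4.0, radius=3.0, factor=8)
for row in disc:
    print(" ".join(f"{value:.2f}" for value in row))
```

Gray values here are quantized to multiples of `1 / factor**2`, and the work grows with `factor**2` per output pixel. Pushing the quantization error down therefore drives time (and, for an explicitly stored high-resolution image, memory) up quickly, which matches the practical limit noted above.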