Machine vision systems, also termed “vision systems” herein, are used to perform a variety of tasks in a manufacturing environment. In general, a vision system consists of one or more camera assemblies with an image sensor (or “imager”) that acquires grayscale or color images of a scene that contains an object under manufacture. Images of the object can be analyzed to provide data/information to users and associated manufacturing processes. The data produced by the camera is typically analyzed and processed by the vision system in one or more vision system processors that can be purpose-built, or part of one or more software application(s) instantiated within a general purpose computer (e.g. a PC, laptop, tablet or smartphone).
Common vision system tasks include alignment and inspection. In an alignment task, vision system tools, such as the well-known PatMax® system commercially available from Cognex Corporation of Natick, Mass., compare features in a two-dimensional (2D) image of a scene to a trained (using an actual or synthetic model) 2D pattern, and determine the presence/absence and pose of the 2D pattern in the 2D imaged scene. This information can be used in subsequent inspection (or other) operations to search for defects and/or perform other operations, such as part rejection.
A particular task employing vision systems is the alignment of a three-dimensional (3D) target shape during runtime based upon a trained 3D model shape. 3D cameras can be based on a variety of technologies, for example a laser displacement sensor (profiler), a stereoscopic camera, a sonar, laser or LIDAR range-finding camera, a time-of-flight camera, and a variety of other passive or active range-sensing technologies. Such cameras produce a range image, in which an array of image pixels (typically characterized as positions along orthogonal x and y axes) also contains a third (height) dimension for each pixel (typically characterized along a z axis perpendicular to the x-y plane). Alternatively, such cameras can generate a point cloud representation of an imaged object. A point cloud is a collection of 3D points in space where each point i can be represented as (Xi, Yi, Zi). A point cloud can represent a complete 3D object including the object's back and sides, top and bottom. 3D points (Xi, Yi, Zi) represent locations in space where the object is visible to the camera. In this representation, empty space is represented by the absence of points.
By way of comparison, a 3D range image representation Z(x, y) is analogous to a 2D image representation I(x, y) where the depth or height Z replaces what would be the brightness/intensity I at a location x, y in an image. A range image exclusively represents the front face of an object that is directly facing a camera, because only a single depth is associated with any point location x, y. The range image typically cannot represent an object's back or sides, top or bottom. A range image typically has data at every location (x, y), even if the camera has no information at such locations. It is possible to convert a range image to a 3D point cloud in a manner clear to those of skill in the art.
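By way of a non-limiting illustration, the conversion of a range image Z(x, y) to a point cloud can be sketched as follows. This is a minimal sketch under stated assumptions, not a definitive implementation: the function name `range_image_to_point_cloud`, the pixel-spacing parameters `x_scale` and `y_scale`, and the use of `None` to mark pixels at which the camera has no information are all hypothetical choices made for this example.

```python
def range_image_to_point_cloud(range_image, x_scale=1.0, y_scale=1.0):
    """Convert a range image Z(x, y) into a list of 3D points (X, Y, Z).

    `range_image` is a list of rows, each row a list of height values.
    A pixel value of None (an assumption of this sketch) marks a location
    where no height information is available.
    """
    points = []
    for y, row in enumerate(range_image):
        for x, z in enumerate(row):
            if z is None:
                # Empty space is represented by the absence of a point.
                continue
            # Scale pixel indices to physical units; keep the height as Z.
            points.append((x * x_scale, y * y_scale, float(z)))
    return points


# Example: a 2 x 2 range image with one pixel lacking height data.
example_cloud = range_image_to_point_cloud(
    [[1.0, None],
     [2.5, 3.0]],
    x_scale=0.5,
    y_scale=2.0,
)
```

Because only a single height is stored at each (x, y) location, the resulting point cloud contains at most one point per pixel and therefore describes only the surface facing the camera.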
In aligning a target image, either acquired or generated by a synthetic (e.g. CAD) process, to a model image (also either acquired or synthetic), one approach involves the matching/comparison of the target 3D point cloud to the model in an effort to find the best matching pose. The comparison can involve a scoring of the coverage of the target with respect to the model. A score above a certain threshold is considered an acceptable match/pose-estimation, and this information is used to generate an alignment result. It is nevertheless challenging to accurately and efficiently generate an alignment result based upon 3D images.
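The coverage scoring and thresholding described above can be illustrated with the following simplified sketch. It assumes, for purposes of the example only, that coverage is the fraction of model points which, after being mapped by a candidate pose (a 3x3 rotation matrix and a translation), lie within a distance tolerance of at least one target point. The names `apply_pose`, `coverage_score` and `is_acceptable`, the tolerance and the threshold are hypothetical, and the brute-force nearest-point search is chosen for clarity rather than efficiency.

```python
import math


def apply_pose(point, rotation, translation):
    """Map a 3D point by a rigid pose: rotation (3x3, row-major) then translation."""
    x, y, z = point
    return tuple(
        rotation[r][0] * x + rotation[r][1] * y + rotation[r][2] * z + translation[r]
        for r in range(3)
    )


def coverage_score(model_points, target_points, rotation, translation, tolerance):
    """Fraction of posed model points that have a target point within `tolerance`."""
    if not model_points:
        return 0.0
    covered = 0
    for p in model_points:
        q = apply_pose(p, rotation, translation)
        # Brute-force search; a spatial index would be used for large clouds.
        if any(math.dist(q, t) <= tolerance for t in target_points):
            covered += 1
    return covered / len(model_points)


def is_acceptable(score, threshold):
    """A score above the threshold is treated as an acceptable match."""
    return score > threshold


# Example: identity rotation with a translation of one unit along x.
IDENTITY = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
model = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (5, 5, 5)]
target = [(1, 0, 0), (2, 0, 0), (1, 1, 0)]
score = coverage_score(model, target, IDENTITY, (1.0, 0.0, 0.0), tolerance=0.1)
```

In this example three of the four model points land on target points under the candidate pose, so the pose would be accepted at a threshold of 0.7 and rejected at a threshold of 0.8.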
Aligning 3D objects in 3D range images or 3D point cloud images is best accomplished with one or more, respective, 3D alignment (registration) algorithm(s) that is/are appropriate for the 3D shape of those objects. If an inappropriate 3D alignment algorithm is used, the 3D alignment procedure may fail or perform poorly either by finding an incorrect result pose or finding no result at all. Current approaches typically mandate that the user understand the details of which algorithm is appropriate for which objects of interest, or application situation, and manually choose the appropriate alignment algorithm provided by the vision system interface at setup (or alternatively, assemble the algorithm from a choice of modules provided by the vision system).