In standard 2-D machine vision, a video camera and frame grabber are used to acquire a digital image of an object or scene to be stored in computer memory. A computer then analyzes the image to inspect or orient the object, read identifying marks, etc. In many applications, standard light intensity images do not provide sufficient information for the machine vision task. For example, when inspecting solder bumps on semiconductor packages, it is necessary to gauge (measure) bump height to determine coplanarity of a plurality of bumps and perhaps the solder volume of each bump. Due to the scale and speed requirements of such applications, non-contact gauging methods are preferred. Thus, a 3-D range image is desired, in which each image element is a range (height) measurement, rather than a 2-D reflectance image, in which each image element (pixel) is a reflectance or brightness measurement.
Range images can be obtained by many methods. An important class of methods combines multiple 2-D video camera images to obtain a range image, such as in stereo vision. Another technique using 2-D images involves comparison of the degree of focus or defocus in two or more images taken at different distances from the object under inspection. In this class of methods, the depth of field characteristic of the imaging lens is used to determine range. For example, Nayer and Nakagawa in "Shape from Focus", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 16, no. 8, pp. 824-831, August 1994, describe a depth from focus system using multiple images taken with different camera-to-object distances. In S. K. Nayer, M. Watanabe, M. Nougouchi, "Real-Time Focus Range Sensor", IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 18, no. 12, pp. 1186-1196, December 1996, a depth from defocus system is described using the relative degree of defocus in two images taken at different focal distances to compute a range image.
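As a rough illustration of the depth from focus idea, the following Python sketch selects, for each pixel, the camera-to-object distance at which a simple focus measure peaks. The focus measure (an absolute second difference), the function names, and the small 1-D data set are illustrative assumptions and are not taken from the cited papers.

```python
def focus_measure(row, i):
    """Absolute second difference at index i; larger where the image is sharper.
    This is an assumed, simplified stand-in for a real focus operator."""
    return abs(2 * row[i] - row[i - 1] - row[i + 1])


def depth_from_focus(stack, distances):
    """For each interior pixel, return the distance whose image is sharpest.

    stack     : list of 1-D images (lists of floats), one per distance
    distances : camera-to-object distance used for each image
    """
    width = len(stack[0])
    depth = []
    for i in range(1, width - 1):
        scores = [focus_measure(row, i) for row in stack]
        best = max(range(len(stack)), key=lambda k: scores[k])
        depth.append(distances[best])
    return depth


# Hypothetical data: the same textured surface imaged at three distances.
# The middle image is in focus; the other two are blurred.
stack = [
    [0.3, 0.6, 0.4, 0.6, 0.3],  # blurred
    [0.0, 1.0, 0.0, 1.0, 0.0],  # sharp
    [0.4, 0.5, 0.5, 0.5, 0.4],  # heavily blurred
]
distances = [10.0, 10.5, 11.0]
range_row = depth_from_focus(stack, distances)
```

In this toy case every interior pixel is assigned the distance of the middle image. A practical system would interpolate between distances and would need the images to be in exact correspondence.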
To compare the degree of focus or defocus at a point on an object using multiple images of the object, each point on the object must be precisely located at the same position in each of the images. In other words, the depth from focus and defocus techniques require that the set of multiple images of the object have point-to-point correspondence. However, conventional imaging optics change magnification as focal distance changes, and consequently, exact point-to-point correspondence among the images does not occur. M. Watanabe and S. K. Nayer in "Telecentric Optics for Constant-Magnification Imaging", Report CUCS-026-95, Department of Computer Science, Columbia University, New York, N.Y., September 1995, have suggested the use of telecentric optics to overcome this problem. Telecentric optics provide a constant magnification over depth of field. Watanabe and Nayer also suggest a procedure for converting a normal lens so as to provide telecentric operation by adding an external aperture.
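The size of the resulting correspondence error can be estimated with an idealized pinhole projection model, in which magnification is inversely proportional to object distance. The sketch below assumes that model and uses hypothetical numbers; real lenses that refocus will behave differently in detail.

```python
def correspondence_shift(pixel_radius, distance, new_distance):
    """Shift, in pixels, of a feature when the object distance changes.

    Under the assumed pinhole model, image position scales as 1 / distance,
    so a feature at pixel_radius from the image center moves to
    pixel_radius * distance / new_distance.
    """
    return pixel_radius * distance / new_distance - pixel_radius


# Hypothetical case: a feature 200 pixels from the image center,
# with the object distance increased by 1 percent.
shift = correspondence_shift(pixel_radius=200.0, distance=100.0, new_distance=101.0)
```

Here the feature moves about 2 pixels toward the image center, which is far more than the sub-pixel registration that a point-by-point focus comparison requires.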
Although perfectly telecentric lenses solve the correspondence problem in depth from focus and defocus systems, they have several drawbacks. For example, adding an external pupil to a regular lens to obtain telecentric operation creates a vignetting condition. This is due to rays being unequally clipped by the external aperture over the lens field of view, and is equivalent to the lens f/# varying over the field. The consequence is that lens depth of field changes over the field of view, thereby changing the range detection function locally over the images. Such effects are difficult or impossible to remove by calibration procedures. To overcome this problem, a custom lens or lens attachment can be designed. However, custom optics are expensive to produce. Also, fully telecentric lens system designs are fixed at a single set of operating conjugates, so that field of view (magnification) is fixed. This is very limiting in machine vision applications, where a single sensor must be useful over a range of object sizes.
Image warping has been suggested as a means of correcting for non-correspondence among a set of images in range imaging applications. Here, "image warping" is defined as any image transformation which alters the spatial relationship between points in an image. In the case of correction of correspondence error due to magnification shift, the required transformation (warp) includes the operations of translation and scaling. In the correction of a set of `n` images, `n-1` images would be scaled and translated to match a chosen image in the set, to provide a final set of `n` images with precise geometric correspondence.
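A warp consisting of scaling and translation can be sketched as follows. This is a minimal Python illustration using inverse mapping with bilinear interpolation; the function names, the choice of bilinear interpolation, and the convention of scaling about the image origin are assumptions made for the example.

```python
import math


def bilinear(image, u, v):
    """Sample image (a list of rows) at real-valued column u and row v.
    Locations outside the image return 0.0; edge samples are clamped."""
    h, w = len(image), len(image[0])
    x0, y0 = math.floor(u), math.floor(v)
    if x0 < 0 or y0 < 0 or x0 > w - 1 or y0 > h - 1:
        return 0.0
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = u - x0, v - y0
    top = (1 - fx) * image[y0][x0] + fx * image[y0][x1]
    bottom = (1 - fx) * image[y1][x0] + fx * image[y1][x1]
    return (1 - fy) * top + fy * bottom


def warp_scale_translate(image, scale, tx, ty):
    """Return image scaled by `scale` and then shifted by (tx, ty).

    Inverse mapping: output pixel (x, y) is fetched from the source
    location ((x - tx) / scale, (y - ty) / scale).
    """
    h, w = len(image), len(image[0])
    return [[bilinear(image, (x - tx) / scale, (y - ty) / scale)
             for x in range(w)] for y in range(h)]


image = [[0.0, 1.0, 2.0],
         [3.0, 4.0, 5.0],
         [6.0, 7.0, 8.0]]
shifted = warp_scale_translate(image, scale=1.0, tx=1.0, ty=0.0)
magnified = warp_scale_translate(image, scale=2.0, tx=0.0, ty=0.0)
```

For a set of `n` images, a function of this kind would be applied to `n-1` of them, each with its own scale and translation chosen to match the reference image. How those parameters are estimated is outside the scope of this sketch.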
In depth from focus and depth from defocus systems, it is the image components of highest spatial frequency which convey precise focal information. For example, Nougouchi and Nayer in "Microscopic Shape from Focus Using Active Illumination", Proceedings of the 12th IAPR International Conference on Pattern Recognition, pp. 147-152, Jerusalem, Israel, October 1994, have suggested the use of high frequency structured illumination to superimpose high frequency texture on objects otherwise devoid of such natural surface texture. Using this method, the needed high frequency components are always present. Image warping requires the resampling of image picture elements to alter the spatial relationship between the elements. This involves the use of interpolating filters to generate new pixels lying between pixels on the original sampling grid, a process which unavoidably results in high frequency image information loss, as described in Wolberg, G., "Digital Image Warping", IEEE Computer Society Press, Los Alamitos, Calif., 1990. This loss is of precisely those image frequency components needed for accurate focal analysis in depth from focus and depth from defocus methods. Interpolating filter fidelity can be improved by using filters with larger kernel extent, but this approach is computationally expensive. Therefore, the use of image warping for correspondence correction has been judged impractical by past investigators.
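The high frequency loss caused by interpolation can be demonstrated with a small numerical experiment. The Python sketch below resamples a 1-D signal midway between its samples using two-tap linear interpolation and compares the RMS amplitude before and after. The test signals, the half-sample shift, and the circular boundary handling are assumptions chosen to keep the example simple.

```python
import math


def half_sample_shift(signal):
    """Resample a periodic 1-D signal midway between samples
    using linear interpolation (circular at the boundary)."""
    n = len(signal)
    return [(signal[k] + signal[(k + 1) % n]) / 2 for k in range(n)]


def rms(signal):
    """Root-mean-square amplitude of a signal."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))


n = 64
# Low frequency test signal: one cycle every 16 samples.
low = [math.cos(2 * math.pi * k / 16) for k in range(n)]
# Highest representable frequency: alternating +1, -1 (one cycle every 2 samples).
high = [1.0 if k % 2 == 0 else -1.0 for k in range(n)]

low_gain = rms(half_sample_shift(low)) / rms(low)
high_gain = rms(half_sample_shift(high)) / rms(high)
```

The low frequency signal passes almost unchanged (a gain of roughly 0.98), while the highest frequency component is removed entirely (a gain of 0). Interpolators with larger kernels reduce this attenuation, at the cost of more arithmetic per output pixel.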