The present invention relates to a method of and apparatus for detecting the position of an object, and also relates to an image processing system using the object position detecting method and apparatus.
There is strong demand for image recognition apparatuses, including visual apparatus for industrial robots and product inspection apparatus for automation lines. Accordingly, recognition apparatuses such as those based on a correlator that uses an optical matched spatial filter have heretofore been extensively developed. Recently, recognition apparatuses that use a neural network have also been actively developed. However, these conventional recognition apparatuses are inferior in generalizability, that is, the capability of recognizing images that have been subjected to deformation, e.g., shift, rotation, scaling, etc. Therefore, improving the generalizability has been the most important problem to be solved before the recognition apparatuses can be put to practical use.
One solution to the problem was proposed by David Casasent et al. of Carnegie Mellon University (see David Casasent et al. "Real-time deformation invariant optical pattern recognition using coordinate transformations", Appl. Opt., Vol. 26, pp. 938-942 (1987)). Their method, which uses a correlator, is as follows: As shown in FIG. 14, first, an input image 101 is illuminated with coherent light 102, and phase information for a coordinate transformation is superimposed on the input image information by a computer-generated hologram (CGH) 103. Then, the resulting image information is subjected to Fourier transform by a Fourier transform lens L.sub.1 104. Thus, information containing the input image having been subjected to a desired coordinate transformation is obtained on a coordinate transformation surface 105. In the above-mentioned literature, for example, the desired coordinate transformation is a logarithmic polar coordinate transformation, under which deformation of the input image such as scaling or rotation is transformed into an amount of shift.
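The property relied on here, that a logarithmic polar coordinate transformation converts scaling and rotation into a shift, can be checked numerically. The following is a minimal digital sketch and not the optical arrangement of FIG. 14; the function names `log_polar` and `scale_rotate`, the sample points, and the deformation values are assumptions chosen only for illustration.

```python
import math

def log_polar(x, y):
    """Map a Cartesian point (x, y) to (log r, theta)."""
    r = math.hypot(x, y)
    return math.log(r), math.atan2(y, x)

def scale_rotate(x, y, s, phi):
    """Scale a point by s and rotate it by phi radians about the origin."""
    c, sn = math.cos(phi), math.sin(phi)
    return s * (c * x - sn * y), s * (sn * x + c * y)

# Hypothetical sample points and deformation parameters.
points = [(1.0, 0.5), (2.0, 1.0), (0.3, 0.4)]
s, phi = 1.5, math.radians(20)

shifts = []
for x, y in points:
    u0, v0 = log_polar(x, y)
    u1, v1 = log_polar(*scale_rotate(x, y, s, phi))
    # Displacement of the point in the (log r, theta) plane.
    shifts.append((u1 - u0, v1 - v0))

# Every point is displaced by the same amount, (log s, phi),
# so the deformation appears as a pure shift in log-polar coordinates.
for du, dv in shifts:
    print(du, dv)
```

Because every point moves by the same displacement regardless of its position, a shift-invariant correlator applied after the transformation is insensitive to the original scaling and rotation.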
Next, as shown in FIG. 15, the information having been subjected to the logarithmic polar coordinate transformation is input to a liquid crystal television (LCTV) 106 and illuminated with coherent light 110, so that it is made incident on a shift-invariant correlation optical system. This optical system is comprised of a combination of a double-diffraction system, which is composed of two Fourier transform lenses L.sub.2 107 and L.sub.3 108, and a matched spatial filter (MSF) 109 formed by subjecting a reference image to a logarithmic polar coordinate transformation similar to the above. Then, a correlation peak of the reference image and the inspected image on a correlation plane P is detected by a camera 111, thereby recognizing the optical patterns of the input image. Indeed, Casasent et al. report in an example that favorable collating results were obtained with respect to rotation and scaling deformation by using the above-described method.
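A digital analogue of the shift-invariant correlation stage may help clarify why a shift of the input only moves the correlation peak. The sketch below assumes NumPy and uses a circular (FFT-based) correlation in place of the optical double-diffraction system and MSF; the function name `correlate_fft`, the random test image, and the shift amounts are hypothetical.

```python
import numpy as np

def correlate_fft(scene, reference):
    """Circular cross-correlation computed in the frequency domain.

    Multiplying the scene spectrum by the conjugate of the reference
    spectrum plays the role of the matched spatial filter.
    """
    S = np.fft.fft2(scene)
    R = np.fft.fft2(reference)
    return np.real(np.fft.ifft2(S * np.conj(R)))

rng = np.random.default_rng(0)
reference = rng.random((32, 32))

# The scene is the reference displaced by 5 rows and 9 columns (circularly).
scene = np.roll(reference, shift=(5, 9), axis=(0, 1))

corr = correlate_fft(scene, reference)
peak = np.unravel_index(np.argmax(corr), corr.shape)

# The peak position reports the displacement; its presence reports a match.
print(peak)
```

In this sketch the location of the peak on the correlation plane corresponds to the amount of shift, which after a logarithmic polar coordinate transformation would encode the scaling and rotation of the input.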
Further, Kenneth H. Fielding et al. of the US Air Force Institute of Technology propose a recognition method attained by developing the above-described method (see Kenneth H. Fielding et al. "Position, scale and rotation invariant holographic associative memory", Opt. Eng., Vol. 28, pp. 849-853 (1989)). In their method, the input image is transformed into a Fourier spectrum (the product of the Fourier transform of the input image and its complex conjugate) before being subjected to processing similar to the above. They report in an example that favorable collating results were obtained with respect to shift deformation in addition to rotation and scaling, although no detailed description of the method is given in their report.
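The reason the Fourier spectrum adds tolerance to shift can be illustrated as follows: shifting an image multiplies its Fourier transform by a phase factor only, and that factor cancels when the transform is multiplied by its complex conjugate. This is a minimal sketch assuming NumPy and a circular shift; the name `power_spectrum` and the test data are assumptions, and the report of Fielding et al. does not describe its processing at this level of detail.

```python
import numpy as np

def power_spectrum(img):
    """Product of the Fourier transform of img and its complex conjugate."""
    F = np.fft.fft2(img)
    return np.real(F * np.conj(F))

rng = np.random.default_rng(1)
image = rng.random((16, 16))

# Circularly shift the image by 3 rows and 7 columns.
shifted = np.roll(image, shift=(3, 7), axis=(0, 1))

# The two images differ, but their power spectra agree.
images_differ = not np.allclose(image, shifted)
spectra_match = np.allclose(power_spectrum(image), power_spectrum(shifted))
print(images_differ, spectra_match)
```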
According to the method of Fielding et al., attained by developing the method of Casasent et al. so as to add thereto generalizability for shift deformation, the Fourier spectrum of the input image is subjected to a coordinate transformation before being used for recognition. However, the information in the vicinity of zero-order light, which has a high spectral intensity, is relatively similar from one input image to another irrespective of the type of input image. Therefore, recognition of the input image is largely affected by the information in the vicinity of zero-order light even if there is a difference in the other portions of the input image, resulting in a large recognition error. Further, because the process of obtaining a Fourier spectrum includes the operation of obtaining a product of the Fourier transform of the input image and its complex conjugate, the resulting information lacks the phase information and the negative-amplitude information of the Fourier transform of the input image. Thus, recognition cannot be performed with complete information. Therefore, a large recognition error may occur depending upon the type of input image.
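Both drawbacks can be observed in a small numerical sketch, assuming NumPy; the names and test data below are hypothetical. For a non-negative image the zero-frequency sample, which corresponds to zero-order light, carries a large share of the spectral energy, and an image rotated by 180 degrees, although different from the original, yields the same spectrum because only the phase distinguishes the two.

```python
import numpy as np

def power_spectrum(img):
    """Squared magnitude of the 2-D Fourier transform (phase is discarded)."""
    F = np.fft.fft2(img)
    return np.abs(F) ** 2

rng = np.random.default_rng(2)
image = rng.random((16, 16))      # non-negative intensities
mirrored = image[::-1, ::-1]      # the same image rotated by 180 degrees

spectrum = power_spectrum(image)

# Share of the total spectral energy held by the zero-frequency sample.
dc_fraction = spectrum[0, 0] / spectrum.sum()

# Two different images that cannot be told apart by their power spectra.
images_differ = not np.allclose(image, mirrored)
spectra_match = np.allclose(spectrum, power_spectrum(mirrored))

print(dc_fraction, images_differ, spectra_match)
```

The first quantity illustrates why the vicinity of zero-order light tends to dominate a comparison between spectra, and the second pair illustrates a recognition error that stems from the missing phase information.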