1. Field of the Invention
The present invention relates to multi-eye image pickup apparatus provided with motion vector detecting means necessary for image coding apparatus or image blur correcting apparatus, and three-dimensional object shape measuring or recognizing method and apparatus for measuring or recognizing the environment, obstacles, topography, industrial articles, etc. in non-contact therewith from image data.
2. Related Background Art
Motion vector detecting methods for obtaining a moving amount of a picked-up object from image signals picked up in time series are indispensable for image coding apparatus, image blur correcting apparatus, etc. Specific examples of the motion vector detecting methods are the temporal-spatial gradient method as described in the specification of U.S. Pat. No. 3,890,462 or the bulletin of Japanese Patent Publication No. 60-46878, the correlation method based on correlation calculation, the block matching method (template matching method), etc. Among these techniques, the temporal-spatial gradient method is discussed in detail by B. K. P. Horn et al. in Artificial Intelligence 17, pp. 185-203 (1981); the block matching method is discussed in detail by Morio ONOE et al. in Information Processing (Joho Shori) Vol. 17, No. 7, pp. 634-640, July 1976. These motion vector detecting methods will be briefly described below.
First described is the temporal-spatial gradient method. Letting $d(x, y, t)$ be a luminance at point $(x, y)$ on an image at time $t$ and $(x+\delta x, y+\delta y)$ be a position of the point after a lapse of infinitesimal time $\delta t$, the luminance $d(x, y, t)$ is expressed as follows.
$$d(x, y, t) = d(x+\delta x,\ y+\delta y,\ t+\delta t)$$
Then Taylor expansion of $d(x+\delta x, y+\delta y, t+\delta t)$ yields the following.
$$d(x+\delta x,\ y+\delta y,\ t+\delta t) = d(x, y, t) + \delta x\,\frac{\partial d}{\partial x} + \delta y\,\frac{\partial d}{\partial y} + \delta t\,\frac{\partial d}{\partial t} + \cdots$$
Omitting the higher-order terms in this equation, the following is derived.
$$0 = \delta x\,\frac{\partial d}{\partial x} + \delta y\,\frac{\partial d}{\partial y} + \delta t\,\frac{\partial d}{\partial t}$$
Here, putting $\partial d/\partial x = dx$, $\partial d/\partial y = dy$, and $\partial d/\partial t = dt$, the following equation holds.
$$0 = \delta x\,dx + \delta y\,dy + \delta t\,dt$$
Dividing both sides by $\delta t$ and taking the limit $\delta t \to 0$ gives the following.
$$0 = \frac{\partial x}{\partial t}\,dx + \frac{\partial y}{\partial t}\,dy + dt$$
Putting $\partial x/\partial t = u$ and $\partial y/\partial t = v$, the following equation is finally obtained.
$$u\,dx + v\,dy + dt = 0$$
Here, (u, v) corresponds to a moving amount of a pixel on a screen.
Thus, if $v = 0$ is known (for example, if the motion is horizontal), the above equation gives the following, where dx and dt are the partial derivatives of the luminance defined above.
$$u = -(dt/dx)$$
Thus, u can be obtained.
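The relation u = -(dt/dx) can be illustrated with a minimal Python sketch for a single image row. The function name `gradient_shift_1d`, the central-difference estimate of dx, and the least-squares pooling over all pixels of the row are assumptions made for this illustration and are not taken from the cited references; the sketch only applies to shifts small enough for the first-order expansion to hold.

```python
def gradient_shift_1d(prev_row, next_row):
    """Estimate a horizontal shift u (pixels per frame) between two rows.

    prev_row and next_row are luminance samples of the same image row at
    times t and t + 1 (hypothetical inputs). With v = 0 the constraint is
    u*dx + dt = 0 at every pixel; instead of dividing pixel by pixel, the
    sketch pools all pixels in a least-squares sense:
        u = -sum(dx*dt) / sum(dx*dx)
    """
    num = 0.0
    den = 0.0
    for x in range(1, len(prev_row) - 1):
        # Spatial gradient dx by central difference (assumed discretization).
        dx = (prev_row[x + 1] - prev_row[x - 1]) / 2.0
        # Temporal gradient dt as the frame difference at the same pixel.
        dt = next_row[x] - prev_row[x]
        num += dx * dt
        den += dx * dx
    if den == 0.0:
        # A flat row has no gradient, so u cannot be determined.
        raise ValueError("flat luminance: u is undetermined")
    return -num / den
```

For a smooth pattern shifted by a fraction of a pixel, the estimate is close to the true shift; for a row of constant luminance the function raises an error, which reflects the fact that the gradient method needs a non-zero spatial gradient.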
An example using the temporal-spatial gradient method is described in the above bulletin of Japanese Patent Publication No. 60-46878. In that example, a gradient e of the image signal level at an arbitrary position of the image is obtained, and a change d of the image signal level at that position over a fixed time period is further obtained, whereby the moving amount of the image at that position in the fixed time period is obtained from the value of d/e.
Next described is the block matching method. In the block matching method, one of two frames picked up in time series, which are used in extraction of displacement vectors, is split into a plurality of blocks of an appropriate size (for example, 8 pixels × 8 lines). For every split block, deviation amounts are calculated between the pixels in the block and the pixels in a certain range of the other frame (or field), and the other frame (or field) is searched for the block position that minimizes the sum of absolute values of the calculated deviation amounts. Namely, in this block matching method, the relative displacement of each block between the two frames represents the motion vector of the block.
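The block search described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of any cited reference: the function names `sad` and `match_block`, the exhaustive search, and the default search range of ±4 pixels are assumptions, and frames are assumed to be lists of rows of luminance values.

```python
def sad(ref, cur, rx, ry, cx, cy, size):
    """Sum of absolute differences between a size x size block of `ref`
    at (rx, ry) and a block of `cur` at (cx, cy)."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(ref[ry + j][rx + i] - cur[cy + j][cx + i])
    return total


def match_block(ref, cur, bx, by, size=8, search=4):
    """Return the motion vector (dx, dy) of the block of `ref` whose
    top-left corner is (bx, by), found by exhaustive search in `cur`.

    Every candidate displacement within +/- `search` pixels is tried and
    the one with the smallest sum of absolute differences is kept.
    """
    height, width = len(cur), len(cur[0])
    best_cost = None
    best_vec = (0, 0)
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            cx, cy = bx + dx, by + dy
            # Skip candidates that would fall outside the other frame.
            if cx < 0 or cy < 0 or cx + size > width or cy + size > height:
                continue
            cost = sad(ref, cur, bx, by, cx, cy, size)
            if best_cost is None or cost < best_cost:
                best_cost, best_vec = cost, (dx, dy)
    return best_vec
```

Calling `match_block` once per block of the split frame yields one motion vector per block. The exhaustive search is chosen here only for clarity; its cost grows with the square of the search range.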
In conventional multi-eye image pickup apparatus, motion vectors of the image are detected for each camera between frames separated by a predetermined time, using one of the above-described motion vector detecting methods. Namely, correspondent points are obtained between frames in the predetermined time period from image signals input in time series through each camera, and motion vectors of the image at the correspondent points are extracted for each camera. Each of the above-described motion vector detecting methods is based on the premise that correspondent points exist between the images used for detecting a motion of the image.
Incidentally, a widely used conventional 3D shape measuring/recognizing method using image data is the so-called stereo distance-measuring method, which measures a shape according to the principle of triangulation, using spatial positional information obtained from correspondence relations of points on a target object in a plurality of image data picked up from different visual points with respect to the target object. One of the stereo distance-measuring methods is a pseudo-stereo distance-measuring method that uses time-serial image data picked up with a single eye while changing the visual point.
Also, there is a method for measuring a three-dimensional shape of an object, using time-serial images picked up with a single eye while changing the angle of view (cf. the bulletin of Japanese Laid-open Patent Application No. 5-209730). This method will be called a zoom distance-measuring method.
Next described referring to FIG. 26 and FIG. 27 are principles of distance measurement in the above stereo distance-measuring method and zoom distance-measuring method.
FIG. 26 is a drawing for illustrating the principle of triangulation as a basic principle in the stereo distance-measuring method. In the drawing, symbols $D$, $f$, $B$, $h_L$, and $h_R$ represent an object distance, a focal length, a baseline, an image position of an object in a left pickup system, and an image position of the object in a right pickup system, respectively.
From the drawing it is obvious that the following geometric relation holds. Namely,
$$(h_R - h_L) : f = B : (D - f) \tag{1}$$
Solving Equation (1) with respect to the object distance $D$,
$$D = \frac{B \cdot f}{h_R - h_L} + f \tag{2}$$
In Equation (2), $B$ and $f$ are known constants, and thus the parallax $(h_R - h_L)$ needs to be detected on the image pickup plane in order to obtain the object distance $D$. For this purpose, image processing such as the matching method or the gradient method, based on correlation calculation, is normally performed (cf. the bulletin of Japanese Patent Publication No. 60-46878).
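As a numerical illustration of Equation (2), the following minimal Python sketch computes the object distance from the two image positions. The function name `stereo_distance` and its parameter names are hypothetical, and all lengths are assumed to be expressed in the same unit.

```python
def stereo_distance(h_left, h_right, baseline, focal_length):
    """Object distance D from Equation (2): D = B*f / (h_R - h_L) + f.

    h_left and h_right are the image positions of the same object point
    in the left and right pickup systems; baseline is B and focal_length
    is f. All values must share one length unit (an assumption here).
    """
    parallax = h_right - h_left
    if parallax == 0:
        # Zero parallax means the point is effectively at infinity.
        raise ValueError("zero parallax: distance is undetermined")
    return baseline * focal_length / parallax + focal_length
```

For example, with B = 100, f = 10 and a parallax of 1 (same unit), `stereo_distance(2.0, 3.0, 100.0, 10.0)` returns 1010.0. Because the parallax appears in the denominator, a small detection error in the parallax produces a large error in D for distant objects.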
The distance-measuring principle of the zoom distance-measuring method is next described using FIG. 27. In the drawing, $D$, $f_w$, $f_t$, $h_w$, $h_t$, and $H$ represent an object distance, a focal length upon wide-angle, a focal length upon telephoto, an image position of an object upon wide-angle, an image position of the object upon telephoto, and a position of the object, respectively. From the geometric relation of the drawing, the following relations hold.
$$h_w : f_w = H : (D - f_w) \tag{3}$$
$$h_t : f_t = H : (D - f_t) \tag{4}$$
Eliminating $H$ and solving these simultaneous Equations (3) and (4) with respect to $D$, the following is obtained.
$$D = \frac{(h_w - h_t) \cdot f_w \cdot f_t}{(h_w \cdot f_t) - (h_t \cdot f_w)} \tag{5}$$
When the focal lengths in zooming are known, the two unknowns in the above Equation (5) are $h_w$ and $h_t$. Thus, as in the above stereo distance-measuring method, the positions of the object on the image pickup plane need to be correctly detected by image processing.
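Equation (5) can be checked numerically with a short Python sketch. The function name `zoom_distance` and the parameter names are hypothetical; the two image positions are assumed to have been measured already and to share one length unit with the focal lengths.

```python
def zoom_distance(h_wide, h_tele, f_wide, f_tele):
    """Object distance D from Equation (5):
        D = (h_w - h_t) * f_w * f_t / (h_w * f_t - h_t * f_w)

    h_wide and h_tele are the image positions of the same object point
    at the wide-angle and telephoto settings; f_wide and f_tele are the
    corresponding focal lengths.
    """
    denominator = h_wide * f_tele - h_tele * f_wide
    if denominator == 0:
        # Occurs, for example, for a point on the optical axis (h = 0).
        raise ValueError("degenerate measurement: distance is undetermined")
    return (h_wide - h_tele) * f_wide * f_tele / denominator
```

A simple consistency check is to generate the image positions from Equations (3) and (4) for a chosen H and D and confirm that the function returns the chosen D.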
Among the above-described stereo distance-measuring methods (also called multi-viewpoint image pickup methods or multi-viewpoint distance-measuring methods), two distance-measuring principles are next described using FIG. 26 and FIG. 28: that of the normal stereo distance-measuring method with two image pickup apparatus (hereinafter referred to as cameras) horizontally arranged, and that of the front-to-back (parallax) stereo distance-measuring method utilizing image data picked up while moving a single camera back and forth along the optical axis. The former corresponds to the basic principle of multi-viewpoint image pickup methods of a horizontal/vertical plane parallax type, whereas the latter corresponds to the basic principle of multi-viewpoint image pickup methods of a front-to-back parallax type.
Here, the above "multi-viewpoint image pickup method of the horizontal/vertical plane parallax type" recognizes an object according to the triangulation rule using images picked up from different visual points located in parallel with the object, for example as effected by the human eyes or stereo image pickup apparatus. For brevity, this method will be referred to simply as a "stereo image pickup method" or "stereo distance-measuring method." On the other hand, the above "multi-viewpoint image pickup method of the front-to-back parallax type" is a method using images picked up from different visual points located back and forth along the optical axis, without changing the optical axis of the camera. Hereinafter, this technique will be referred to as a "front-to-back stereo image pickup method" or "front-to-back stereo distance-measuring method" for brevity.
FIG. 26 is a drawing for illustrating the basic principle of the stereo distance-measuring method, i.e., the principle of triangulation, as described above. In the drawing, the symbols $D$, $f$, $B$, $h_L$, and $h_R$ represent the object distance, the focal length, the baseline, the image position of object (target object) in the left pickup system, and the image position of object in the right pickup system, respectively. From the drawing, it is obvious that the following geometric relation holds. Namely,
$$(h_R - h_L) : f = B : (D - f) \tag{11}$$
Solving Equation (11) with respect to the object distance $D$, the following is obtained.
$$D = \frac{B \cdot f}{h_R - h_L} + f \tag{12}$$
In Equation (12), $B$ and $f$ are known constants, and thus, for obtaining the object distance $D$, it is necessary to detect the parallax $(h_R - h_L)$ on the image pickup plane. Thus, normally performed is the image processing such as the matching method or the gradient method, based on correlation calculation (cf. the bulletin of Japanese Patent Publication No. 60-46878).
Next described referring to FIG. 28 is the distance-measuring principle of the above front-to-back (parallax) stereo distance-measuring method. In FIG. 28, there are symbols $H$, $D$, $D'$, $h$, $h'$, and $f$, among which $H$ is a height from a gazed point on the object to the optical axis, and $D$, $D'$ are distances between the object and image pickup planes, wherein $D$ is a distance with the camera at a near point to the object and $D'$ a distance with the camera at a far point to the object. Further, $h$, $h'$ are image points on the image pickup planes at $D$ and $D'$, respectively, and $f$ is the focal length. From the geometric relation in FIG. 28, the following relations hold.
$$H : (D - f) = h : f \tag{13}$$
$$H : (D' - f) = h' : f \tag{14}$$
From Equation (13), the following equation is obtained.
$$H \cdot f = h \cdot (D - f) \tag{15}$$
From Equation (14), the following equation is obtained.
$$H \cdot f = h' \cdot (D' - f) \tag{16}$$
Here, from (15)=(16), the following is obtained.
$$h \cdot (D - f) = h' \cdot (D' - f) \tag{17}$$
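Let us here suppose that the moving amount of the camera (front-to-back difference) $(D' - D)$ is known. Letting $B$ be the front-to-back difference $(D' - D)$, so that $D' = D + B$, and solving Equation (17) with respect to $D$, the following steps result.
$$h \cdot (D - f) = h' \cdot (D + B - f)$$
$$(h - h') \cdot D = B \cdot h' + (h - h') \cdot f$$
Then
$$D = \frac{B \cdot h'}{h - h'} + f \tag{18}$$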
From Equation (18), the distance D to the object can be thus calculated if a moving amount (h-h') of the gazed point on the image pickup plane is obtained.
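Equation (18) can likewise be illustrated with a minimal Python sketch. The function name `front_back_distance` and its parameter names are hypothetical, and the image points at the near and far camera positions are assumed to have been measured already in the same unit as the focal length.

```python
def front_back_distance(h_near, h_far, move, focal_length):
    """Object distance D at the near camera position, from Equation (18):
        D = B * h' / (h - h') + f

    h_near is the image point h with the camera at the near position,
    h_far is the image point h' at the far position, move is the known
    front-to-back difference B = D' - D, and focal_length is f.
    """
    shift = h_near - h_far
    if shift == 0:
        # No movement of the image point, e.g. a point on the optical axis.
        raise ValueError("no image shift: distance is undetermined")
    return move * h_far / shift + focal_length
```

Generating h and h' from Equations (13) and (14) for a chosen H, D and B and passing them to the function should return the chosen D, which is a convenient way to check the derivation.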
The above description showed the difference between the distance-measuring principles of the left-to-right parallax stereo distance-measuring method and the front-to-back parallax stereo distance-measuring method using FIG. 26 and FIG. 28. From Equations (12) and (18) it is seen that either case results in a correspondence problem of each point between left and right images or between front and back images.