1. Field of the Invention
The present invention relates to a method for extracting corresponding points in plural images, and more particularly to a method for extracting corresponding points in plural images by obtaining a parallax line through local calculations, involving excitatory and inhibitory couplings based on the real pixels of the plural images, at each crossing point on a parallax image plane formed by epipolar lines respectively extracted from the plural images.
2. Related Background Art
As image processing technologies employing a multiple-eye image taking system, there are already known a method of extracting three-dimensional information based on trigonometry, and a method of synthesizing a high precision image. Such method of synthesizing a high precision image utilizes, for attaining a higher resolution, for example:
(1) a method of increasing the resolution spatially by swinging a solid-state image pickup device;
(2) a method of dividing the entering light, taking portions of the object image respectively with plural solid-state image pickup devices and synthesizing these areas; and
(3) a method of utilizing an optical shift by double refraction. However, each of these methods (1) to (3) depends on the time-spatial division of the entering light, and results in a loss in the S/N ratio. For this reason, the present inventors have proposed a method of obtaining a higher resolution utilizing two already known image pickup devices, and a method of forming a high fine image from two images taken from different angles. The method of obtaining a high fine image can be divided into:
(1) a process of estimating the corresponding relationship of the images to input a value into each corresponding position; and
(2) a process of transforming the pixels, obtained with positional deviations by the above-mentioned process (1), into a form as if obtained by sampling at equal intervals (namely as if obtained with an image pickup device of a higher resolution).
Among these, the process (1) can be considered as equivalent to the extraction of corresponding points in the stereoscopic method with both eyes.
FIG. 18 is a view showing the principle of trigonometry utilized for obtaining an image at a distance. In the following description, the sensors of the right-side camera and of the left-side camera are assumed to be positioned on positive planes, unless otherwise specified.
Trigonometry is used to determine the three-dimensional coordinate of a point P on an object, by taking the images of the object in three-dimensional space with two (right- and left-side) cameras and using the projection points P.sub.R, P.sub.L thereof respectively on the sensor planes A.sub.SR, A.sub.SL of the right- and left-side cameras, whose lenses have center points O.sub.R, O.sub.L. In the following description, the following definitions will be used:
(1) "Baseline B" is defined by a line connecting the center points O.sub.R and O.sub.L of the lenses of the right- and left-side cameras;
(2) "Baseline length L.sub.B " is defined by the length of the baseline B;
(3) "Epipolar plane Ae" is defined by the plane formed by connecting the point P on the object and the projection points P.sub.R and P.sub.L ; and
(4) "Epipolar line (visual axis image) L.sub.eR " is defined by the crossing line of the epipolar plane Ae and the sensor plane A.sub.SR of the right-side camera, and "Epipolar line L.sub.eL " is defined by the crossing line of the epipolar plane Ae and the sensor plane A.sub.SL of the left-side camera.
As shown in FIG. 19, the origin O(0, 0, 0) is taken at the middle point of the baseline B; the x-axis is taken along the baseline B; the y-axis (not shown) is taken perpendicularly to the plane of the drawing paper; the z-axis is taken perpendicularly to the baseline B; the focal length of the lenses of the right- and left-side cameras is taken as f; and the coordinates of the point P on the object and of the projection points P.sub.L, P.sub.R are taken respectively as (x.sub.P, y.sub.P, z.sub.P), (x.sub.PL, y.sub.PL, z.sub.PL) and (x.sub.PR, y.sub.PR, z.sub.PR). If the optical axes of the right- and left-side cameras are perpendicular to the baseline B as shown in FIG. 19 (namely if said two optical axes are mutually parallel), the following relationships hold: EQU (x.sub.PL +L.sub.B /2)/f=(x.sub.P +L.sub.B /2)/z.sub.P (1.1) EQU (x.sub.PR -L.sub.B /2)/f=(x.sub.P -L.sub.B /2)/z.sub.P (1.2) EQU y.sub.PL /f=y.sub.PR /f=y.sub.P /z.sub.P (1.3) EQU (L.sub.B +x.sub.PL -x.sub.PR)/f=L.sub.B /z.sub.P (1.4)
Consequently, the coordinate (x.sub.P, y.sub.P, z.sub.P) of the point P on the object can be determined from: EQU x.sub.P =L.sub.B .multidot.{(x.sub.PL +x.sub.PR)/2}/(L.sub.B +x.sub.PL -x.sub.PR) (2.1) EQU y.sub.P =L.sub.B .multidot.{(y.sub.PL +y.sub.PR)/2}/(L.sub.B +x.sub.PL -x.sub.PR) (2.2) EQU z.sub.P =L.sub.B .multidot.f/(L.sub.B +x.sub.PL -x.sub.PR) (2.3)
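Relations (2.1) to (2.3) can be sketched directly in code. The function below is an illustrative implementation for the parallel-axis case only; all symbols (baseline length L.sub.B, focal length f, projection coordinates) follow the text, while the function name and the numeric example are assumptions for the sketch.

```python
def triangulate_parallel(x_pl, y_pl, x_pr, y_pr, baseline, f):
    """Recover (x_P, y_P, z_P) of the object point from the left and right
    projection coordinates, for optical axes perpendicular to the baseline
    (equations (2.1)-(2.3))."""
    denom = baseline + x_pl - x_pr          # L_B + x_PL - x_PR
    x_p = baseline * ((x_pl + x_pr) / 2) / denom   # (2.1)
    y_p = baseline * ((y_pl + y_pr) / 2) / denom   # (2.2)
    z_p = baseline * f / denom                     # (2.3)
    return x_p, y_p, z_p
```

For instance, with L.sub.B =2 and f=1, a point P=(1, 2, 5) projects to x.sub.PL =-0.6, x.sub.PR =1.0 and y.sub.PL =y.sub.PR =0.4 by relations (1.1) to (1.3), and the function recovers (1, 2, 5).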
On the other hand, if the optical axes of the right- and left-side cameras are inclined to the baseline B by a certain angle .theta., the following relationships hold: EQU (x.sub.PL +L.sub.B /2)/z.sub.PL =(x.sub.P +L.sub.B /2)/z.sub.P (3.1) EQU (x.sub.PR -L.sub.B /2)/z.sub.PR =(x.sub.P -L.sub.B /2)/z.sub.P (3.2) EQU y.sub.PL /z.sub.PL =y.sub.PR /z.sub.PR =y.sub.P /z.sub.P (3.3) EQU L.sub.B /z.sub.P ={(x.sub.PL +L.sub.B /2)-(z.sub.PL /z.sub.PR)(x.sub.PR -L.sub.B /2)}/z.sub.PL (3.4)
where .vertline.x.sub.PR .vertline..gtoreq..vertline.x.sub.PL .vertline.; and EQU L.sub.B /z.sub.P ={-(x.sub.PR -L.sub.B /2)+(z.sub.PR /z.sub.PL)(x.sub.PL +L.sub.B /2)}/z.sub.PR (3.5)
where .vertline.x.sub.PR .vertline.<.vertline.x.sub.PL .vertline.; and EQU z.sub.PR =(x.sub.PR -L.sub.B /2).multidot.tan(.theta.)+f.multidot.cos(.theta.) (3.6) EQU z.sub.PL =-(x.sub.PL +L.sub.B /2).multidot.tan(.theta.)+f.multidot.cos(.theta.) (3.7)
Consequently the coordinate (x.sub.P, y.sub.P, z.sub.P) of the point P on the object can be determined from the foregoing relations (3.1) to (3.7).
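For the inclined-axis case, the depth z.sub.P follows from relations (3.4) to (3.7). The sketch below is illustrative only; it selects between (3.4) and (3.5) by the condition stated in the text, and at .theta.=0 it reduces to relation (2.3) of the parallel-axis case.

```python
import math

def depth_verged(x_pl, x_pr, baseline, f, theta):
    """z_P for optical axes inclined by the angle theta (equations
    (3.4)-(3.7)); theta = 0 reduces to the parallel-axis case."""
    z_pr = (x_pr - baseline / 2) * math.tan(theta) + f * math.cos(theta)   # (3.6)
    z_pl = -(x_pl + baseline / 2) * math.tan(theta) + f * math.cos(theta)  # (3.7)
    if abs(x_pr) >= abs(x_pl):   # relation (3.4)
        lb_over_zp = ((x_pl + baseline / 2)
                      - (z_pl / z_pr) * (x_pr - baseline / 2)) / z_pl
    else:                        # relation (3.5)
        lb_over_zp = (-(x_pr - baseline / 2)
                      + (z_pr / z_pl) * (x_pl + baseline / 2)) / z_pr
    return baseline / lb_over_zp
```

With the same example values as before (x.sub.PL =-0.6, x.sub.PR =1.0, L.sub.B =2, f=1) and .theta.=0, the function returns z.sub.P =5, in agreement with relation (2.3).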
The above-explained trigonometry makes it possible to determine the distance to the object from two images obtained by a multi-eye image taking system consisting of right- and left-side image taking systems. The principle of trigonometry is based on the premise that the projection point P.sub.R on the sensor plane A.sub.SR of the right-side camera and the projection point P.sub.L on the sensor plane A.sub.SL of the left-side camera are obtained from the same single point P. It is therefore necessary to extract the projection point P.sub.R on the sensor plane A.sub.SR of the right-side camera, corresponding to the projection point P.sub.L on the sensor plane A.sub.SL of the left-side camera, and, as a result, for obtaining the distance information with the multi-eye image taking system, the method of extracting the corresponding points becomes critical. Representative examples of such methods include the template matching method already employed at manufacturing sites, and the cooperative algorithm based on visual processing:
(1) Template matching method:
The template matching method determines the corresponding point by conceiving a template surrounding an arbitrary point of the left image formed on the sensor plane A.sub.SL of the left-side camera, and comparing the similarity of the right image, formed on the sensor plane A.sub.SR of the right-side camera, with respect to the image in the template. The similarity can be compared by the following two methods:
(a) SSDA (Sequential similarity detection algorithm):
In this SSDA method, the difference between the pixel value E.sub.L of the image in the template in the left image and the pixel value E.sub.R in the right image to be searched is added up, as shown in the equation (4.1), for all the pixels on the epipolar line L.sub.eL of the left image and all the pixels on the epipolar line L.sub.eR of the right image, and the coordinate of the corresponding point is obtained where the calculated sum E(x, y) becomes minimum. ##EQU1##
In this SSDA method, if the sum of the differences of the pixel values in the course of calculation becomes larger than the minimum value obtained in the already-completed calculations for other coordinates, the calculation under way can be terminated and the process shifted to the next coordinate value; it is therefore possible to shorten the calculating time by thus dispensing with the unnecessary calculations.
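The SSDA search with this early-termination shortcut can be sketched as follows for a single pair of epipolar lines. The one-dimensional pixel arrays and the template half-width `half` are assumptions for the illustration, not values given in the text.

```python
def ssda_match(left, right, center, half=2):
    """Return the index on the right epipolar line whose window best matches
    the template centered at `center` on the left epipolar line, summing
    absolute pixel differences and aborting a candidate as soon as its
    running sum can no longer beat the current minimum (SSDA)."""
    template = left[center - half:center + half + 1]
    best_sum, best_idx = float("inf"), None
    for i in range(half, len(right) - half):
        window = right[i - half:i + half + 1]
        s = 0
        for a, b in zip(template, window):
            s += abs(a - b)
            if s >= best_sum:      # early termination: cannot beat the minimum
                break
        else:
            best_sum, best_idx = s, i
    return best_idx
```

For example, a bright blob at index 3 of the left line that reappears at index 6 of the right line is located there by the search.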
(b) Correlation method:
The correlation method calculates the correlational value .rho.(x, y) between the pixel values E.sub.L of the image in the template in the left image and the pixel values E.sub.R in the right image to be searched, as shown in the following equation (4.2), and determines the coordinate of the corresponding point where the calculated correlation .rho.(x, y) becomes largest. The normalized correlation in the equation (4.2) provides a maximum value "1". ##EQU2##
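A normalized correlation of this kind can be sketched as below. Whether equation (4.2) subtracts the mean values is not visible in the text, so the zero-mean form used here is an assumption; it returns the stated maximum value 1 for a perfect (linearly scaled) match.

```python
import math

def normalized_correlation(template, window):
    """Normalized correlation of two equal-length pixel sequences
    (cf. equation (4.2)); 1.0 indicates a perfect match."""
    mt = sum(template) / len(template)
    mw = sum(window) / len(window)
    num = sum((a - mt) * (b - mw) for a, b in zip(template, window))
    den = math.sqrt(sum((a - mt) ** 2 for a in template)
                    * sum((b - mw) ** 2 for b in window))
    return num / den if den else 0.0
```

The corresponding point is then taken where this value, scanned along the epipolar line, is largest.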
(2) Cooperation Method (Cooperative algorithm)
The cooperative algorithm, proposed by David Marr, is to obtain a parallax line based on the following three rules (D. Marr, Vision: A Computational Investigation into the Human Representation and Processing of Visual Information, W. H. Freeman & Co., San Francisco, Calif., 1982):
Rule 1 (Matchability): A black point can be matched only with a black point;
Rule 2 (Uniqueness): A black point in an image can almost always be matched with only one black point in the other image; and
Rule 3 (Continuity): The parallax of the matching points varies smoothly over almost the entire range.
According to Marr, a device for extracting the corresponding points, exploiting the cooperative algorithm, consists of a network in which a multitude of processing units are connected in parallel and in a mutual manner, with a small processing unit placed at each crossing point (nodal point) shown in FIG. 21A. If a nodal point represents a proper pair of black points, the processing unit at said nodal point eventually has a value "1", but, if it represents an improper pair (an erroneous target), the processing unit eventually has a value "0".
The rule 2 allows only a single correspondence along each horizontal or vertical line. Thus, all the processing units provided along each horizontal or vertical line are made to suppress one another. This is based on the principle that the competition along each line causes only one processing unit to survive with the value "1" while all other units assume the value "0", thereby satisfying the rule 2.
As the proper pairs tend to be arranged along the broken line according to the rule 3, an excitatory coupling is introduced between the processing units aligned in this direction. This provides each local processing unit with a structure as shown in FIG. 21B. More specifically, an inhibitory coupling is given to the processing units arranged along the vertical line 102 and the horizontal line 101 corresponding to the lines of sight (visual axes) of both eyes, while an excitatory coupling is given to the processing units arranged along the diagonal line 103 corresponding to a constant parallax. This algorithm can be expanded to the case of a two-dimensional image; in such case, the inhibitory couplings remain unchanged while the excitatory coupling is given to a minute two-dimensional region 104 corresponding to a constant parallax, as shown in FIG. 21C.
In such device for extracting the corresponding points, the right and left images are taken at first, then the network of the device is loaded with "1" at all the points where two black points are matched (including the erroneous targets), and with "0" at all other points, and the network is then made to run. Each processing unit sums up the "1"s in its excitatory vicinity and the "1"s in its inhibitory vicinity, and, after applying a suitable weight to each sum, calculates the difference of the sums. The processing unit is set at the value "1" if the obtained result exceeds a certain threshold value, and at "0" otherwise. This algorithm can be represented by the following repetitive relation (5): ##EQU3## where C.sup.t.sub.x,y,d represents the state of a cell, corresponding to a position (x, y), a parallax d and a time t, in the network shown in FIG. 21A; S(x,y,d) represents a local excitatory vicinity; O(x,y,d) represents an inhibitory vicinity; .epsilon. is an inhibitory constant; and G is a threshold function. The initial state C.sup.0 contains all the possible pairs, including the erroneous targets, within a predetermined parallax range, and such pairs are added in each repetitive cycle. (This operation is not essential but causes faster convergence of the algorithm.)
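One update of relation (5) can be sketched as follows for a single epipolar-line pair, where a cell is indexed by (left pixel, right pixel) on the parallax image plane so that constant parallax runs along the 45.degree. diagonal. The neighbourhood sizes, the excitatory weight, the inhibitory constant and the threshold below are illustrative assumptions, not values given in the text.

```python
def cooperative_step(c, c0, excit_w=1.0, inhib_eps=2.0, threshold=3.0):
    """One iteration of the cooperative update: sum the excitatory vicinity
    (along the 45-degree diagonal), subtract the weighted inhibitory vicinity
    (same row and column), add the initial state, and threshold to 0/1."""
    n = len(c)
    new = [[0] * n for _ in range(n)]
    for l in range(n):
        for r in range(n):
            excit = sum(c[l + k][r + k]
                        for k in (-2, -1, 1, 2)
                        if 0 <= l + k < n and 0 <= r + k < n)
            # row + column totals, excluding the cell itself (counted twice)
            inhib = (sum(c[l][j] for j in range(n))
                     + sum(c[i][r] for i in range(n)) - 2 * c[l][r])
            s = excit_w * excit - inhib_eps * inhib + c0[l][r]
            new[l][r] = 1 if s >= threshold else 0
    return new
```

Starting from a diagonal of matches plus one spurious point, the diagonal cells reinforce one another through the excitatory coupling while the spurious point is suppressed by the row/column inhibition.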
In the following there will be given a detailed explanation on the method of extracting corresponding points in plural images, by forming a parallax image plane from two epipolar lines respectively extracted from two binary images, and determining a parallax line by repeating local processing involving excitatory and inhibitory couplings on said parallax image, based on the real pixels of said two binary images, thereby extracting the corresponding points thereof.
At first there will be given an explanation on the parallax line, with reference to FIGS. 22 and 23. When the optical axes of the right- and left-side cameras are perpendicular to the baseline, there is obtained a set of epipolar lines L.sub.eL, L.sub.eR by the projection of the object, as shown in FIG. 22. Then, as shown in FIG. 23, the thus obtained left epipolar line L.sub.eL is placed horizontally, with its pixel a.sub.1L at the left and its pixel a.sub.5L at the right, while the right epipolar line L.sub.eR is placed vertically, with its pixel a.sub.1R at the bottom and its pixel a.sub.5R at the top, and the crossing points b.sub.1 to b.sub.5 of the mutually corresponding pixels (for example pixels a.sub.1L and a.sub.1R) of the epipolar lines L.sub.eL, L.sub.eR are determined. A line connecting said crossing points b.sub.1 to b.sub.5 is called the "parallax line" 114, which becomes, in case of a constant parallax, a straight line with an inclination of 45.degree.. Thus, the variation in the parallax between the left- and right-side images can be clarified by the parallax line 114. Also a plane defined by the two epipolar lines L.sub.eL, L.sub.eR is called the "parallax image plane" 113.
In the following there will be explained the relation between the parallax line and the distance, with reference to FIGS. 24A to 28B. For points a.sub.21 to a.sub.25, shown in FIG. 24A, of a constant parallax, positioned distant from the right- and left-side cameras, a parallax image plane 121 and a parallax line 131 can be similarly determined as shown in FIG. 24B. Also points a.sub.31 to a.sub.34, shown in FIG. 25A, of a constant parallax, positioned closer to the cameras than the above-mentioned points a.sub.21 to a.sub.25, provide similarly a parallax image plane 122 and a parallax line 132 as shown in FIG. 25B. Furthermore, points a.sub.41 to a.sub.43 of a constant parallax, positioned still closer to the cameras than the above-mentioned points a.sub.31 to a.sub.34 as shown in FIG. 26A similarly provide a parallax image plane 123 and a parallax line 133 as shown in FIG. 26B. Furthermore, points a.sub.51, a.sub.52 of a constant parallax, positioned still closer to the cameras than the above-mentioned points a.sub.41 to a.sub.43 as shown in FIG. 27A similarly provide a parallax image plane 124 and a parallax line 134 as shown in FIG. 27B.
The foregoing indicates that, if the optical axes of the right- and left-side cameras are perpendicular to the baseline:
(1) When the points of a constant parallax are located at an infinite distance from the cameras, the obtained parallax line becomes a straight line with an inclination of 45.degree., bisecting the parallax image plane; and
(2) As the points of a constant parallax come closer to the cameras, the obtained parallax line remains a straight line with an inclination of 45.degree., but is positioned closer to the lower right corner of the parallax image plane.
Furthermore, points a.sub.61 to a.sub.66 as shown in FIG. 28A provide, in a similar manner, a parallax image plane 125 and a parallax line 135 as shown in FIG. 28B. The obtained parallax line 135 proceeds, from the lower left corner of the parallax image plane 125, toward the upper right corner along the parallax line 131 shown in FIG. 24B, then shifts to the parallax line 132 shown in FIG. 25B, and again proceeds toward the upper right corner along the parallax line 131 in FIG. 24B. Consequently, for an object with irregular surface, there can be obtained a parallax line corresponding to the surface irregularity.
The foregoing results indicate that the distance to the object can be determined from the coordinates of the parallax line, if the trigonometrically determined distance data are retained in the coordinates of the parallax image plane.
In the following there will be explained, with reference to a flow chart shown in FIG. 29, an example of corresponding point extraction by the cooperative algorithm from two extremely similar binary images, for example Julesz's random dot stereogram (cf. David Marr, translated by Inui et al., Vision (Computation Theory of Vision and In-brain Representation), Sangyo Tosho).
After two binary images involving parallax, such as a random dot stereogram, are entered by a multi-eye image taking system (step S1), an arbitrary set of epipolar lines L.sub.eL, L.sub.eR is extracted from said two binary images (step S2). Then the extracted epipolar lines L.sub.eL, L.sub.eR are arranged as shown in FIG. 30 (step S3). More specifically, the left epipolar line L.sub.eL is placed horizontally, with its left-hand end 141.sub.L at the left and its right-hand end 141.sub.R at the right, while the right epipolar line L.sub.eR is placed vertically, with its left-hand end 142.sub.L at the bottom and its right-hand end 142.sub.R at the top. Subsequently black points are generated at all the crossing points, on the parallax image plane 143, of "black" pixels on the left epipolar line L.sub.eL and those on the right epipolar line L.sub.eR, whereby an initial image frame 144, representing the initial value of the parallax image plane 143, is prepared as shown in FIG. 31 (step S4). Then, each black point in the thus prepared initial image frame 144 is subjected to a local processing, involving excitatory and inhibitory couplings, based on the real pixels (step S5). In this operation, the excitatory coupling based on the real pixels is applied to the crossing points present in an ellipse 150, as shown in FIG. 32, having its center at an arbitrary black point Q in the initial image frame 144, a longer axis 151 along a line of 45.degree. toward the upper right, and a shorter axis 152 along a line of -45.degree. toward the lower right. Also the inhibitory coupling based on the real pixels is applied to the crossing points present along the horizontal or vertical direction with respect to the black point Q.
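The initial image frame of step S4 amounts to a logical AND over all crossings of the two binary epipolar lines: a crossing point is black exactly when both of its pixels are black. A minimal sketch, with the function name and the example lines chosen for illustration:

```python
def initial_frame(left_line, right_line):
    """Initial parallax image frame (step S4): rows are indexed by the
    right epipolar line, columns by the left one, as in FIG. 30; an entry
    is 1 (black) iff both corresponding pixels are black."""
    return [[l & r for l in left_line] for r in right_line]
```

Every black crossing, including the erroneous targets, is thus present in the initial frame; the subsequent local processing of step S5 is what prunes the erroneous ones.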
Then, each crossing point in the initial image frame 144, subjected to the local processing involving the excitatory and inhibitory couplings based on the real pixels, is subjected to a predetermined processing, for example utilizing a binary threshold function as shown in FIG. 33 (corresponding to the threshold function G in the foregoing equation (5)), whereby a new parallax image frame is prepared (step S6). Then it is discriminated whether the processes of the steps S5 and S6 have been executed a predetermined number of times (step S7), and, if not, the processes of the steps S5 and S6 are repeated, employing the new parallax image frame, prepared in the foregoing step S6, as the initial image frame. As a result, a parallax line is finally obtained on the parallax image frame prepared in the step S6, and the corresponding points can be extracted utilizing said parallax line.
Instead of the discrimination in the step S7, the processes of the steps S5 and S6 may be repeated, utilizing the new parallax image frame prepared in the preceding step S6 as the initial image frame, until the value of each crossing point therein converges. Also on the two binary images taken in the step S1, the parallax line can be obtained on another set of epipolar lines L.sub.eL, L.sub.eR by repeating the processes of the steps S2 to S7.
Now, there will be given a detailed explanation, with reference to FIG. 34, on the method of extraction of a set of epipolar lines in the step S2 in FIG. 29.
Referring to FIG. 34, the lenses of the two (left-side and right-side) cameras have respective centers at O.sub.L, O.sub.R ; and the coordinate on the sensor plane A.sub.SL of the left-side camera is represented by (x.sub.L, y.sub.L, z.sub.L); while that on the sensor plane A.sub.SR of the right-side camera is represented by (x.sub.R, y.sub.R, z.sub.R). The axes z.sub.L, z.sub.R are so selected as to coincide with the respective optical axes.
By representing the unit vectors of the axes x.sub.L, y.sub.L, z.sub.L, x.sub.R, y.sub.R and z.sub.R respectively by: EQU i.sub.L, j.sub.L, k.sub.L, i.sub.R, j.sub.R, k.sub.R
also the distance from the sensor plane A.sub.SL of the left-side camera to the center O.sub.L of the lens thereof by f.sub.L ; and the distance from the sensor plane A.sub.SR of the right-side camera to the center O.sub.R of the lens thereof by f.sub.R ; the vector: EQU P.sub.L
of the projection point P.sub.L of a point P of the object onto the sensor plane A.sub.SL of the left-side camera, and the vector: EQU P.sub.R
of the corresponding projection point P.sub.R on the sensor plane A.sub.SR of the right-side camera can be respectively represented by: EQU P.sub.L =O.sub.L +x.sub.L i.sub.L +y.sub.L j.sub.L +f.sub.L k.sub.L (6.1) EQU P.sub.R =O.sub.R +x.sub.R i.sub.R +y.sub.R j.sub.R +f.sub.R k.sub.R (6.2)
Also the relative position vector: EQU V.sub.L
of the projection point P.sub.L with respect to the center O.sub.L of the lens of the left-side camera, and the relative position vector: EQU V.sub.R
of the projection point P.sub.R with respect to the center O.sub.R of the lens of the right-side camera, can be respectively represented by: EQU V.sub.L =P.sub.L -O.sub.L (6.3) EQU V.sub.R =P.sub.R -O.sub.R (6.4)
and the relative position vector: EQU d
between the centers O.sub.L, O.sub.R of the lenses of the left- and right-side cameras (the magnitude of which corresponds to the baseline length) can be represented as: EQU d=O.sub.L -O.sub.R (6.5)
Furthermore the unit normal vector: EQU n.sub.Le
to the epipolar plane A.sub.e, which contains the vectors V.sub.L and d, can be represented as: EQU n.sub.Le =(V.sub.L .times.d)/.vertline.V.sub.L .times.d.vertline. (6.6)
By representing the unit normal vector to the sensor plane A.sub.SR of the right-side camera by: EQU k.sub.R
the right epipolar line L.sub.eR is perpendicular to both n.sub.Le and k.sub.R, so that the unit vector: EQU e.sub.Re
in this direction can be represented as: EQU e.sub.Re =(n.sub.Le .times.k.sub.R)/.vertline.n.sub.Le .times.k.sub.R .vertline. (6.8)
Consequently, with the vector: EQU P.sub.R
of the projection point P.sub.R on the sensor plane A.sub.SR of the right-side camera, the right epipolar line L.sub.eR can be represented as: EQU P.sub.R +.beta.e.sub.Re ( 6.9)
A similar representation stands also for the left epipolar line L.sub.eL.
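Under the definitions above, the direction of the right epipolar line follows from two cross products: n.sub.Le normal to the epipolar plane spanned by V.sub.L and d, then e.sub.Re perpendicular to both n.sub.Le and k.sub.R. The sketch below is an illustrative numpy implementation; the normalization steps are reconstructed from the stated perpendicularity and unit-vector conditions, and the numeric test case (parallel cameras) is an assumption for the example.

```python
import numpy as np

def right_epipolar_direction(o_l, o_r, p_l, k_r):
    """Unit direction vector e_Re of the right epipolar line L_eR.

    o_l, o_r: lens centers O_L, O_R; p_l: projection point P_L on the
    left sensor plane; k_r: unit normal to the right sensor plane."""
    v_l = p_l - o_l                      # relative position vector V_L (6.3)
    d = o_l - o_r                        # baseline vector d (6.5)
    n_le = np.cross(v_l, d)
    n_le = n_le / np.linalg.norm(n_le)   # unit normal to the epipolar plane
    e_re = np.cross(n_le, k_r)
    return e_re / np.linalg.norm(e_re)   # perpendicular to n_Le and k_R
```

For two cameras with parallel optical axes along z and a baseline along x, the returned direction lies along the x-axis, i.e. the right epipolar line is horizontal on the sensor, as expected.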