1. Field of the Invention
The present invention relates to an image processing process for extracting three-dimensional features of an object.
It is necessary to spatially measure a distance to an obstacle or an object when a robot moves to avoid the obstacle or precisely manipulate the object.
The field of view (50 degrees) of conventional robots is insufficient, in particular when the robots move in a narrow environment such as an industrial plant or a warehouse. Three-dimensional measurement that recognizes the environmental conditions must be performed with a field of view comparable to that of the human eye (180 degrees). A fisheye lens is used to perform measurement with such a wide field of view. However, images obtained through a fisheye lens are distorted, so it is difficult to process them precisely, and special image processing is required.
Since almost all the objects to be manipulated and the environmental conditions in an industrial plant or in a warehouse are artificial, they are constituted basically by straight lines and cylinders for ease in manufacture. Therefore, the objects are imaged as straight lines on an input screen. Thus, preprocessing of an image for precisely extracting a line segment is indispensable for movement or operations of the robots. The robots can perform operations such as avoiding an obstacle or approaching an object by using the line segment as a clue.
The function for precisely extracting a line segment from an image of a wide field of view (fisheye lens image) is indispensable for movement or operations of the robots in a narrow environment such as in an industrial plant or in a warehouse.
2. Description of the Related Art
2.1 The applicants have already invented and proposed a three-dimensional (stereoscopic) moving view, i.e., kinetic stereopsis (see, for example, Japanese Examined Patent No. 3-52106, or Kawakami in Kagaku Asahi, June 1987). In kinetic stereopsis, three-dimensional (stereoscopic) perception is obtained from the motion parallax caused by movement: the image obtained while moving a fisheye lens camera is processed on a sphere to perform three-dimensional (stereoscopic) measurement of a line segment, a point, a cylinder, and the like. Thus, a line segment can be three-dimensionally (stereoscopically) measured over the range of human sight (180 degrees).
FIGS. 1A and 1B are diagrams for explaining a spherical mapping. The image input through a fisheye lens is equivalent to an image obtained by projection onto a sphere, and is distorted. Therefore, an operation called a spherical mapping (more precisely, a polar transformation or a dual transformation on a sphere) is required. The spherical mapping is an operation wherein an arbitrary point P on a sphere is transformed to the great circle R (a largest circle on the sphere, corresponding to an equator) having its pole at the point P, as indicated in FIG. 1A. Let points P1', P2', P3', . . . be obtained by projecting onto the sphere the points P1, P2, P3, . . . which constitute a line segment L. When great circles R1, R2, R3, . . . respectively having their poles at the points P1', P2', P3', . . . are drawn, the great circles necessarily cross at a point S, as indicated in FIG. 1B. The intersecting point S is a characteristic point having a one-to-one correspondence to the line segment L. The longer the line segment L is, the larger the number of points in the line segment L, the larger the number of great circles, and therefore the higher the degree of superimposition of the great circles at the point S. Thus, a line segment is extracted as a point corresponding to the line segment on the sphere, and the length of the line segment can be measured by obtaining a histogram of the degree of superimposition of the great circles at respective points. Geometrically, the point S corresponding to the line segment L can be expressed as "a pole having as a polar line (great circle) a projection L' of a line segment L onto a sphere".
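The property that all great circles R1, R2, R3, . . . pass through one point S can be checked numerically. The following is a minimal sketch, assuming a unit sphere centred at the camera; the end points of the line segment and the helper names `project_to_sphere` and `line_pole` are hypothetical and chosen only for illustration.

```python
import numpy as np

def project_to_sphere(p):
    """Project a 3-D point onto the unit sphere centred at the origin."""
    p = np.asarray(p, dtype=float)
    return p / np.linalg.norm(p)

def line_pole(a, b):
    """Pole S of the great circle onto which the line through a and b projects.

    The sign of S is arbitrary: S and -S describe the same great circle.
    """
    s = np.cross(a, b)
    return s / np.linalg.norm(s)

# Hypothetical line segment L, sampled at points P1, P2, P3, ...
a = np.array([1.0, -2.0, 4.0])
b = np.array([3.0, 1.0, 5.0])
samples = [a + t * (b - a) for t in np.linspace(0.0, 1.0, 7)]

S = line_pole(a, b)

# The great circle R_i with pole P_i' is the set {x : |x| = 1, x . P_i' = 0}.
# S therefore lies on every R_i exactly when S . P_i' = 0 for all i.
residuals = [abs(np.dot(S, project_to_sphere(p))) for p in samples]
print("largest |S . P_i'| =", max(residuals))
```

Every residual vanishes up to rounding error, which is the statement that the great circles cross at S.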
FIG. 2 is a diagram illustrating the construction of a system for performing three-dimensional (stereoscopic) measurement by kinetic stereopsis using a spherical mapping. When an image IMG is input from a fisheye lens built into the spherical camera 1, the contour extracting portion 2 extracts a contour, compresses the information, and writes the information in a spherical mapping image memory 2a built into the contour extracting portion 2. The contour of an object can be extracted by differentiating the image and detecting points where the change in brightness is maximized.
Next, the line segment extracting portion 3 extracts a line segment on the sphere (hereinafter called a great circle) by concentrating the line segment into a point in the spherical mapping process. The process for extracting a line segment is an important process in the three-dimensional (stereoscopic) measurement system, and the major portion of the processing time is spent on this process. For example, to extract the line segment, the polar transformation portion 3a transforms each contour point to a great circle by the spherical mapping, and writes information on each great circle into the mapping memory 3b. The address of the mapping memory 3b is given by the longitude α and latitude β indicating a point P on a sphere CB, as indicated in FIG. 3. Each cell designated by the address in the mapping memory 3b is constituted by, for example, a counter, and the counter is incremented by one every time a writing operation is carried out. After all the contour points are transformed to great circles by the spherical mapping, the S point detection portion 3c scans the respective cells in the mapping memory 3b, and obtains a peak position of the count values. The peak position is the pole (point S) of the line segment as explained with reference to FIG. 1B. Thus, a line segment is extracted.
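The voting procedure carried out by the polar transformation portion 3a, the mapping memory 3b, and the S point detection portion 3c can be sketched in software as follows. This is an illustrative sketch under several assumptions that are not taken from the description above: the memory has 1-degree cells, each great circle is sampled at 1440 points, a cell receives at most one vote per great circle, and the contour points are synthetic points lying on the projection of a hypothetical line whose pole is `S_true`.

```python
import numpy as np

N_LON, N_LAT = 360, 180  # assumed memory size: 1-degree cells in longitude and latitude

def to_cell(x):
    """Map unit vectors of shape (n, 3) to (longitude, latitude) cell indices."""
    lon = np.degrees(np.arctan2(x[:, 1], x[:, 0]))             # -180 .. 180
    lat = np.degrees(np.arcsin(np.clip(x[:, 2], -1.0, 1.0)))   # -90 .. 90
    i = np.clip(np.floor(lon + 180.0).astype(int), 0, N_LON - 1)
    j = np.clip(np.floor(lat + 90.0).astype(int), 0, N_LAT - 1)
    return i, j

def cell_center(i, j):
    """Unit vector at the centre of cell (i, j)."""
    lon = np.radians(i + 0.5 - 180.0)
    lat = np.radians(j + 0.5 - 90.0)
    return np.array([np.cos(lat) * np.cos(lon),
                     np.cos(lat) * np.sin(lon),
                     np.sin(lat)])

def great_circle(pole, n=1440):
    """Sample n points of the great circle whose pole is the unit vector `pole`."""
    helper = np.array([0.0, 0.0, 1.0]) if abs(pole[2]) < 0.9 else np.array([1.0, 0.0, 0.0])
    u = np.cross(pole, helper)
    u /= np.linalg.norm(u)
    v = np.cross(pole, u)
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    return np.outer(np.cos(t), u) + np.outer(np.sin(t), v)

def extract_pole(contour_points):
    """Vote every contour point's great circle into the memory and return the peak."""
    memory = np.zeros((N_LON, N_LAT), dtype=int)
    for p in contour_points:
        i, j = to_cell(great_circle(p))
        cells = np.unique(np.stack([i, j], axis=1), axis=0)  # one vote per cell per circle
        memory[cells[:, 0], cells[:, 1]] += 1
    i, j = np.unravel_index(np.argmax(memory), memory.shape)
    return cell_center(i, j), int(memory[i, j])

# Synthetic contour: points on the great circle whose pole is S_true.
S_true = cell_center(210, 110)                    # longitude 30.5 deg, latitude 20.5 deg
contour = great_circle(S_true, n=360)[5:175:5]    # 34 contour points
pole, votes = extract_pole(contour)
print("votes at peak:", votes, " |pole . S_true| =", abs(np.dot(pole, S_true)))
```

Because S and its antipode describe the same great circle, both cells collect the full vote, so the recovered pole is compared with `S_true` only up to sign. The nested loop over contour points and circle samples is also the source of the processing cost of this step.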
Based on the "point" data extracted by the concentration of a line segment as explained above, the subsequent three-dimensional measurement is easily carried out (cf., for example, the international laid-open WO90/16037 by Morita et al.). When the "point" data is input into the line segment measurement portion 4, the three-dimensional data (orientation and distance) of a straight line portion in a screen is output. For example, the spherical camera 1 is moved in a straight direction when measuring the orientation of a line segment. The above-mentioned operation of extracting a line segment is repeated for the successively obtained images. FIG. 4 is a diagram illustrating the relative locations of the line segment and the camera when the spherical camera 1 is moved in a straight direction. The poles S, S', S", . . . on the sphere, corresponding to the relatively moved line segment L, L', L", . . . , line up on a great circle. When the spherical mapping is performed for the respective points and great circles R, R', R", . . . are drawn, the intersecting point Ss of the great circles is a pole of the great circle on which the poles S, S', S", . . . lie. The vector directed from the center of the sphere to the point Ss is parallel to the actual line segment L. Thus, the orientation of the line segment is obtained. Geometrically, the point Ss is the point generated by projecting the vanishing point at infinity on the line segment L onto the sphere. Namely, the orientation of the line segment is determined based on the theory of perspective projection. In practice, the extracted poles S, S', S", . . . are respectively transformed to great circles by the spherical mapping, and the information on the great circles is written in the mapping memory. The mapping memory is then scanned to obtain a peak position of the count values, which gives a vector of the orientation of the line segment. As understood from FIG. 4, the point Ss corresponds to a group of parallel lines.
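The relation between the poles S, S', S", . . . and the point Ss can be illustrated as follows. This is a minimal sketch, assuming a unit sphere centred at the moving camera; the line, the camera path, and the function names are hypothetical. Instead of scanning a mapping memory for a peak, the sketch finds the common intersection of the great circles in a least-squares sense with a singular value decomposition, which is a substitute technique chosen here for brevity.

```python
import numpy as np

def pole_of_line(point_on_line, direction, camera_pos):
    """Pole S of the great circle onto which the line projects, seen from camera_pos."""
    n = np.cross(point_on_line - camera_pos, direction)
    return n / np.linalg.norm(n)

def orientation_from_poles(poles):
    """Unit vector Ss perpendicular to all poles (defined up to sign).

    A point lies on the great circle with pole S exactly when it is
    perpendicular to S, so the common point of all the great circles is the
    right singular vector with the smallest singular value.
    """
    _, _, vt = np.linalg.svd(np.asarray(poles))
    return vt[-1]

direction = np.array([1.0, 2.0, 2.0]) / 3.0   # true orientation of the line L
q = np.array([0.5, -1.0, 4.0])                # a point on the line L
V = np.array([1.0, 0.0, 0.0])                 # direction of camera motion

# Poles S, S', S", ... observed from five camera positions 0.1 apart.
poles = [pole_of_line(q, direction, k * 0.1 * V) for k in range(5)]
Ss = orientation_from_poles(poles)
print("|Ss . direction| =", abs(np.dot(Ss, direction)))
```

The printed value equals 1 up to rounding error, i.e. the vector from the center of the sphere to Ss is parallel to the line segment.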
The principle of measuring the depth to the object is explained below. First, a process of measuring the distance from a camera to a point P on a two-dimensional plane is explained with reference to FIG. 5A. When the camera is moved as C⁰, C¹, C², . . . , the direction in which the point P is viewed varies as Sk⁰, Sk¹, Sk², . . . . When straight lines are drawn in these directions at the respective timings, the lines cross at the point P. Therefore, when the intersecting point P is obtained, the distance from the initial position C⁰ of the camera to the point P of the object is given by the length of the line segment C⁰P. An operation similar to the above is carried out on a sphere. In FIG. 5B, the plane on the right side corresponds to FIG. 5A, and is placed perpendicular to the line segment OΣ. The correspondence between the plane and the sphere is indicated by dashed lines, and the direction of the movement of the camera is indicated by V. It is assumed that the point P is viewed as P⁰, P¹, P², . . . , and is projected on the sphere as Sk⁰, Sk¹, Sk² when the camera is moved by a pitch Δx⁰. The point Σ is a pole of the great circle R generated by mapping the trace of the point P, and is obtained as the intersecting point of the group of great circles obtained from the points Sk⁰, Sk¹, Sk² by the spherical mapping. A time axis (τ-axis) is assumed on the quarter circle from the point Σ to the end point v of the vector V, on a circle R' passing through the points v and Σ, and the point Σ is assumed to correspond to τ = 0, i.e., C⁰. The points on this quarter circle whose distances on the sphere from the point C⁰ (expressed by angles) are equal to τ = arctan(iη) (i = 1, 2, . . .) are denoted by C¹, C², . . . , where η = Δx⁰/R0. This operation amounts to plotting the points C⁰, C¹, C², . . . with a pitch of 1/R0 on the plane on the right side in FIG. 5B. As can be seen by letting i→∞ in the above equation for τ, the end point v corresponds to a point at infinity.
Next, considering that a straight line on a plane corresponds to a great circle on a sphere, the point C⁰ and the point Sk⁰, the point C¹ and the point Sk¹, the point C² and the point Sk², . . . , are connected by great circles, respectively. The great circles thus obtained cross at a point Q. Thus, the distance from the initial position C⁰ of the camera to the point P is given by the product of R0 and the tangent of the length of the arc C⁰Q, where lengths on the sphere are expressed by angles.
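The planar case of FIG. 5A, in which the sight lines from the successive camera positions cross at the point P, can be sketched as below. The sketch covers only the planar construction, not the spherical one; the camera pitch, the position of P, and the function name are hypothetical, and the crossing point is computed by least squares so that the same code would tolerate slightly noisy directions.

```python
import numpy as np

def intersect_sight_lines(origins, directions):
    """Least-squares crossing point of the 2-D lines origin_i + t * direction_i."""
    A = np.zeros((2, 2))
    b = np.zeros(2)
    for c, d in zip(origins, directions):
        d = d / np.linalg.norm(d)
        M = np.eye(2) - np.outer(d, d)   # projector onto the normal of the sight line
        A += M
        b += M @ c
    return np.linalg.solve(A, b)

P_true = np.array([2.0, 5.0])                         # hypothetical object point P
dx = 0.5                                              # hypothetical camera pitch
cams = [np.array([i * dx, 0.0]) for i in range(4)]    # camera positions C0, C1, C2, C3
dirs = [P_true - c for c in cams]                     # viewing directions toward P

P = intersect_sight_lines(cams, dirs)
distance = np.linalg.norm(P - cams[0])                # length of the segment C0-P
print(P, distance)
```

With exact directions the recovered point coincides with P, and the distance from C0 is the length of the segment joining C0 and P.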
Next, when the above "point" data is input into the cylinder measurement portion 5, the three-dimensional data (an orientation, a distance, and a diameter) is output therefrom. As explained before, parallel lines are obtained; the cylinder and its diameter can be obtained from the parallel lines, and the orientation and the distance are obtained in the same way as in the case of the line segment.
Although almost all of the environmental conditions in an artificial environment such as an industrial plant can be measured by the straight lines and the cylinders as above, raw "point" data may be input into the point measurement portion 6 (shown in FIG. 2) to perform three-dimensional measurement of the location of each point in space when it is required to recognize environmental conditions other than the above.
2.2 The above three-dimensional measurement system contains the following problems. One of the problems is to increase the speed of the operation and to reduce the size of the system, and the other is to suppress interference.
As explained above, the major portion of the processing time in the three-dimensional measurement using the fisheye lens is spent on the process of extracting a line segment. The major reason is that each point of an input image is transformed by the spherical mapping to extract the line segment, i.e., each point in the input image is transformed to a great circle on a sphere, which increases the dimension. When the size of the input image is assumed to be N×N, the spherical mapping must transform each point to a great circle having a length N. The amount of processing therefore amounts to N³, which is N times the data amount N² of the input image, and this makes increasing the speed of the operation difficult. Although parallel provision of hardware may increase the speed, it also increases the hardware size. Thus, both an increase in speed and a reduction of hardware size are required at the same time.
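The growth of the processing amount can be made concrete with a short calculation. The image sizes below are illustrative assumptions, and the count is the worst case in which every one of the N×N points is transformed to a great circle of length N.

```python
def mapping_cost(n):
    """Return (number of input points, number of memory writes) for an n x n image."""
    points = n * n          # data amount of the input image
    writes = points * n     # each point is written as a great circle of length n
    return points, writes

for n in (128, 256, 512):
    points, writes = mapping_cost(n)
    print(f"N={n}: points={points}, writes={writes}, ratio={writes // points}")
```

Doubling N multiplies the number of input points by four but the number of writes by eight.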
In the operation of extracting a line segment by the spherical mapping, when another great circle exists in an orientation near that of the great circle corresponding to the object under measurement, the accuracy of the extracted line segment deteriorates due to interference between the two great circles. For example, when ridge-lines AL and BL of solid bodies A and B cross at an angle near 180° as indicated in FIG. 6, these ridge-lines interfere with each other and are detected as an obscure line. Namely, precise line detection is impossible. To perform precise measurement of an object in a complicated environment, the extraction of a line segment with suppressed interference is required together with the increase in speed and reduction of size.
2.3 A robot must recognize the three-dimensional conditions of its environment when it moves or controls its own automatic operation. There are two methods for recognizing the three-dimensional conditions of the environment. One method is the "binocular stereopsis", whereby the depth is measured in accordance with the principle of trigonometrical survey using the binocular parallax between the right and left eyes, and the other is the "kinetic stereopsis", whereby the three-dimensional (stereoscopic) perception is obtained using the motion parallax generated by movement of the viewer. The "binocular stereopsis" has been under development for years. It requires extracting corresponding portions in the images obtained by the right and left eyes, but this extraction is difficult with the conventional technique.
FIG. 7 is a diagram for explaining the principle of the "binocular stereopsis". In FIG. 7, it is assumed that objects are placed at the two points A and B on a plane. Although each eye can recognize only the directions toward the objects A and B, the depth to each of the objects A and B is recognized from the intersecting point at which the directions from the two eyes cross. Namely, as indicated in FIG. 8, the depth D is obtained by the following equation,

$$D = \frac{d}{\tan\rho_L + \tan\rho_R}$$

where d denotes the distance between the two eyes, and $\rho_L$ and $\rho_R$ denote the angles between the direction perpendicular to the line on which the two eyes lie and the lines of sight of the left and right eyes, respectively.
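The depth equation can be checked against a direct geometric construction. In this minimal sketch the eye separation and object position are hypothetical values, and each angle is assumed to be measured from the perpendicular toward the other eye, so that the two tangents add up to d/D.

```python
import math

def binocular_depth(d, rho_left, rho_right):
    """Depth D from eye separation d and the two sight angles (radians)."""
    return d / (math.tan(rho_left) + math.tan(rho_right))

# Hypothetical layout: eyes at (-d/2, 0) and (d/2, 0), object at (x, D_true).
d = 0.1
x, D_true = 0.02, 1.5

# Angles between the perpendicular to the baseline and each line of sight,
# taken positive when the line of sight leans toward the other eye.
rho_L = math.atan2(x + d / 2, D_true)
rho_R = math.atan2(d / 2 - x, D_true)

print(binocular_depth(d, rho_L, rho_R))
```

The computed depth reproduces `D_true`, whichever lateral offset x is chosen between the two eyes.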
However, the directions of the two eyes also cross at other points. Namely, the direction of the left eye seeing the object A crosses the direction of the right eye seeing the object B at the point β. The point β is an untrue (false) point. Similarly, an untrue point α may be generated. These untrue points must be eliminated in the "binocular stereopsis".
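How the untrue points arise can be seen by intersecting every left-eye sight line with every right-eye sight line. The eye and object coordinates below are hypothetical; the pairing of the left eye's view of A with the right eye's view of B plays the role of the untrue point described above, and the opposite pairing gives the other untrue point.

```python
import numpy as np

def intersect(o1, d1, o2, d2):
    """Crossing point of the 2-D lines o1 + s*d1 and o2 + t*d2."""
    s, _ = np.linalg.solve(np.column_stack([d1, -d2]), o2 - o1)
    return o1 + s * d1

left = np.array([-0.05, 0.0])     # left eye
right = np.array([0.05, 0.0])     # right eye
objects = {"A": np.array([-0.02, 1.0]), "B": np.array([0.02, 1.0])}

# Key (x, y): the left eye looks toward object x, the right eye toward object y.
candidates = {}
for name_l, pos_l in objects.items():
    for name_r, pos_r in objects.items():
        candidates[(name_l, name_r)] = intersect(left, pos_l - left,
                                                 right, pos_r - right)

for pair, point in candidates.items():
    kind = "true" if pair[0] == pair[1] else "untrue"
    print(pair, point, kind)
```

Two objects already produce four crossing points, of which only the two with matching labels are real; without a way to establish the correspondence, the measurement cannot tell them apart.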
Since the capability of recognizing a shape of an object is developed in a human cerebrum, a human being can easily eliminate the untrue points. However, in the conventional technique of the "binocular stereopsis", it is difficult to precisely recognize the corresponding points, and therefore the development of the technique of precisely recognizing the corresponding points is required.