1. Field of the Invention
The present invention relates to a three-dimensional information acquisition apparatus capable of acquiring a depth value by projecting a pattern onto an object and obtaining the correspondence between a light receiving pattern and a light projection pattern in order to measure a three-dimensional shape, to a projection pattern used in such three-dimensional information acquisition, and to a three-dimensional information acquisition method.
2. Description of the Related Art
Conventional methods for measuring the shape of a three-dimensional object fall roughly into two types. One is based on measurement of a light propagation time, and the other utilizes the principle of triangulation. The former has no blind spot and is an ideal method in principle. At present, however, it has problems in measurement time and accuracy, so the latter, triangulation-based method is mainly used.
As the method utilizing triangulation, there are an active method and a passive stereo method.
The passive stereo method matches features between the images obtained from two cameras provided at different positions, and obtains the distance to an object by the principle of triangulation from the previously measured positional relationship between the two cameras. This method has the drawbacks that matching features between the images is difficult and that the shape of an object having no texture cannot be obtained.
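The triangulation step of the passive stereo method can be illustrated with a minimal sketch. It assumes a rectified camera pair (parallel optical axes, horizontal baseline), which the text above does not specify; under that assumption the depth is Z = f * B / d. The function name and its parameters are hypothetical and chosen only for illustration.

```python
def stereo_depth(focal_length_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth of a matched feature for a rectified stereo pair: Z = f * B / d.

    focal_length_px: focal length in pixels (assumed equal for both cameras)
    baseline_m:      distance between the two camera centres, in metres
    disparity_px:    horizontal shift of the matched feature between images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to yield a finite depth")
    return focal_length_px * baseline_m / disparity_px


# A feature shifted by 20 px, seen by cameras 0.125 m apart with f = 800 px.
print(stereo_depth(800.0, 0.125, 20.0))  # 5.0 (metres)
```

The sketch presupposes that the feature has already been matched between the two images, which is exactly the step the text identifies as difficult.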
On the other hand, as the active method utilizing the principle of triangulation, there is a light projection method which measures a shape by substituting a light source for one of the two cameras and observing an image of the light source on a surface of an object by the other camera which is set as a view point. This light projection method can be further classified into a spot light projection method, a slit light projection method and a pattern light projection method.
In the spot light projection method, only one point of an object can be measured by one input of an image.
In the slit light projection method, one line of an object can be measured by one input of an image. To measure the shape of the entire object, however, input of an image must be repeated many times while deflecting the projected light, which takes time.
The pattern light projection method projects a two-dimensional pattern such as a stripe pattern or a grating pattern onto an object, and has the merit that measurement can be carried out in a short time since the number of times the pattern projection image must be input is small.
This pattern light projection method is also called spatial coding, and this is further classified into pattern form coding and gradation coding.
As the former pattern form coding, a method based on the distribution of slit opening widths and a method based on an M-sequence code have been proposed, but both have problems in measurement density and measurement stability, and they are said to be of little practical use.
On the other hand, as the latter gradation coding, there is one based on light and shade and one based on colors.
First, as to coding based on light and shade, a time-series spatial coding method using a binary pattern is well known, in which projection is repeated while the lighting pitch of the pattern is varied by a factor of two each time. This method has excellent characteristics. For example, a coding error due to displacement or noise can be suppressed to ±1. To obtain the same resolution as N slit light rays, projecting binary patterns log2 N times suffices. For example, the same resolution as 128 slit images can be realized by projecting seven binary patterns.
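The relationship between N and the number of projections can be illustrated with a minimal sketch. It assumes a reflected binary (Gray) code, a common choice for obtaining the ±1 error property, although the text above does not name the code actually used. All function names are hypothetical.

```python
def num_patterns(n_stripes: int) -> int:
    """Number of binary patterns needed to distinguish n_stripes positions."""
    return (n_stripes - 1).bit_length()  # equals ceil(log2(n_stripes))


def gray_encode(index: int) -> int:
    """Reflected binary code: adjacent indices differ in exactly one bit."""
    return index ^ (index >> 1)


def gray_decode(code: int) -> int:
    """Invert gray_encode by XOR-ing all right shifts of the code."""
    index = 0
    while code:
        index ^= code
        code >>= 1
    return index


def make_patterns(n_stripes: int) -> list[list[int]]:
    """patterns[k][x] is 1 (lit) or 0 (dark) for stripe x in projection k.

    Projection 0 carries the most significant bit (coarsest pitch).
    """
    bits = num_patterns(n_stripes)
    return [[(gray_encode(x) >> (bits - 1 - k)) & 1 for x in range(n_stripes)]
            for k in range(bits)]


def decode_stripe(observed_bits: list[int]) -> int:
    """Recover the stripe index from the bits seen at one pixel over time."""
    code = 0
    for bit in observed_bits:
        code = (code << 1) | bit
    return gray_decode(code)


patterns = make_patterns(128)
print(len(patterns))                                # 7
print(decode_stripe([p[100] for p in patterns]))    # 100
```

Because neighbouring stripes differ in a single bit under this assumed code, a misread at a stripe boundary shifts the decoded index by at most one position.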
In the time-series coding using the binary pattern, however, sufficient measurement cannot be performed when the camera is not fixed on a tripod but held in the hand, or when the object, such as a human being or an animal, cannot stay still. This is because, if the number of projections is large, shooting cannot be finished within a time short enough to tolerate camera shake or movement of the object. Further, there are also restrictions on flash charging time and on the quantity of light emission, and the number of projections that satisfies these restrictions is not enough.
Therefore, in order to reduce the number of pattern projections, changing the binary pattern to a multi-value pattern can be considered. For example, this is proposed in Japan Society for Precision Engineering Journal, vol. 62, No. 6, pp. 830–834, 1996. In this proposal, for each pixel, the difference in brightness between an image taken with flash and an image taken without flash is divided by the number of gradations, and the gradation level is judged based on which section a brightness value at the time of creation is included in.
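The section-based judgment can be illustrated for a single pixel with a minimal sketch. It assumes that the brightness to be judged is the one observed while the multi-value pattern is projected, and that the sections are of equal width; neither detail is confirmed by the text above. The function name and parameters are hypothetical.

```python
def decode_gradation(pattern: float, no_flash: float, flash: float, levels: int) -> int:
    """Return the gradation level (0 .. levels-1) for one pixel.

    pattern:  brightness observed under the multi-value pattern (assumed)
    no_flash: brightness of the same pixel in the image taken without flash
    flash:    brightness of the same pixel in the image taken with flash
    levels:   number of gradations in the projected pattern
    """
    span = flash - no_flash
    if span <= 0:
        raise ValueError("flash image must be brighter than no-flash image")
    section = span / levels                      # width of one section
    level = int((pattern - no_flash) / section)  # which section it falls in
    return max(0, min(levels - 1, level))        # clamp to a valid level


# Wide span: sections are 50 brightness units wide.
print(decode_gradation(130.0, 20.0, 220.0, 4))  # 2
# Narrow span: sections are only 2 units wide.
print(decode_gradation(24.0, 20.0, 28.0, 4))    # 2
print(decode_gradation(26.0, 20.0, 28.0, 4))    # 3
```

In the narrow-span case a change of two brightness units moves the pixel into the neighbouring section, which shows how sensitive the judgment becomes when the difference in reflection brightness is small.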
Therefore, when there is non-linearity such as a gamma characteristic in an acquired image, a decoding error is apt to occur in an area where a difference in reflection brightness is small, and stable measurement under regular illumination is difficult. Further, the measurement result is affected by, e.g., the color of the surface of the object. In order to solve these problems, reference projection or the like is proposed in the above cited reference, but this is not realistic because the number of times of projection is increased. As described above, coding using the multi-gradation of brightness is easily affected by fluctuations in brightness, and hence setting an appropriate threshold value is difficult.
On the other hand, for example, Institute of Electronics, Information and Communication Engineers Journal, vol. J61-D, No. 6, pp. 411–418, 1978 proposes coding using colors. FIG. 1 shows an example of a stripe pattern coded by using colors of R, G and B. It is designed to reduce mixed color resulting from diffusion of light by providing an area of black, which is on a minimum brightness level, between R, G and B.
The flow of processing in the conventional color pattern projection method is as follows.
First, a coded color pattern such as that shown in FIG. 1 is projected onto an object, and image data (Pr, Pg, Pb) are obtained by shooting the projected pattern.
Subsequently, a stripe structure (local maximum and local minimum positions of Pr, Pg and Pb) is detected from the image data. Then, for each stripe, the color with which the stripe was emitted is identified from whichever of Pr, Pg and Pb is the maximum component.
Thereafter, the code sequence (alignment of colors) of the received stripes is written, and each code number of the received code sequence is matched with a code number of the code sequence used when emitting the pattern, taking into account the arrangement relationship within the code sequence. As a result, the pattern emission position and the reflected light reception position can be uniquely associated with each other, and the principle of triangulation is applied to this association, thereby calculating a depth value.
Then, a three-dimensional image is generated from the obtained depth value and the captured two-dimensional image information.
The conventional color pattern projection method carries out the above-described processing.
In the prior art color pattern projection method, however, it is particularly difficult to measure the three-dimensional shape of an object which has a high-saturation color. This will now be described with reference to FIG. 1.
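The color identification and code sequence matching steps of the flow above can be sketched as follows. The sketch assumes that stripe detection has already produced one (Pr, Pg, Pb) sample per stripe peak, and that a contiguous run of stripes is observed; it omits stripe detection and the triangulation itself. The function names and the example sequences are hypothetical.

```python
def classify_stripe(pr: float, pg: float, pb: float) -> str:
    """Identify the emitted color from the maximum of Pr, Pg and Pb."""
    return max((("R", pr), ("G", pg), ("B", pb)), key=lambda c: c[1])[0]


def match_sequence(received: str, projected: str) -> list[int]:
    """Map each received stripe to its index in the projected code sequence.

    The association is accepted only if the received run of colors occurs
    at exactly one position in the projected sequence.
    """
    n = len(received)
    starts = [i for i in range(len(projected) - n + 1)
              if projected[i:i + n] == received]
    if len(starts) != 1:
        raise ValueError("received code sequence is missing or ambiguous")
    return list(range(starts[0], starts[0] + n))


projected = "RGBGRBRG"                       # hypothetical emitted sequence
peaks = [(200, 30, 25), (40, 35, 180), (210, 20, 30)]
received = "".join(classify_stripe(*p) for p in peaks)
print(received)                              # RBR
print(match_sequence(received, projected))   # [4, 5, 6]
```

Each returned index identifies the projector stripe that produced the corresponding received stripe, which is the association to which triangulation would then be applied.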
For example, for an object whose surface reflectivity characteristic is such that the reflection signal of the R component is strong (that is, a high-saturation red object such as an apple), one local maximum signal can be obtained in the brightness value of the R component, but the brightness values of the G component and the B component are almost at the noise level. Under such circumstances, the stripe structures of the G component and the B component cannot even be extracted, and the stripes cannot be identified. In other words, decoding can be performed successfully on a surface which has a white color or a low-saturation color, but judgment based on colors obtained from combinations of R, G and B becomes difficult for an object whose surface reflectivity characteristic leaves the reflection signal of even one of the R, G and B components at the noise level, close to zero.
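The effect of a high-saturation surface can be illustrated numerically with a minimal sketch. It assumes a simple model in which each reflected component is the projected intensity multiplied by a per-channel surface reflectivity, and a component is usable only if it exceeds a fixed noise level. The model, the reflectivity values and the function name are all hypothetical.

```python
def detectable_channels(projected, reflectivity, noise_level):
    """Return the channels whose reflected signal rises above the noise level.

    projected:    (R, G, B) intensity of the projected stripe
    reflectivity: (R, G, B) surface reflectivity, each in 0..1 (assumed model)
    noise_level:  brightness below which a signal cannot be extracted
    """
    signal = [p * r for p, r in zip(projected, reflectivity)]
    return [name for name, s in zip("RGB", signal) if s > noise_level]


red_apple = (0.9, 0.02, 0.02)    # hypothetical high-saturation red surface
white_card = (0.8, 0.8, 0.8)     # hypothetical white surface

print(detectable_channels((255, 0, 0), red_apple, 8.0))   # ['R']
print(detectable_channels((0, 255, 0), red_apple, 8.0))   # []
print(detectable_channels((0, 255, 0), white_card, 8.0))  # ['G']
```

Under these assumed values, a green stripe projected onto the red surface produces no component above the noise level, so its stripe structure cannot be extracted, whereas the same stripe on the white surface is detected.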
Incidentally, although the above has described the example of the stripe pattern, the same is true of a grating pattern or the like.