1. Field of the Invention
The present invention relates to measurement of distance from a subject having an image feature, which has been extracted via image recognition, to an image sensing device.
2. Description of the Related Art
Many methods of measuring the distance from a subject to an image sensing device have been proposed heretofore. Examples of such measurement methods according to the prior art will now be described. One such method, which is used in autofocus cameras and the like, forms an image on an element such as a distance-measuring element using a two-eye lens or the like in the optical system. Further, with the depth-from-focus method, the focus is moved continuously and the distance at which the image on the observation screen is sharpest is adopted as the estimated distance. With the depth-from-defocus method, on the other hand, the extent of image defocus is analyzed and the estimated distance is found from the relationship between the amount of defocus and distance.
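The depth-from-focus procedure described above can be sketched as follows. This is a minimal illustration only, not part of any claimed invention; the Laplacian-variance sharpness measure and the synthetic image stack are assumptions chosen for brevity:

```python
import numpy as np

def sharpness(img):
    # Variance of a discrete Laplacian serves as a simple focus measure:
    # a sharply focused image has more high-frequency content.
    lap = (-4 * img
           + np.roll(img, 1, 0) + np.roll(img, -1, 0)
           + np.roll(img, 1, 1) + np.roll(img, -1, 1))
    return lap.var()

def depth_from_focus(image_stack, focus_distances):
    # Sweep the focus positions and report the distance whose image is sharpest.
    scores = [sharpness(img) for img in image_stack]
    return focus_distances[int(np.argmax(scores))]

# Synthetic stack: the image captured with focus at 2.0 m retains the most detail.
rng = np.random.default_rng(0)
detail = rng.standard_normal((64, 64))
stack = [0.1 * detail, 1.0 * detail, 0.3 * detail]  # blurred, sharp, blurred
print(depth_from_focus(stack, [1.0, 2.0, 3.0]))  # → 2.0
```

As the prose notes, such a sweep requires one captured image per focus position, which is why acquiring a full distance image by this method takes time.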
A ray-tracing method using a microlens array or the like, illustrated in Non-Patent Document 1, finds the estimated distance by analyzing the angle information of the captured light rays.
In the field of image recognition in particular, a technique using multiple cameras, as in the distance measurement methods set forth above, is often employed as a method of acquiring three-dimensional depth information. According to such a method, a luminance-gradient feature such as a salient point or edge is acquired using an affine-invariant image feature such as Harris-Affine or SIFT. Because such a luminance-gradient feature is made independent of point-of-view position, correspondences between the image features seen by the multiple cameras can be established, and the distance between the image sensing device and the subject of image capture is found by the principle of triangulation.
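For a rectified stereo pair, the triangulation principle mentioned above reduces to the well-known relation Z = f·B/d, where f is the focal length in pixels, B the baseline between the cameras, and d the disparity between corresponding features. The following sketch is illustrative only; the numeric values are assumptions:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    # Rectified stereo triangulation: Z = f * B / d.
    # A matched feature pair must have positive disparity to yield a finite depth.
    if disparity_px <= 0:
        raise ValueError("matched features must have positive disparity")
    return focal_px * baseline_m / disparity_px

# Illustrative values: 800 px focal length, 10 cm baseline, 20 px disparity.
print(depth_from_disparity(800.0, 0.10, 20.0))  # → 4.0 (meters)
```

The feature matching itself (e.g. via SIFT descriptors) supplies the disparity d; the formula above then converts each matched pair into a depth value.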
[Patent Document 1] Japanese Patent No. 2963990
[Non-Patent Document 1] Ren Ng, Marc Levoy, Mathieu Bredif, Gene Duval, Mark Horowitz, Pat Hanrahan, "Light Field Photography with a Hand-held Plenoptic Camera," Stanford University / Duval Design, SIGGRAPH 2005
However, these methods have a number of problems, described below.
First, a problem with distance measurement using multiple cameras is the increase in cost that accompanies the increase in the number of image sensing devices. For products sold in large numbers, such as surveillance cameras and handycams, requiring that multiple cameras be installed is a major commercial disadvantage.
Further, using special-purpose hardware results in equipment of larger size. In the case of multiple cameras, the baseline length is directly linked to measurement accuracy, and therefore a certain baseline length is required. Such an increase in the size of the equipment is itself a disadvantage.
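The link between baseline length and accuracy follows from differentiating Z = f·B/d: a disparity error of Δd produces a depth error of approximately |ΔZ| ≈ Z²·Δd/(f·B), so halving the baseline doubles the depth uncertainty at a given range. The sketch below is illustrative only and the numbers are assumptions:

```python
def depth_error(depth_m, focal_px, baseline_m, disparity_err_px=1.0):
    # Differentiating Z = f * B / d with respect to d gives
    # |dZ| ≈ Z**2 / (f * B) * |dd|.
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px

# Halving the baseline doubles the depth uncertainty at a given range.
print(depth_error(4.0, 800.0, 0.10))  # → 0.2 (meters)
print(depth_error(4.0, 800.0, 0.05))  # → 0.4 (meters)
```

This quadratic growth of the error with range is why a stereo rig cannot simply be miniaturized without sacrificing measurement accuracy.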
Another problem is the increase in weight that accompanies the increase in size. In order to implement this method, at least two cameras are required. Consequently, in cases where it is desired to hold down cost per unit, as in surveillance and security applications, or at installation locations where weight must be kept low, such as the end of a robot arm, or in handycams and digital cameras, the increase in weight due to multiple units is a major problem.
Accordingly, a small-size, light-weight, inexpensive three-dimensional depth-measuring method using monocular vision has been studied. However, the conventional method that relies upon monocular vision involves the problems set forth below.
First, the phase-difference method used in autofocus cameras and the like requires a distance-measuring element and a distance-measuring optical system, etc., in addition to the CMOS sensor used for image capture. Further, with the phase-difference method, distance can be measured at only several points to several tens of points on the observed image. As a consequence, it is difficult to obtain a distance image.
A lens focal-point (depth-from-focus) method requires that the focus be moved and is therefore accompanied by mechanical drive of a focusing lens. Acquiring a distance image, therefore, takes time. Further, a defocus-analysis (depth-from-defocus) method uses the relationship between image formation and the blur produced by a telecentric optical system. This means that there is little freedom in terms of lens design. A ray-tracing method using a microlens array or the like suffers a decline in the spatial resolution of the in-focus image to the extent that angle information of the captured light is acquired. Although both a distance image and an in-focus image are obtained by the method using a patterned diaphragm or the like described in Patent Document 1, this method uses a telecentric optical system and is implemented by a diaphragm having a pin-hole aperture. Consequently, a decline in the amount of light is a problem.
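The defocus-to-distance relationship exploited by defocus-analysis methods can be illustrated with ordinary thin-lens geometry. This is a generic sketch under an ideal thin-lens assumption, not the telecentric system of Patent Document 1, and all numeric values are assumptions:

```python
def blur_diameter(focal_m, aperture_m, focus_dist_m, obj_dist_m):
    # Thin-lens equation 1/f = 1/u + 1/v gives the image distance
    # for the focused plane and for the actual object.
    v_focus = 1.0 / (1.0 / focal_m - 1.0 / focus_dist_m)
    v_obj = 1.0 / (1.0 / focal_m - 1.0 / obj_dist_m)
    # Geometric blur-circle diameter on the sensor plane: the farther the
    # object's image plane is from the sensor, the larger the blur.
    return aperture_m * abs(v_focus - v_obj) / v_obj

# 50 mm lens at f/2 (25 mm aperture), focused at 2 m:
print(blur_diameter(0.05, 0.025, 2.0, 2.0))  # object in focus: blur ~ 0
print(blur_diameter(0.05, 0.025, 2.0, 4.0))  # object behind focus: blur > 0
```

Because the blur diameter grows monotonically with the object's displacement from the focused plane, measuring the blur allows the distance to be inverted, which is the principle behind depth-from-defocus.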