Field of the Invention
The present invention relates to an image processing apparatus configured to estimate the three-dimensional position of an object from images captured by a plurality of cameras, and a method therefor.
Description of the Related Art
Methods have been proposed for estimating the three-dimensional position of an object by capturing images of the object with a plurality of cameras having overlapping fields of view in a monitoring camera system. In these methods, images of a subject are captured by cameras located in known positions, and the three-dimensional position of the subject is estimated from the positions of the subject on the camera images by use of the stereoscopic principle. In this estimation, a situation may arise in which a false three-dimensional position is estimated as a virtual image. Hereinafter, a “position in which a subject actually does not exist” is sometimes referred to as a “false three-dimensional position,” and a situation in which “it is estimated that a subject exists in the false three-dimensional position” is sometimes referred to as a situation in which “a false three-dimensional position is estimated as a virtual image.”
For example, there may be a case where a camera 1 captures a human body B while a camera 2 captures human bodies A and B as illustrated in FIG. 2. In this case, while the correct position of the human body B is estimated, an intersection point of a straight line connecting an optical center C1 of the camera 1 to the human body B and a straight line connecting an optical center C2 of the camera 2 to the human body A is estimated as the position of a virtual image V.
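The geometry described above can be illustrated with a minimal sketch in Python. The sketch is not taken from the cited related art; it assumes a simplified two-dimensional top-down view, and the coordinates of the optical centers C1 and C2 and of the human bodies A and B, as well as the helper names `ray_intersection` and `direction`, are hypothetical values and names chosen only for illustration. Each detection is treated as a ray from an optical center, and every pairing of a ray from the camera 1 with a ray from the camera 2 yields one candidate position.

```python
# Illustrative 2D (top-down) sketch; all coordinates are hypothetical.
C1 = (0.0, 0.0)   # optical center of camera 1 (assumed)
C2 = (8.0, 0.0)   # optical center of camera 2 (assumed)
A = (5.0, 1.0)    # human body A (assumed position)
B = (4.0, 4.0)    # human body B (assumed position)


def direction(origin, target):
    """Direction of the ray from an optical center toward a detected body."""
    return (target[0] - origin[0], target[1] - origin[1])


def ray_intersection(o1, d1, o2, d2, eps=1e-12):
    """Return the point where two 2D rays meet, or None if they do not."""
    denom = d1[0] * d2[1] - d1[1] * d2[0]  # cross product d1 x d2
    if abs(denom) < eps:
        return None  # parallel rays never intersect
    dx, dy = o2[0] - o1[0], o2[1] - o1[1]
    t = (dx * d2[1] - dy * d2[0]) / denom  # parameter along ray 1
    s = (dx * d1[1] - dy * d1[0]) / denom  # parameter along ray 2
    if t <= 0 or s <= 0:
        return None  # intersection lies behind one of the cameras
    return (o1[0] + t * d1[0], o1[1] + t * d1[1])


# Camera 1 captures only B; camera 2 captures both A and B.
cam1_detections = {"B": B}
cam2_detections = {"A": A, "B": B}

candidates = {}
for name1, p1 in cam1_detections.items():
    for name2, p2 in cam2_detections.items():
        point = ray_intersection(C1, direction(C1, p1), C2, direction(C2, p2))
        if point is not None:
            candidates[(name1, name2)] = point

for pair, point in candidates.items():
    label = "correct position" if pair[0] == pair[1] else "virtual image V"
    print(pair, point, label)
```

In this sketch, pairing the two rays toward the human body B recovers the position of B, while pairing the ray from C1 toward B with the ray from C2 toward A yields a second intersection point at which no subject exists. The geometry alone does not indicate which of the two candidates is the virtual image.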
A solution to the foregoing situation is discussed in Japanese Patent Application Laid-Open No. 2010-063001. Specifically, three-dimensional movement trajectories of human bodies are acquired, and fragments of the three-dimensional movement trajectories are combined to calculate complete movement trajectories of the respective human bodies. By combining the trajectories over a predetermined length of time, a virtual image is eliminated. Hereinafter, proving that a subject does not exist in a false three-dimensional position is sometimes referred to as “eliminating a virtual image.”
Further, a virtual image occurs due to ambiguous association of human bodies between cameras. Thus, in stereoscopy, a method is often employed in which image information such as the colors and textures of object areas is compared between cameras to determine whether the object areas correspond to the same object, thereby eliminating a virtual image (for example, refer to Hiroshi Hattori, “Stereo Vision Technology for Automotive Applications,” Journal of Society of Automotive Engineers of Japan, 63(2), 89-92, 2009-Feb. 2001 (hereinafter, “Hattori”)).
In the method discussed in Japanese Patent Application Laid-Open No. 2010-063001, a large number of frames are used to eliminate a virtual image. Thus, the method is not applicable to a situation in which promptness is required. Further, since the method discussed in Hattori uses image information such as the colors and textures of object areas to distinguish between objects, the method requires a large amount of calculation in image processing and a large image transfer bandwidth. Moreover, the appearance and color of an object may differ depending on the direction and distance of image capturing, in which case it is difficult to identify objects as the same object based on images captured from different directions.