As a method of detecting the position of a target object in a captured image and estimating the three-dimensional position of the target object based on the detected position, a technique using a particle filter is known. In the particle filter, a tracking target object is expressed as a discrete probability density by a provisional group of multiple hypotheses (particles), each having a state quantity and a likelihood. The provisional group is then propagated using a state transition model, so that a tracking process is performed in which the effects of motion variation and noise are suppressed.
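The propagation and weighting steps described above can be illustrated with a minimal sketch. It is not taken from any cited disclosure: the one-dimensional state (position, velocity), the constant-velocity transition model, the Gaussian likelihood, and all function names and noise values are assumptions chosen only for illustration.

```python
import math
import random

def predict(particles, dt, noise_std, rng):
    """Propagate each (position, velocity) state with an assumed
    constant-velocity transition model plus Gaussian noise."""
    return [(x + v * dt + rng.gauss(0.0, noise_std),
             v + rng.gauss(0.0, noise_std))
            for x, v in particles]

def likelihoods(particles, measurement, meas_std):
    """Gaussian likelihood of the measured position for each particle,
    normalised so that the weights sum to 1."""
    w = [math.exp(-0.5 * ((measurement - x) / meas_std) ** 2)
         for x, _ in particles]
    total = sum(w)
    if total == 0.0:
        # Every particle is far from the measurement: fall back to uniform.
        return [1.0 / len(particles)] * len(particles)
    return [wi / total for wi in w]

def estimate(particles, weights):
    """Weighted mean position of the discrete probability density."""
    return sum(w * x for (x, _), w in zip(particles, weights))

def resample(particles, weights, rng):
    """Draw a new particle set in proportion to the likelihoods."""
    return rng.choices(particles, weights=weights, k=len(particles))

def track(measurements, n=500, seed=0):
    """Run predict / weight / estimate / resample once per measurement."""
    rng = random.Random(seed)
    particles = [(rng.uniform(-5.0, 5.0), rng.uniform(-2.0, 2.0))
                 for _ in range(n)]
    estimates = []
    for z in measurements:
        particles = predict(particles, 1.0, 0.2, rng)
        weights = likelihoods(particles, z, 0.5)
        estimates.append(estimate(particles, weights))
        particles = resample(particles, weights, rng)
    return estimates
```

Calling `track([0.0, 1.0, 2.0, 3.0, 4.0, 5.0])` returns one position estimate per measurement. Resampling after each update is one common design choice; it concentrates particles in high-likelihood regions at the cost of some diversity.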
In one method using the particle filter, the three-dimensional coordinates of a local feature are calculated based on a stereo image, and three-dimensional coordinate sample points forming a provisional group are set in the vicinity of those three-dimensional coordinates. Then, the two-dimensional coordinate sample points acquired by projecting the three-dimensional coordinate sample points onto the stereo image are evaluated as the provisional group, whereby the three-dimensional position of the local feature is estimated.
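The sampling and projection steps of this method can be sketched as follows. The sketch assumes a rectified stereo pair with a pinhole camera model; the focal length `f`, principal point `(cx, cy)`, `baseline`, the Gaussian spread `sigma`, and the caller-supplied `score` function are all hypothetical and do not come from the disclosure.

```python
import random

def triangulate(ul, vl, ur, f, cx, cy, baseline):
    """3D point (left-camera frame) from a left pixel (ul, vl) and the
    matching right-image column ur, assuming a rectified stereo pair."""
    disparity = ul - ur
    if disparity <= 0.0:
        raise ValueError("disparity must be positive")
    z = f * baseline / disparity
    return ((ul - cx) * z / f, (vl - cy) * z / f, z)

def project(point, f, cx, cy, baseline):
    """Project a 3D point onto the left and right images."""
    x, y, z = point
    left = (f * x / z + cx, f * y / z + cy)
    right = (f * (x - baseline) / z + cx, f * y / z + cy)
    return left, right

def sample_around(center, sigma, n, rng):
    """Set n 3D sample points in the vicinity of the triangulated point.
    sigma should be small relative to the depth so that z stays positive."""
    return [tuple(c + rng.gauss(0.0, sigma) for c in center)
            for _ in range(n)]

def estimate_position(samples, score, f, cx, cy, baseline):
    """Weight each 3D sample by score(left_px, right_px) evaluated on its
    2D projections, then return the weighted mean 3D position."""
    weights = [score(*project(p, f, cx, cy, baseline)) for p in samples]
    total = sum(weights)
    if total <= 0.0:
        raise ValueError("all sample points scored zero")
    return tuple(sum(w * p[i] for w, p in zip(weights, samples)) / total
                 for i in range(3))
```

In practice `score` would compare image content around the two projected pixels with the local feature; here it is left as a parameter because the source does not specify the evaluation.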
In addition, a method is disclosed in which the head of a person is modeled as an ellipsoid of a predetermined size, using a particle filter that generates three-dimensional sample points at three-dimensional positions as a provisional group. In that disclosure, the size acquired by projecting the ellipsoid onto each captured image is set as the size of a search window, and a likelihood representing the probability that the head of the person exists within the search window is calculated as a provisional likelihood. The three-dimensional position of the head of the person is then estimated based on the provisional likelihood. Furthermore, a search method using a search window is known.
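How a model of predetermined size yields a search window can be shown with a rough sketch. It approximates the projected ellipsoid by scaling its horizontal and vertical semi-axes by the focal length over the depth, which holds only approximately and only near the optical axis; the pinhole parameters and the function name `search_window` are assumptions, not part of the disclosure.

```python
def search_window(center, semi_axes, f, cx, cy):
    """Approximate image-plane search window for an ellipsoid model.

    center    -- (x, y, z) of the ellipsoid centre in the camera frame
    semi_axes -- (a, b): horizontal and vertical semi-axes, same unit as center
    f, cx, cy -- assumed pinhole focal length (pixels) and principal point

    Returns (left, top, width, height) in pixels.
    """
    x, y, z = center
    if z <= 0.0:
        raise ValueError("the model must lie in front of the camera")
    a, b = semi_axes
    u = f * x / z + cx          # projected centre, column
    v = f * y / z + cy          # projected centre, row
    half_w = f * a / z          # projected half-width shrinks with depth
    half_h = f * b / z
    return (u - half_w, v - half_h, 2.0 * half_w, 2.0 * half_h)
```

Because the semi-axes are fixed in advance, every sample point at the same depth receives the same window size regardless of the actual size of the observed head.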
However, with the method that uses only the particle filter, it is difficult to acquire the size of a search window, and it is therefore difficult to apply a search method that uses a search window. In addition, with the method that uses an ellipsoid model, the size of the search window is determined based on a specific person, so it is difficult to accommodate individual differences among target objects.