Position estimation devices are known that estimate the three-dimensional position of a person in real space on the basis of images of the person captured with a visible-light camera and on the basis of a distance to the person estimated with a distance sensor.
Among such position estimation devices, some estimate the three-dimensional position of a person by detecting the face area of the person in the captured images and by measuring the distance to the person with a distance sensor that senses the direction corresponding to the face area.
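The scheme described above can be sketched as a back-projection under a pinhole camera model: the pixel position of the face area fixes a viewing ray, and the sensed distance fixes how far along that ray the person is. The function name, parameter names, and the use of NumPy below are illustrative assumptions, not details from the source.

```python
import numpy as np

def backproject(u, v, distance, fx, fy, cx, cy):
    """Back-project a pixel (u, v) and a measured range to a 3D point.

    Assumes a pinhole camera with focal lengths (fx, fy) in pixels and
    principal point (cx, cy); 'distance' is the hypothetical distance
    sensor's range reading along the ray toward the face area.
    Returns the 3D point in camera coordinates (metres if 'distance'
    is in metres).
    """
    # Direction of the viewing ray through the pixel, in camera coordinates
    ray = np.array([(u - cx) / fx, (v - cy) / fy, 1.0])
    ray /= np.linalg.norm(ray)  # normalize so 'distance' scales a unit ray
    # The person lies at the sensed range along this ray
    return distance * ray

# A face detected at the principal point lies straight ahead:
# backproject(320, 240, 2.0, 500.0, 500.0, 320.0, 240.0) -> [0.0, 0.0, 2.0]
```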
However, the measurable range of a distance sensor is limited. Therefore, depending on the position of the person, it may not be possible to accurately estimate the three-dimensional position of that person.
Alternatively, some position estimation devices estimate the three-dimensional position of a person in real space on the basis of images of the person captured with two visible-light cameras (i.e., captured with a stereo camera).
Such a position estimation device estimates the three-dimensional position of the person by triangulation, based on the position of the face area of the person captured in each of the two images.
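For a rectified stereo pair, the triangulation reduces to the standard disparity relation: depth equals focal length times baseline divided by the horizontal offset of the face between the two images. The following is a minimal sketch under that assumption; the function and parameter names are hypothetical.

```python
def triangulate_depth(u_left, u_right, focal_px, baseline_m):
    """Estimate depth from horizontal disparity in a rectified stereo pair.

    Assumes two identical, rectified cameras with focal length
    'focal_px' (pixels) separated by 'baseline_m' (metres).
    'u_left' and 'u_right' are the horizontal pixel coordinates of the
    same face feature in the left and right images.
    """
    disparity = u_left - u_right  # pixel shift of the face between views
    if disparity <= 0:
        # The face must appear in both images for triangulation to work;
        # this is exactly the failure mode when only one camera sees it.
        raise ValueError("feature not visible with positive disparity in both images")
    # Standard stereo relation: Z = f * B / d
    return focal_px * baseline_m / disparity

# A 50-pixel disparity with f = 500 px and a 0.1 m baseline:
# triangulate_depth(400, 350, 500.0, 0.1) -> 1.0 (metre)
```

Note that the `ValueError` branch mirrors the limitation discussed next: when the person is captured by only one camera, no disparity exists and the depth cannot be computed.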
However, depending on the position of the person, the person may be captured by only one of the two cameras. In that case, it may not be possible to accurately estimate the three-dimensional position of that person.