Conventionally, in self-position estimation, in which the position of a camera in real space is estimated on the basis of an image photographed by the camera, the self-position of the camera has been estimated by using an index included in the image, such as a gradient or a feature point (e.g., see Patent Literature 1).
Hence, in a case where there is no feature point or the like in the photographed image, such as when the subject is a white wall, it is in principle impossible to continue estimating the self-position of the camera.
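The failure on a featureless subject can be illustrated with a minimal sketch. It is not the detector of Patent Literature 1; it merely counts pixels whose intensity gradient exceeds a threshold, as a stand-in for any gradient-based index. The function name `count_high_gradient_pixels`, the threshold value, and the two test images are assumptions introduced for illustration only.

```python
def count_high_gradient_pixels(image, threshold=10.0):
    """Count interior pixels whose central-difference gradient magnitude
    exceeds `threshold`. `image` is a list of rows of gray levels."""
    rows, cols = len(image), len(image[0])
    count = 0
    for y in range(1, rows - 1):
        for x in range(1, cols - 1):
            gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
            gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
            if (gx * gx + gy * gy) ** 0.5 > threshold:
                count += 1
    return count


# Hypothetical 8x8 images: a uniform white wall and a 2x2-block checkerboard.
white_wall = [[255] * 8 for _ in range(8)]
textured = [
    [255 if (x // 2 + y // 2) % 2 == 0 else 0 for x in range(8)]
    for y in range(8)
]

wall_count = count_high_gradient_pixels(white_wall)
textured_count = count_high_gradient_pixels(textured)
```

On the uniform image every gradient is zero, so no pixel qualifies as an index and there is nothing to track from frame to frame, whereas the textured image yields candidates.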
In practice, situations such as being surrounded by white walls on all sides are rare, and in many cases, even when there is no feature point in the forward visual field, there are enough feature points behind the camera, on the ceiling, etc. to perform self-position estimation.
However, in a case where, for example, a user moves around a room while wearing a camera used for self-position estimation, such as an Action Cam, and the room contains few subjects from which a feature point can be detected, the user needs to keep facing a subject from which a feature point can be detected. Thus, if a feature point is to be included in the visual field at all times, the movement of the user is restricted, and the degree of freedom is impaired.
On the other hand, in order for the user to be able to move freely while avoiding a situation where no feature point is detected from the image, self-position estimation could be performed using an all-round (omnidirectional) camera. In this case, a feature point or the like is detected from an image photographed by the all-round camera, and the self-position is estimated.
However, if an image of the surrounding environment is photographed using an all-round camera, the range in space in which a feature point can be detected is expanded, but the spatial resolution is reduced because the angle of view of the camera is widened; consequently, the self-position cannot be estimated with sufficient accuracy.
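The resolution trade-off can be made concrete with a rough calculation. The sketch below assumes that pixels are spread uniformly over the horizontal angle of view, which ignores lens distortion and the projection model; the function name `degrees_per_pixel` and the values of 90 degrees, 360 degrees, and 1920 pixels are illustrative assumptions, not figures from any particular camera.

```python
def degrees_per_pixel(horizontal_fov_deg: float, image_width_px: int) -> float:
    """Approximate angle covered by one pixel column, assuming the
    pixels are distributed uniformly across the angle of view."""
    if image_width_px <= 0:
        raise ValueError("image_width_px must be positive")
    return horizontal_fov_deg / image_width_px


# Same sensor width, two different angles of view (hypothetical values).
narrow = degrees_per_pixel(90.0, 1920)    # ordinary camera
wide = degrees_per_pixel(360.0, 1920)     # all-round camera
coarsening = wide / narrow                # how much coarser each pixel becomes
```

Under these assumptions each pixel of the all-round image covers four times the angle covered by a pixel of the 90-degree image, so the bearing to a feature point is measured correspondingly more coarsely.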
Thus, in order to make it possible to estimate the self-position with sufficient accuracy without restricting the movement of the user, a technology that performs self-position estimation using a plurality of cameras has been proposed. In this technology, a plurality of cameras are arranged so that together they function as one wide-angle camera.
If a plurality of cameras are used in this way, the area not observed by any of the cameras, that is, the dead angle (blind spot), can be reduced, and therefore a situation where no feature point is detected from the images can be avoided; thus, the self-position can be estimated without restricting the movement of the user. Further, since the surrounding environment is photographed by a plurality of cameras, each of which covers only part of it, the spatial resolution of the images is not reduced, and the self-position can be estimated with sufficient accuracy.
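The reduction of the dead angle can be sketched numerically. The following is a simplified illustration that considers only the horizontal plane and assumes that every camera has the same horizontal angle of view and is described by a single heading; the function names, the 90-degree angle of view, and the camera headings are hypothetical and do not describe any specific arrangement.

```python
def angular_difference(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two directions, in degrees."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def dead_angle_deg(headings_deg, fov_deg: float, step_deg: float = 1.0) -> float:
    """Estimate the total horizontal angle seen by no camera by sampling
    directions at the centre of each `step_deg`-wide bin."""
    bins = int(round(360.0 / step_deg))
    uncovered = 0
    for i in range(bins):
        direction = (i + 0.5) * step_deg
        seen = any(
            angular_difference(direction, h) <= fov_deg / 2.0
            for h in headings_deg
        )
        if not seen:
            uncovered += 1
    return uncovered * step_deg


single = dead_angle_deg([0.0], 90.0)                     # one camera
quad = dead_angle_deg([0.0, 90.0, 180.0, 270.0], 90.0)   # four cameras
```

With these assumed values, one camera leaves 270 degrees unobserved, while four cameras facing in orthogonal directions leave none, and each camera keeps its own 90-degree angle of view and hence its per-pixel resolution.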