Field
The present invention relates to a technology for estimating information on a sensing area from an image of a camera.
Related Art
In recent years, there has been a growing need to estimate a person's flow line, a floor surface, and the shape of a room or a passage based on the result of detecting the person in an image captured by a camera. This kind of technology is applied to structure recognition of a sensing area in a monitoring camera or an image sensor, and has also been incorporated in household electrical appliances.
For example, in the technology disclosed in Unexamined Japanese Patent Publication No. 2013-24534, whether a person is moving is detected using the camera image, the height of the person is estimated when the person moves, an obstacle in the room is detected from the movement history of the person, and the obstacle detection result is used in air conditioning control of an air conditioning apparatus (air conditioner). Unexamined Japanese Patent Publication No. 2002-197463 discloses a method for determining the stillness and movement, or the posture, of a person by monitoring changes in the coordinates of the top of the head across plural frame images. Unexamined Japanese Patent Publication No. 2013-37406 discloses a method for estimating a height using a detection result of a head in an image.
Two typical techniques for calculating the existence position of a person (the position in the real world) from a camera image are well known: (1) a technique of calculating the existence position by estimating a depth distance (the distance between the camera and the person) from the size of the person in the image, and (2) a technique of calculating the existence position through the triangulation principle by photographing the person from two directions using two cameras.
However, in technique (1), distance estimation accuracy is not stable. This is because the depth distance is estimated from the size of the person in the image, and the detected size varies greatly depending on image quality, resolution, the physique or body shape of the person, and the posture. On the other hand, technique (2) requires two cameras, so device cost is higher than that of a device using a monocular camera. In addition, the images of the two cameras are processed separately and the result is obtained by combining the pieces of information from both images, so the processing may become complicated.
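For illustration only, the two known techniques can be sketched as follows. This is a minimal sketch, not taken from the cited publications. It assumes an ideal pinhole camera for technique (1) and, for technique (2), a simplified rectified pair of parallel cameras rather than general two-direction triangulation. The function names, parameter names, and numeric values are all hypothetical.

```python
def depth_from_size(focal_px, assumed_height_m, pixel_height):
    """Technique (1), assumed pinhole model: Z = f * H / h.

    focal_px         -- focal length in pixels (hypothetical calibration value)
    assumed_height_m -- real height assumed for the person, in meters
    pixel_height     -- detected height of the person in the image, in pixels
    """
    if pixel_height <= 0:
        raise ValueError("pixel_height must be positive")
    return focal_px * assumed_height_m / pixel_height


def depth_from_stereo(focal_px, baseline_m, x_left_px, x_right_px):
    """Technique (2), assumed rectified stereo pair: Z = f * B / d.

    baseline_m -- distance between the two camera centers, in meters
    x_left_px  -- horizontal image coordinate of the person in the left image
    x_right_px -- horizontal image coordinate of the person in the right image
    """
    disparity = x_left_px - x_right_px
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


# Hypothetical values: focal length 1000 px, person standing 5 m away.
# A 1.7 m person appears 340 px tall, so the estimate matches the true depth.
z_match = depth_from_size(1000.0, 1.7, 340.0)

# A 1.5 m person at the same 5 m appears only 300 px tall. With the same
# assumed height of 1.7 m, the estimate drifts to about 5.67 m.
z_mismatch = depth_from_size(1000.0, 1.7, 300.0)

# Two cameras 0.1 m apart observing a disparity of 20 px give 5 m,
# independent of the person's body size.
z_stereo = depth_from_stereo(1000.0, 0.1, 520.0, 500.0)
```

Under these assumptions, the single-camera estimate depends directly on the detected pixel height and on the assumed real height, which is consistent with the instability noted for technique (1). The two-camera estimate avoids that dependence but needs a second camera and a correspondence between the two images.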