Estimating the posture of a person from image data of a captured moving image has been an active area of research in recent years. By estimating behavior from time-series information on the estimated posture, a posture estimation apparatus can determine the behavior of the person through computer analysis of the moving image, and can thus perform behavior analysis that does not depend on human effort. Examples of behavior analysis include detection of unexpected behavior on streets, analysis of purchasing behavior in stores, support for the enhancement of work efficiency in factories, and form coaching in sports.
It is desirable that this kind of posture estimation be performed without attaching an apparatus such as a direction sensor to the person. If posture estimation requires attaching an apparatus to the person, it is difficult to take an arbitrary person as an estimation target, and the cost increases when there are a large number of estimation targets.
Therefore, as posture estimation that can take an arbitrary person as a target, technology has been proposed that estimates the orientation of the body of a person on the basis of a video obtained by photographing the person, as disclosed, for example, in PTL 1.
The technology disclosed in PTL 1 (hereunder, referred to as “related art”) estimates candidates for a posture that can be assumed next (hereunder, referred to as “next candidate posture”) based on the posture that was estimated the previous time (hereunder, referred to as “previous estimation posture”). The related art compares the position of each part of each next candidate posture with the image of that part in a photographed image, and retrieves the candidate posture with the highest correlation.
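The candidate generation and selection described above can be illustrated by the following minimal sketch. It is not the implementation of PTL 1. The representation of a posture as a mapping from part name to joint angle, the function names, the 10-degree step, and the toy correlation function that stands in for matching against a photographed image are all assumptions introduced for illustration.

```python
from typing import Callable, Dict, List

# Hypothetical representation: part name -> joint angle in degrees.
Posture = Dict[str, float]


def next_candidate_postures(previous: Posture, step: float = 10.0) -> List[Posture]:
    """Enumerate postures reachable from the previous estimation posture.

    Assumption: a candidate either keeps the previous posture or moves
    exactly one part by +/- step degrees.
    """
    candidates = [dict(previous)]
    for part in previous:
        for delta in (-step, step):
            candidate = dict(previous)
            candidate[part] = previous[part] + delta
            candidates.append(candidate)
    return candidates


def select_posture(candidates: List[Posture],
                   correlation: Callable[[Posture], float]) -> Posture:
    """Return the candidate with the highest correlation score."""
    return max(candidates, key=correlation)


def make_toy_correlation(observed: Posture) -> Callable[[Posture], float]:
    """Stand-in for image matching: the score rises as angles approach `observed`."""
    def correlation(candidate: Posture) -> float:
        return -sum(abs(candidate[p] - observed[p]) for p in observed)
    return correlation


previous = {"right_arm": 0.0, "left_arm": 0.0}
observed = {"right_arm": 10.0, "left_arm": 0.0}
best = select_posture(next_candidate_postures(previous),
                      make_toy_correlation(observed))
```

In this sketch, the set of candidates is bounded by how far each part may move from the previous estimation posture, so a posture outside that range can never be selected, however well it would match the image.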
However, depending on the posture, a certain part of a person may be hidden by another part, so that a portion or all of that part cannot be recognized in the image (hereunder, such a state is referred to as “concealed”). If there is a part that is concealed in this manner (hereunder, referred to as “concealed part”), the outlines of different postures may resemble each other, which in some cases makes correct posture estimation impossible.
Therefore, according to the related art, the area (number of pixels) that each part occupies in the image is determined for the previous estimation posture, and a part whose area is less than or equal to a threshold is extracted as a concealed part. Further, if there is a concealed part in the previous estimation posture, the degree of freedom of the posture of the concealed part is set higher than that of a part that is not concealed, so that the degree of freedom of the next candidate postures is expanded and the number of candidate postures increases. Consequently, even in a case where the previous estimation posture was erroneous because the position of a concealed part is difficult to estimate (that is, the estimation accuracy is low), the related art can perform posture estimation on the premise that the next candidate postures include the correct posture.
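The extraction of concealed parts by an area threshold and the widening of their degree of freedom can be sketched as follows. This is an illustrative sketch under stated assumptions, not the method of PTL 1 itself. The function names, the modeling of the degree of freedom as a symmetric angle range around the previous angle, and the numeric values (a threshold of 50 pixels, ranges of 10 and 30 degrees) are hypothetical.

```python
from typing import Dict, Set, Tuple


def extract_concealed_parts(part_areas: Dict[str, int],
                            area_threshold: int) -> Set[str]:
    """Return the parts whose pixel area is less than or equal to the threshold."""
    return {part for part, area in part_areas.items() if area <= area_threshold}


def candidate_angle_ranges(previous_angles: Dict[str, float],
                           concealed: Set[str],
                           normal_range: float = 10.0,
                           widened_range: float = 30.0) -> Dict[str, Tuple[float, float]]:
    """Give concealed parts a wider search range than visible parts.

    Assumption: the degree of freedom is modeled as a symmetric angle
    range (in degrees) around the previously estimated angle.
    """
    ranges = {}
    for part, angle in previous_angles.items():
        half_width = widened_range if part in concealed else normal_range
        ranges[part] = (angle - half_width, angle + half_width)
    return ranges


# Hypothetical pixel areas and angles taken from the previous estimation posture.
part_areas = {"right_arm": 40, "left_arm": 900}
previous_angles = {"right_arm": 20.0, "left_arm": -5.0}

concealed = extract_concealed_parts(part_areas, area_threshold=50)
ranges = candidate_angle_ranges(previous_angles, concealed)
```

Here only the right arm falls at or below the threshold, so only its range is widened. The wider range yields more candidate postures for that part, which corresponds to the increase in the number of candidate postures described above.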