Human posture estimation based on image data from captured video has been an active area of research in recent years. If human behavior can be determined from video through computer analysis, the behavior analysis performed in various fields can be carried out without relying on human effort. Examples of such behavior analysis include abnormal behavior detection on the streets, purchasing behavior analysis in stores, support for streamlining work in factories, and form coaching in sports.
In this respect, NPL 1, for example, discloses a technique for estimating the posture state of a person based on image data captured with a monocular camera. In the technique disclosed in NPL 1 (hereinafter referred to as “related art”), the silhouette (outline) of a person is detected from image data, and a shape context histogram, which is one type of shape feature, is extracted from the detected silhouette. In the related art, a classifier is then formed for each posture of an operation to be classified, using the variance-covariance matrix of the extracted histogram as input. With this configuration, the related art can estimate the posture state of the person regardless of the position and orientation of the person.
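For illustration only, the feature extraction described for the related art can be sketched in Python with NumPy. This is a minimal sketch under stated assumptions, not the implementation of NPL 1: the function names `shape_context_histograms` and `histogram_covariance`, the bin counts (5 radial by 12 angular), the radial range, and the toy rectangular contour are all hypothetical choices made for this example. It also assumes that silhouette contour points have already been detected, and that the variance-covariance matrix is taken across histogram bins with each contour point treated as one sample. The per-posture classifier is not shown.

```python
import numpy as np

def shape_context_histograms(points, n_r=5, n_theta=12):
    """Return one log-polar histogram (n_r * n_theta bins) per contour point.

    Hypothetical helper: bin counts and radial range are illustrative choices.
    """
    pts = np.asarray(points, dtype=float)
    n = len(pts)
    # diff[i, j] is the vector from point i to point j.
    diff = pts[None, :, :] - pts[:, None, :]
    dist = np.hypot(diff[..., 0], diff[..., 1])
    angle = np.arctan2(diff[..., 1], diff[..., 0]) % (2 * np.pi)

    off_diag = ~np.eye(n, dtype=bool)
    # Normalize by the mean pairwise distance so the bins do not depend on scale.
    dist_n = dist / dist[off_diag].mean()

    # Log-spaced radial bin edges (assumed range: 0.125 to 2.0 mean distances).
    r_edges = np.logspace(np.log10(0.125), np.log10(2.0), n_r + 1)
    r_bin = np.digitize(dist_n, r_edges) - 1
    t_bin = np.minimum((angle / (2 * np.pi) * n_theta).astype(int), n_theta - 1)

    # Neighbors outside the radial range, and the point itself, are not counted.
    valid = off_diag & (r_bin >= 0) & (r_bin < n_r)
    hists = np.zeros((n, n_r * n_theta))
    rows = np.nonzero(valid)[0]
    np.add.at(hists, (rows, (r_bin * n_theta + t_bin)[valid]), 1)
    return hists

def histogram_covariance(hists):
    """Variance-covariance matrix across bins, one contour point per sample."""
    return np.cov(hists, rowvar=False)

# Toy contour standing in for a detected silhouette: a 6 x 10 rectangle outline.
contour = (
    [(x, 0) for x in range(6)]
    + [(5, y) for y in range(1, 10)]
    + [(x, 9) for x in range(4, -1, -1)]
    + [(0, y) for y in range(8, 0, -1)]
)
hists = shape_context_histograms(contour)
feature = histogram_covariance(hists)
print(hists.shape, feature.shape)
```

Because each histogram is built from point-to-point offsets, translating the contour leaves the histograms unchanged, which is consistent with the position independence noted above. In this sketch the angular bins are measured against the image axes, so it does not by itself provide independence from orientation.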