Methods for recognizing human poses from an image can be classified, according to their technical principle, as model-based or learning-based. In the model-based method, a human model consisting of the various body parts is first established. Pose recognition is then a process of using the model to search a feature space for the best-matching pose. The search is typically cast as a nonlinear optimization problem (for example, refer to non-patent literature 1) or as a probability density estimation problem (for example, refer to non-patent literatures 2 and 3). Because the pose space has a very large number of dimensions, this method generally achieves good results only when combined with tracking. Accordingly, the recognition result depends to a great extent on how the model is initialized before tracking. This method generally also requires that the regions of the various body parts be known beforehand.

In the learning-based method, a three-dimensional human pose is inferred directly from image features. Widely used image features include body contour information (refer to non-patent literatures 4, 5 and 6). To obtain reliable contour information, the methods that have been adopted include motion analysis (refer to non-patent literature 4), background modeling (refer to non-patent literature 5), or a combination thereof (refer to non-patent literature 6). However, when the background is complex, it is difficult for these methods to separate the body contour reliably. Other features that have been used include trunk detection (refer to non-patent literature 7) and skin color information (refer to patent literature 1).
All the existing methods of human pose recognition treat the body parts as a whole and perform pose recognition directly from an image. However, in a specific application scene, it is difficult to achieve high recognition accuracy by performing pose recognition from the image alone, because of significant differences in clothing and body shape and the complexity of the application environment. Moreover, since depth information cannot be obtained accurately from a monocular two-dimensional image, the recognition accuracy is further reduced.

[Non-patent literature 1] J. M. Rehg and T. Kanade, “Model-based tracking of self-occluding articulated objects”, ICCV, pages 612-617, 1995
[Non-patent literature 2] H. Sidenbladh, M. J. Black, and D. J. Fleet, “Stochastic tracking of 3D human figures using 2D image motion”, ECCV (2), pages 702-718, 2000
[Non-patent literature 3] M. W. Lee and I. Cohen, “A model-based approach for estimating human 3D poses in static images”, IEEE TPAMI 28(6), pages 905-916
[Non-patent literature 4] A. Agarwal and B. Triggs, “3D human pose from silhouettes by relevance vector regression”, CVPR, vol. 2, pages 882-888, 2004
[Non-patent literature 5] R. Rosales and S. Sclaroff, “Learning body pose via specialized maps”, NIPS, 2002
[Non-patent literature 6] K. Grauman, G. Shakhnarovich, and T. Darrell, “Inferring 3D structure with a statistical image-based shape model”, ICCV, 2003
[Non-patent literature 7] X. Ren, A. C. Berg, and J. Malik, “Recovering Human Body Configurations using Pairwise Constraints Between Parts”, ICCV, 2005
[Patent literature 1] YANG MING-HSUAN (US); HUA GANG (US), “HUMAN POSE ESTIMATION WITH DATA DRIVEN BELIEF PROPAGATION”, Publication number: WO2006052853