The present disclosure relates to an image processing apparatus and an image processing method, and, in particular, to an image processing apparatus and an image processing method capable of fast and stable processing even when the poses of a plurality of people are estimated simultaneously.
There is a known pose estimation technique which performs pose estimation by fitting, through energy optimization, a human body model to a silhouette region of a moving subject portion, the silhouette region being extracted by a background subtraction algorithm or the like from an input image supplied from a camera or the like (see, for example, Jonathan Deutscher and Ian Reid, “Articulated Body Motion Capture by Stochastic Search,” International Journal of Computer Vision, 2005, Department of Engineering Science, University of Oxford, Oxford, OX1 3PJ, United Kingdom, received Aug. 19, 2003).
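By way of illustration only, the two ingredients named above (silhouette extraction by background subtraction, and an energy that measures how well a model fits the silhouette) can be sketched as follows. This is a minimal sketch under simplifying assumptions and is not the method of the cited reference: the function names `extract_silhouette` and `silhouette_energy`, the fixed threshold, and the pixel-mismatch energy are all hypothetical choices made for the example.

```python
from typing import List

Image = List[List[int]]   # grayscale image, row-major, values 0..255
Mask = List[List[bool]]   # True where a pixel belongs to the silhouette


def extract_silhouette(frame: Image, background: Image, threshold: int = 30) -> Mask:
    """Mark a pixel as foreground when it differs from the background
    image by more than `threshold` (assumed value, for illustration)."""
    return [
        [abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def silhouette_energy(model_mask: Mask, silhouette: Mask) -> int:
    """Count the pixels where the rendered model and the observed
    silhouette disagree; a lower value means a better fit."""
    return sum(
        m != s
        for model_row, sil_row in zip(model_mask, silhouette)
        for m, s in zip(model_row, sil_row)
    )
```

An optimizer would then search over pose parameters, render the body model to a mask for each candidate pose, and keep the pose for which such an energy is lowest.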
There is also a known technique which performs optimization of human body models while employing a visual hull technique, which involves projecting silhouettes of a plurality of moving subject portions into three-dimensional space to estimate three-dimensional shapes representing human body portions (see, for example, S. Corazza, L. Mündermann, A. M. Chaudhari, T. Demattio, C. Cobelli, T. P. Andriacchi, “A Markerless Motion Capture System to Study Musculoskeletal Biomechanics: Visual Hull and Simulated Annealing Approach,” Annals of Biomedical Engineering, vol. 34, no. 6, pp. 1019-1029, 2006).
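By way of illustration only, the visual hull idea can be sketched as voxel carving: a voxel is retained only if its projection falls inside every silhouette. This is a minimal sketch, not the method of the cited reference. It assumes three orthographic views aligned with the coordinate axes, whereas a practical system would use calibrated perspective cameras; the function name `visual_hull` and its parameters are hypothetical.

```python
from typing import List, Set, Tuple

Mask = List[List[bool]]          # True where a pixel belongs to the silhouette
Voxel = Tuple[int, int, int]     # (x, y, z) grid coordinates


def visual_hull(sil_xy: Mask, sil_zy: Mask, sil_xz: Mask, size: int) -> Set[Voxel]:
    """Carve a size x size x size voxel grid using three silhouettes.

    sil_xy[y][x] is the view along the z axis, sil_zy[y][z] is the view
    along the x axis, and sil_xz[z][x] is the view along the y axis
    (assumed orthographic projections, for illustration).
    """
    hull: Set[Voxel] = set()
    for x in range(size):
        for y in range(size):
            for z in range(size):
                # Keep the voxel only if every view sees it as foreground.
                if sil_xy[y][x] and sil_zy[y][z] and sil_xz[z][x]:
                    hull.add((x, y, z))
    return hull
```

The resulting voxel set is an outer bound on the true shape, to which a human body model can then be fitted.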