PTL 1 discloses a technique for detecting a human from image data and tracking body parts of the human over a plurality of successive frames, for example. According to the technique in PTL 1, features extracted from the image data are matched against features of human body parts learned in advance (for example, luminance edge information, color information, and texture information). Using the result of the matching, the technique according to PTL 1 determines a plurality of regions in each of which a body part is highly likely to be present (each hereafter referred to as a “body part frame”), and determines the relative positional relationship between the determined body part frames. Subsequently, the technique according to PTL 1 tracks each body part across the plurality of frames based on the relative positional relationship between the determined body part frames.
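
For illustration only, the tracking step described above can be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation disclosed in PTL 1: each body part frame is assumed to be an axis-aligned rectangle `(x, y, w, h)`, the feature matching is assumed to have already produced a short list of candidate body part frames for each body part in the next image frame, and the relative positional relationship is assumed to be the offset between rectangle centers. The function names `center`, `relative_offsets`, `layout_cost`, and `track` are hypothetical.

```python
from itertools import product


def center(box):
    """Return the center of a body part frame given as (x, y, w, h)."""
    x, y, w, h = box
    return (x + w / 2.0, y + h / 2.0)


def relative_offsets(parts):
    """Map each pair of part names to the offset between their centers.

    `parts` is a dict such as {"head": (x, y, w, h), "torso": (...)}.
    The center-offset representation is an assumption of this sketch.
    """
    names = sorted(parts)
    offsets = {}
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            ax, ay = center(parts[a])
            bx, by = center(parts[b])
            offsets[(a, b)] = (bx - ax, by - ay)
    return offsets


def layout_cost(prev_offsets, parts):
    """Sum of squared changes in pairwise offsets relative to the previous frame."""
    cur = relative_offsets(parts)
    return sum(
        (cur[k][0] - prev_offsets[k][0]) ** 2
        + (cur[k][1] - prev_offsets[k][1]) ** 2
        for k in prev_offsets
    )


def track(prev_parts, candidates):
    """Pick one candidate per body part for the next image frame.

    `candidates` maps each part name to a list of candidate body part
    frames. The combination whose pairwise offsets differ least from
    those in `prev_parts` is returned. The exhaustive search is only
    practical for a handful of parts and candidates.
    """
    names = sorted(prev_parts)
    prev_offsets = relative_offsets(prev_parts)
    best, best_cost = None, float("inf")
    for combo in product(*(candidates[n] for n in names)):
        parts = dict(zip(names, combo))
        cost = layout_cost(prev_offsets, parts)
        if cost < best_cost:
            best, best_cost = parts, cost
    return best
```

Calling `track` repeatedly, with the result for one image frame passed in as `prev_parts` for the next, follows the body parts through a sequence. Because the sketch scores only the relative layout, it would need an additional term (for example, the matching score of each candidate) to be usable in practice.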