There are a large number of systems as shown in FIG. 1, i.e., systems for identifying a speaker or the like based on the position and direction of the face of an individual person. This type of system finds the face of a user 64 based on color information or the like in images picked up by one or a few cameras 61, and either performs personal identification by face or measures the direction of the face, the direction of the sight line, or the like, using structures in the face such as the eyes, nose, and mouth.
Therefore, the persons within a recognition object region 63 are substantially limited to seated users and persons watching a particular space such as a display 62. This raises a problem in that this type of system can be used only in very narrow places. Also, regarding the identification of user behavior that is performed in conjunction with the personal identification, the objects to be identified are mainly the direction of the face, the direction of the sight line, and nodding actions.
Also, in conventional systems as shown in FIG. 2, i.e., systems for specifying the three-dimensional positions and movements of a person in an indoor space 5, the whole body of a user 72 is photographed by arranging cameras 71 (71-1, 71-2, 71-3, 71-n) so as to expand the object region. However, this type of system does not constitute a gesture-based interface apparatus associated with personal identification, and therefore it does not provide interaction specific to each user.
Furthermore, conventional recognition systems as shown in FIG. 3, i.e., systems for recognizing a hand sign of a user 84, are used as interface apparatuses by acquiring an image of the hand alone, using a display device 82 or the like, in a fixed environment in front of a camera 81, that is, in a recognizable region 83, and thereby recognizing the hand sign. This causes a problem in that this type of system is operable only in narrow, fixed environments. Therefore, this type of system does not provide a hand-sign-based interface for each individual person, that is, an interface having the functions of identifying an individual and acquiring a movement/behavior log of the individual in a wide space such as a room.