Face recognition techniques can be widely applied to man-machine interfaces, for example, to personal authentication systems that do not impose a burden on users, and to gender discrimination. Although recognition techniques based on side-view (profile) face images were considered initially, techniques based on frontal images are now the most common.
Research on face recognition is regarded as a benchmark for verifying theories of pattern recognition, and thus various face recognition methods have been developed. Most applications assumed in that development, such as authentication in security systems and searching for a person in a large-scale database, have required accuracy under static environments.
Recently, robot apparatuses for entertainment, with appearances similar to those of animals such as dogs, have become available. For example, on Nov. 25, 2000, Sony Corporation announced “SDR-3X”, which is a two-legged mobile humanoid robot. Legged mobile robots of this type are unstable, and their attitude and gait are difficult to control; however, they have the advantage of being able to go up and down stairs or ladders and to overcome obstacles, achieving flexible walking and running on both level and uneven ground. Furthermore, as improved intelligence allows robots to operate autonomously, it is no longer impossible for people and robots to live together in the same living space.
Intelligent robot apparatuses are capable of exhibiting animal-like behavior by autonomously operating eyes and legs in accordance with external information (e.g., information regarding circumstances), internal status (e.g., emotional status), etc.
The emergence of such robot apparatuses has raised demand for human interface techniques that allow a response within a predetermined time under a dynamically changing operating environment, one of which is face discrimination by a robot apparatus. For example, by using face discrimination, a robot apparatus can discriminate a user (owner, friend, or legitimate user) from among many people, and a higher level of entertainment is achieved, for example, by changing its reactions according to the individual user.
Face recognition techniques in robot apparatuses, as opposed to applications such as authentication in security systems and searching for a person in a large-scale database, require a response within a predetermined time under a dynamically changing operating environment, even at the cost of somewhat lower accuracy.
An application of face discrimination in such a robot apparatus needs to solve the following problems in addition to the problem of discriminating a person from a given scene.
(1) Since the robot apparatus itself moves, change in and diversity of environment must be accepted.
(2) Since the relative positions of a person and the robot apparatus change, the person must be kept in vision during interactions.
(3) An image that is useful for discrimination of a person must be selected from a large number of scenes, and a comprehensive judgment must be made.
(4) A response must take place within a predetermined time.
The mainstream face recognition methods have been methods based on neural networks and methods in which principal component analysis (PCA) is applied to a vector space composed of the luminance values of a face image (the eigenspace method). These conventional methods, however, have had the following shortcomings.
First, face recognition based on a neural network does not allow incremental learning.
As for the methods based on eigenspace, although orthogonal piecewise-linear spaces are assumed (i.e., averaging two face images is assumed to yield a human face), linearity does not actually hold in many cases, so that precise positioning, called morphing or alignment, is required. Some methods attempt to alleviate this effect by normalizing position, rotation, and size in pre-processing; however, this processing does not necessarily work well. This has been a factor that degrades recognition performance. Furthermore, since the dimensionality of the face space is significantly reduced to allow separation in that space, high-order features might be lost.
Furthermore, each of the recognition methods described above is susceptible to the effects of changes in lighting conditions, changes in camera parameters, noise, position, and rotation, so that pre-processing such as noise filtering and morphing is required. Also, questions remain regarding their ability to generalize.
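For illustration only, the eigenspace method mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any particular conventional system: the "faces" are synthetic 64-element luminance vectors, the helper names (`fit_eigenspace`, `project`, `reconstruct`, `nearest_identity`) are hypothetical, and nearest-neighbor matching in the reduced space is assumed as the classification rule. The nonzero reconstruction error at the end illustrates how reducing the dimensionality of the face space can discard information.

```python
import numpy as np


def fit_eigenspace(faces, num_components):
    """Fit a PCA eigenspace to rows of luminance values (one face per row)."""
    mean_face = faces.mean(axis=0)
    centered = faces - mean_face
    # The right singular vectors of the centered data are the principal axes.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean_face, vt[:num_components]


def project(face, mean_face, components):
    """Map a luminance vector to its coefficients in the eigenspace."""
    return components @ (face - mean_face)


def reconstruct(coeffs, mean_face, components):
    """Map eigenspace coefficients back to an approximate luminance vector."""
    return mean_face + components.T @ coeffs


def nearest_identity(face, gallery_coeffs, labels, mean_face, components):
    """Return the label of the gallery face closest to `face` in the eigenspace."""
    coeffs = project(face, mean_face, components)
    distances = np.linalg.norm(gallery_coeffs - coeffs, axis=1)
    return labels[int(np.argmin(distances))]


# Synthetic stand-in data (an assumption): three "persons", each a random
# 8x8 luminance template, with four noisy samples per person.
rng = np.random.default_rng(0)
templates = rng.random((3, 64))
labels = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
faces = np.stack(
    [templates[k] + 0.05 * rng.standard_normal(64) for k in labels]
)

# Reduce the 64-dimensional luminance space to 2 dimensions.
mean_face, components = fit_eigenspace(faces, num_components=2)
gallery_coeffs = (faces - mean_face) @ components.T

# Classify a new noisy sample of person 1.
probe = templates[1] + 0.05 * rng.standard_normal(64)
predicted = nearest_identity(probe, gallery_coeffs, labels, mean_face, components)

# Information discarded by the dimensionality reduction.
approx = reconstruct(project(probe, mean_face, components), mean_face, components)
reconstruction_error = float(np.linalg.norm(probe - approx))
```

Because the synthetic samples here are perfectly aligned, the sketch does not exhibit the sensitivity to position, rotation, and size described above; with real images, those factors would have to be normalized before projection.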