1. Field of the Invention
This invention relates to an object detector for detecting an object of detection such as a face or some other object, an object detecting method and a robot equipped with such an object detector. More particularly, the present invention relates to an object detector and an object detecting method that can reduce the risk of erroneously detecting something other than the object of detection, and also to a robot that is equipped with such an object detector.
This application claims priority of Japanese Patent Application No. 2003-307924, filed on Aug. 29, 2003, the entirety of which is incorporated by reference herein.
2. Related Background Art
A “robot” is a machine that is electrically or magnetically driven to perform movements that resemble human (living body) actions. In Japan, robots became popular in the late 1960s, but many of them were industrial robots, including manipulators and transfer robots, designed to realize automated and unattended production lines in factories.
Recently, efforts have been made to develop utility robots that support people as partners, behaving like humans in various scenes of daily life in living environments. Unlike an industrial robot, a utility robot has the ability to learn how to adapt itself to people with different personalities and to the various circumstances of daily life in living environments. For example, “pet type” robots that resemble four-footed animals such as dogs and cats in terms of bodily mechanisms and movements, and “human type” or “human-shaped” robots that are modeled on humans or other two-footed animals in terms of bodily mechanisms and movements, are already at the stage of practical use.
Compared with industrial robots, utility robots can act to entertain people, so they are sometimes also referred to as entertainment robots. Additionally, such robots are provided with various external sensors, including CCD (charge coupled device) cameras and microphones, so that they can recognize external circumstances on the basis of the outputs of the external sensors and act autonomously in response to external information and internal conditions.
It may be most desirable for an entertainment robot to detect the face of the partner who is talking to it, or of a person who moves into its field of view, and to talk back or respond by action while looking at that person's face, because such behavior is very natural and hence very entertaining.
Up until now, various proposals have been made to detect a human face in a moving and hence complex image using only gradation patterns based on video signals, without relying on colors or motions.
The proposed face detection methods include a method of generating a discriminator by causing the robot to learn facial patterns in advance by means of pattern recognition techniques such as eigenfaces, neural networks and support vector machines (SVM).
Japanese Patent Application Laid-Open Publication No. 2002-42116
However, with the method of generating a discriminator, the pattern recognition relies on an ability developed by learning a vast volume of face image data. While this makes the recognition highly robust against changes in the environment and in posture and facial expression, the volume of computation required for the pattern recognition is correspondingly enormous, so that the computation takes a long time.
In practice, in a process of detecting a face image of a person in a picked up image (to be referred to as a face detection task hereinafter), each candidate face image is cut out from the picked up image and then discriminated. For this purpose, the picked up image is scanned at various scales. Therefore, it is highly important to minimize the quantity of computation required for discriminating a single pattern.
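The scanning at various scales described above can be illustrated with a minimal sketch. It assumes one common approach, in which the picked up image is repeatedly shrunk while the cut-out window stays at a fixed size. The function name `pyramid_sizes`, the 20-pixel window and the scale step of 1.25 are hypothetical choices made for illustration and are not taken from the cited methods.

```python
def pyramid_sizes(width, height, window=20, scale_step=1.25):
    """Return the (width, height) of each shrunken copy of the image.

    Assumption: scanning "at various scales" is done by shrinking the
    image by a constant factor (scale_step) and stopping as soon as the
    fixed-size window no longer fits inside the shrunken image.
    """
    sizes = []
    factor = 1.0
    while True:
        w = int(width / factor)
        h = int(height / factor)
        if w < window or h < window:
            break
        sizes.append((w, h))
        factor *= scale_step
    return sizes


if __name__ == "__main__":
    # Hypothetical 320x240 input image.
    for w, h in pyramid_sizes(320, 240):
        print(w, h)
```

Every level returned by such a function has to be scanned in full, which is why the cost of discriminating a single cut-out pattern dominates the total cost.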
For example, in the case of a face detection task using pattern recognition by a support vector machine, it is necessary to compute the inner products of hundreds of support vectors (each of 400 dimensions) with the 400-dimensional vector acquired from an image of about 400 (=20×20) pixels cut out from the picked up image. If this operation is conducted over the entire screen having dimensions (W, H), the computation of the inner products has to be repeated (W−20+1)×(H−20+1) times. The overall operation thus amounts to an enormous volume.
When a robot utilizes a face detection task, it is very difficult to feed back the acquired data to make the robot behave on a real time basis unless a face image is detected sufficiently quickly. The internal CPU of the robot is constantly loaded with a large number of tasks other than the face detection task. In other words, the CPU cannot dedicate its entire computational ability to the face detection task because it also has to take care of these other tasks.
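The size of this computation can be made concrete with a short sketch. The window count follows the expression (W−20+1)×(H−20+1) given above. The 320×240 screen size and the figure of 300 support vectors are assumptions chosen only for illustration (the text says only "hundreds"), and the function names are hypothetical.

```python
def window_count(width, height, window=20):
    """Number of window positions when the window slides one pixel at a time."""
    if width < window or height < window:
        return 0
    return (width - window + 1) * (height - window + 1)


def svm_multiply_adds(width, height, num_support_vectors, window=20):
    """Multiply-add operations for the inner products at a single scale.

    Each window is a vector of window*window dimensions, and one inner
    product with one support vector costs that many multiply-adds.
    """
    dims = window * window
    return window_count(width, height, window) * num_support_vectors * dims


if __name__ == "__main__":
    W, H = 320, 240   # assumed screen size, not from the source
    n_sv = 300        # assumed number of support vectors ("hundreds")
    print(window_count(W, H))             # 66521
    print(svm_multiply_adds(W, H, n_sv))  # 7982520000
```

Under these assumptions a single scale already requires roughly eight billion multiply-adds per frame, before any scanning at other scales is taken into account.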
An image recognizer for recognizing an object, which may be something other than a face, from the input image is normally designed to accept a candidate as the right object when the candidate satisfies certain conditions in terms of resemblance to, and characteristics of, the right object. However, when rigorous conditions are set for the detection of the right object, the robot can miss the right object although it is there. When, on the other hand, loose conditions are set, the robot can detect a number of wrong objects. In other words, missing the right object and detection errors are in a trade-off relationship, and optimum conditions have to be set by taking into consideration the performance of the recognizer, that of the camera that acquires input images, the volume of computation necessary for recognizing an object, and so on.
It is not desirable to increase the volume of computation that the robot has to perform, because the resources of the robot are limited. On the other hand, detection errors will occur frequently if the image recognizer is required not to miss the right object.
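The trade-off between missing the right object and detecting wrong objects can be illustrated with a minimal sketch in which the "conditions" are reduced to a single score threshold. This reduction is an assumption made for illustration; the scores and labels below are made-up example values, not measured results, and `count_errors` is a hypothetical helper.

```python
def count_errors(scores, is_target, threshold):
    """Return (misses, false_detections) for a given acceptance threshold.

    A candidate is accepted when its score is at or above the threshold.
    A miss is a right object that is rejected; a false detection is a
    wrong object that is accepted.
    """
    misses = sum(1 for s, t in zip(scores, is_target) if t and s < threshold)
    false_detections = sum(
        1 for s, t in zip(scores, is_target) if not t and s >= threshold
    )
    return misses, false_detections


if __name__ == "__main__":
    # Made-up recognizer scores and ground truth, for illustration only.
    scores = [0.95, 0.80, 0.60, 0.55, 0.30, 0.10]
    is_target = [True, True, False, True, False, False]

    print(count_errors(scores, is_target, 0.9))  # rigorous condition: (2, 0)
    print(count_errors(scores, is_target, 0.5))  # loose condition: (0, 1)
```

With the rigorous threshold two right objects are missed and nothing wrong is accepted, while with the loose threshold no right object is missed but one wrong object is accepted.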