In recent years, active vision and active audition have attracted attention for active electronic machines such as humanoid and animal-type robots. A sense provided by a sensory device mounted on a robot for vision or audition is said to be active (active sensory perception) when the part of the robot carrying the sensory device, such as its head, is varied in position or orientation under the control of a drive means in the robot, so that the sensory device follows the movement or instantaneous position of the target or object to be sensed or perceived.
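The following-the-target behavior described above can be illustrated with a minimal sketch. It is only an illustration under stated assumptions, not a method taken from this document: the head is reduced to a single horizontal angle, the target direction is assumed to be already known at each step, and the function names (`wrap_angle`, `step_head`, `track`) and the `gain` and `max_step` parameters are hypothetical.

```python
import math


def wrap_angle(a):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi


def step_head(head_angle, target_angle, gain=0.5, max_step=0.2):
    """Turn the head one control step towards the target direction.

    Assumption: a simple proportional rule with a rate limit stands in
    for the drive means; gain and max_step are illustrative values.
    """
    # Signed shortest angular distance from the head to the target.
    error = wrap_angle(target_angle - head_angle)
    # Proportional command, clipped to the largest allowed step.
    step = max(-max_step, min(max_step, gain * error))
    return wrap_angle(head_angle + step)


def track(head_angle, target_angles, **kwargs):
    """Follow a sequence of target directions and return the head angles."""
    history = []
    for target in target_angles:
        head_angle = step_head(head_angle, target, **kwargs)
        history.append(head_angle)
    return history


if __name__ == "__main__":
    # A stationary target at 1 rad: the head angle converges towards it.
    for angle in track(0.0, [1.0] * 8):
        print(round(angle, 3))
```

In this sketch the sensory device is considered to follow the target once the angular error stays near zero; a real robot would obtain the target direction from its camera or microphones rather than from a given list.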
As for active vision, various studies have been undertaken using an arrangement in which at least a camera, as the sensory device, keeps its optical axis directed towards an object by being controlled in position by the drive means, while automatically focusing and zooming in and out on the object to take a picture of it.
As for active audition or hearing, at least a microphone, as the sensory device, may likewise be kept facing a target or object by being controlled in position by the drive means, so as to collect sound from the object. Such active audition may refer to visual information to determine the direction in which the sound source lies, as disclosed by the present applicant in Japanese patent application No. 2000-22677 entitled “Robot Auditory System”.
Active vision and audition are closely related to the motor control module that changes the direction of the robot (in a horizontal plane). For active vision and audition to work with respect to a specific object, the robot must be directed towards that object, i.e., attention control must be performed.
Combining vision and audition with a motor control module in turn requires data to be processed in real time for visual and auditory tracking. In conventional robot development, however, while a real-time processing system has been developed for a single sound source, no attempt has been made to develop an active auditory system that processes data in real time to identify each individual person in a situation where, e.g., several people are talking to each other.
For a robot to identify each individual speaker precisely as a specific object on the basis of its environmental conditions, visual and auditory data must be integrated. No active auditory system has been developed that processes such data in real time to identify each individual person in a situation where, e.g., several people are talking to each other.
For vision and audition to be united with control of a motor control module, not only must data be processed in real time for visual and auditory tracking, but it is also extremely useful to process data on the internal state in real time and to visualize that state during the visual and auditory tracking process. In conventional robot development, however, while such a real-time processing system has been developed for a single sound source, no attempt has been made to develop an active auditory system that processes such data in real time to identify each individual person in a situation where, e.g., several people are talking to each other, nor has any attempt been made to perform such visualization in real time.
Also, while attention control for a drive motor in a motor module has so far been undertaken using either what is called visual servo or what is called auditory servo, no system has been proposed in which a robot is accurately controlled using visuoauditory servo, i.e., using both its vision and audition concurrently.