The present invention relates to a computer system for recognizing human gestures, more specifically to a gesture recognition system that is adapted for incorporation into a bipedal robot.
U.S. Pat. No. 5,432,417, entitled “Locomotion Control System for Legged Mobile Robot” and assigned to the assignee of the present invention, discloses a bipedal walking robot. A computer provided on the back of the robot controls the movement of the legs, thighs, and trunk of the robot such that it follows a target ZMP (Zero Moment Point), the point at which the horizontal moment generated by the ground reaction force is zero. It is desired that the robot understand gestures of a human being so that a person can give instructions to the robot by gesture. More generally, it is desired that human gestures be recognized by a computer system as an input to the computer system without significantly increasing the workload of the computer system.
Japanese laid open patent application (Kokai) No. 10-31561 (application No. 8-184951) discloses a human interface system wherein hand gesture or body action is recognized and used as an input to a computer. Images of a hand and a body are captured with an image sensor which can be a CCD or an artificial retina chip. In a specific embodiment, edges of an input image are produced with the use of a random access scanner in combination with a pixel core circuit so as to recognize movement of a hand or a body.
U.S. Pat. No. 6,072,494 describes a gesture recognition system. A human gesture is examined one image frame at a time. Positional data is derived and compared to data representing gestures already known to the system. A frame of the input image containing the human being is obtained after a background image model has been created.
U.S. Pat. No. 5,594,810 describes a computer system for recognizing a gesture. A stroke is input on a screen by a user and is smoothed by reducing the number of points that define the stroke. The normalized stroke is matched to one or more gesture prototypes using a correlation score that is calculated for each prototype.
Technical Paper of the Institute of Electronics, Information and Communication Engineers (IEICE), No. PRU95-21 (May 1995) by S. Araki et al., entitled “Splitting Active Contour Models Based on Crossing Detection and Its Applications”, discusses active contour models (SNAKES). The method splits a contour model into plural contours by detecting self-crossing of the contour model. An initial single contour, for which an image frame can be selected, is iteratively split into multiple contours at the crossing parts, thus extracting plural subjects from the initial single contour. A contour of moving subjects can be produced utilizing the optical flow scheme, which itself is well known in the art. For example, it is discussed in Horn, B. K. P. and Schunck, B., “Determining optical flow”, Artificial Intelligence, Vol. 17, pp. 185-203, 1981.
Japanese laid open patent application (Kokai) No. 2000-113164 (application No. 10-278346), assigned to the assignee of the present invention, discloses a scheme of recognizing a moving subject in a car by viewing, with a CCD camera, an area of a seat where a person may be seated. Using a Sobel filter, an edge picture of objects in an image frame is produced. The edge picture includes edges of an upper portion of the seated person, a part of the seat that is not covered by the person, and a background view. By taking the difference of two edge pictures produced from two consecutive image frames, a contour or edge of a moving subject, that is, a human being, is extracted, because edges of static objects disappear in the difference of the two edge pictures. The scheme is used to identify the position of the head of the person seated in the seat.
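The edge-difference idea described above can be illustrated with a minimal sketch. It is not the implementation of the cited application: the function names `sobel_edges` and `edge_difference`, the threshold value, the list-of-lists grayscale image format, and the use of an exclusive-or as the "difference" of two binary edge pictures are all assumptions made for illustration.

```python
def sobel_edges(img, threshold=100):
    """Return a binary edge picture (rows of 0/1) for a grayscale image.

    `img` is a list of rows of pixel intensities. The threshold is an
    assumed value; border pixels are left as 0 for simplicity.
    """
    h, w = len(img), len(img[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal and vertical Sobel responses (3x3 kernels).
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            # |gx| + |gy| is a common cheap approximation of gradient magnitude.
            if abs(gx) + abs(gy) >= threshold:
                edges[y][x] = 1
    return edges


def edge_difference(edges_a, edges_b):
    """Difference of two binary edge pictures (assumed here to be XOR).

    Edges present at the same place in both pictures (static objects)
    cancel; edges that moved between the frames remain.
    """
    return [[a ^ b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(edges_a, edges_b)]
```

Applied to two consecutive frames, `edge_difference(sobel_edges(frame1), sobel_edges(frame2))` keeps only the edges that changed position, which is why the seat and background drop out while the moving person remains.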
The gesture recognition system of the above-identified Kokai No. 10-31561 includes a voice input device comprising a microphone, whereby a voice input is analyzed and recognized. The results of hand gesture and body action recognition and of voice recognition are combined to control apparatus such as a personal computer, home electric appliances (a television, an air conditioner, and an audio system), a game machine, and a care machine.
In cases where a computer system executes a number of different jobs, care must be taken that the CPU of the computer system does not become overloaded. An on-board computer system for controlling a robot, for example, is busy controlling the posture and movement of the robot, which includes collecting various data from many parts of the robot and computing the appropriate force to be applied to the various actuators located at a number of joint portions. There is thus a need for a computer system that activates the gesture recognition function only when it is needed.
The present invention provides a system for recognizing gestures made by a moving subject. In accordance with one aspect of the invention, the system comprises a sound detector for detecting sound, one or more image sensors for capturing an image of the moving subject, a human recognizer for recognizing a human being from the image captured by said one or more image sensors, and a gesture recognizer, activated when human voice is identified by said sound detector, for recognizing a gesture of the human being.
In a preferred embodiment, the system includes a hand recognizer for recognizing a hand of the human being. The gesture recognizer recognizes a gesture of the human being based on movement of the hand identified by the hand recognizer. The system may further include a voice recognizer that recognizes human voice and determines words from human voice input to the sound detector. The gesture recognizer is activated when the voice recognizer recognizes one of a plurality of predetermined keywords such as “hello!”, “bye”, and “move”.
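The keyword-triggered activation described above can be sketched as a simple gate placed in front of the gesture recognizer. This is only an illustration under stated assumptions: the class name `GestureGate`, its methods, the punctuation handling in `normalize`, and the idea of passing the recognizer in as a callable are hypothetical and are not taken from the specification. Only the example keywords come from the text.

```python
# Example keywords named in the text; a real system would hold its own list.
ACTIVATION_KEYWORDS = {"hello", "bye", "move"}


def normalize(word):
    """Lower-case a recognized word and drop surrounding punctuation."""
    return word.strip().lower().strip("!?.,")


class GestureGate:
    """Runs the (costly) gesture recognizer only after a keyword is heard."""

    def __init__(self, keywords=ACTIVATION_KEYWORDS):
        self.keywords = set(keywords)
        self.active = False

    def on_recognized_words(self, words):
        """Feed words from the voice recognizer; activate on a keyword."""
        if any(normalize(w) in self.keywords for w in words):
            self.active = True
        return self.active

    def process_frame(self, frame, recognizer):
        """Call `recognizer(frame)` only while the gate is active."""
        if not self.active:
            return None  # no gesture processing, so no extra CPU load
        return recognizer(frame)
```

Until a keyword arrives, `process_frame` returns immediately, which reflects the stated goal of not adding to the workload of a computer that is already busy with posture and movement control.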
The system may further include a head recognizer that recognizes the position of the head of the human being. The hand recognizer determines the position of the hand relative to the position of the head determined by the head recognizer. The system may include a storage for storing statistical features of one or more gestures that relate to positions of the hand relative to the position of the head, an extractor for extracting features of the movement of the hand as recognized by said hand recognizer, and a comparator for comparing the extracted features with the stored features to determine a matching gesture. The statistical features are preferably stored in the form of a normal distribution, a specific type of probability distribution.
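As a rough sketch of how stored normal-distribution features might be compared with extracted features, the example below scores each stored gesture by a log-likelihood and picks the best one. The choice of features (mean horizontal offset, mean vertical offset, and horizontal spread of the hand relative to the head), the independence of the features, the image coordinate convention (y grows downward), and all function names are assumptions for illustration; the specification does not fix them here.

```python
import math


def log_normal_pdf(x, mean, var):
    """Log density of a one-dimensional normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def extract_features(hand_positions, head_position):
    """Hypothetical extractor: hand track relative to the head position.

    Returns (mean x offset, mean y offset, horizontal spread) in pixels.
    """
    hx, hy = head_position
    rel = [(x - hx, y - hy) for x, y in hand_positions]
    n = len(rel)
    mean_x = sum(p[0] for p in rel) / n
    mean_y = sum(p[1] for p in rel) / n
    spread_x = max(p[0] for p in rel) - min(p[0] for p in rel)
    return (mean_x, mean_y, spread_x)


def match_gesture(features, models):
    """Hypothetical comparator: return (best gesture name, its score).

    `models` maps a gesture name to one (mean, variance) pair per feature.
    Features are treated as independent, so log-likelihoods are summed.
    """
    best_name, best_score = None, float("-inf")
    for name, params in models.items():
        score = sum(log_normal_pdf(f, m, v)
                    for f, (m, v) in zip(features, params))
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score
```

The stored `(mean, variance)` pairs would in practice be estimated from recorded examples of each gesture; any numbers used to exercise this sketch are made up.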
In a preferred embodiment, the hand recognizer recognizes a hand by determining the portion that shows a large difference in position across a series of images captured by the image sensors.
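One simple way to approximate "the portion that shows a large difference" is to accumulate frame-to-frame intensity changes per image block and take the block with the largest total. This is a stand-in technique chosen for illustration, not the method of the specification: the block grid, the function name `most_moving_block`, and the use of raw pixel differences instead of tracked positions are all assumptions.

```python
def most_moving_block(frames, block=4):
    """Return ((block_row, block_col), accumulated_motion) of the busiest block.

    `frames` is a sequence of equally sized grayscale images (lists of rows).
    Motion is the sum of absolute pixel differences between consecutive
    frames, accumulated over the whole series for each block.
    """
    h, w = len(frames[0]), len(frames[0][0])
    rows, cols = h // block, w // block
    motion = [[0] * cols for _ in range(rows)]
    for prev, cur in zip(frames, frames[1:]):
        for y in range(rows * block):
            for x in range(cols * block):
                motion[y // block][x // block] += abs(cur[y][x] - prev[y][x])
    best = max(((r, c) for r in range(rows) for c in range(cols)),
               key=lambda rc: motion[rc[0]][rc[1]])
    return best, motion[best[0]][best[1]]
```

In a fuller system the search would presumably be restricted to the region of the recognized human being, so that a waving hand rather than some unrelated moving object is selected.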
In another embodiment, the sound detector includes at least two microphones placed a predetermined distance apart for determining the direction of the human voice. The human recognizer identifies as a human being a moving subject located in the detected direction of the human voice.
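The text does not state how the direction is computed from the two microphones. A common approach, shown here purely as an assumed illustration, is to estimate the arrival-time difference by cross-correlation and convert it to an angle with the far-field relation sin(theta) = c * delay / d. The function names, the speed-of-sound constant, and the sign convention are all choices made for this sketch.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 degrees C (assumed)


def estimate_delay_samples(left, right, max_lag):
    """Lag in samples that maximizes the cross-correlation.

    A positive result means the right channel lags the left channel,
    i.e. the sound reached the left microphone first.
    """
    n = len(left)
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def direction_of_arrival(left, right, sample_rate, mic_distance):
    """Angle in degrees from straight ahead; positive is toward the left mic.

    Assumes a distant source (plane wave) and microphones `mic_distance`
    metres apart.
    """
    # Largest physically possible delay between the two microphones.
    max_lag = int(math.ceil(mic_distance / SPEED_OF_SOUND * sample_rate))
    lag = estimate_delay_samples(left, right, max_lag)
    delay = lag / sample_rate
    # Clamp to the valid asin domain to absorb rounding of the lag.
    s = max(-1.0, min(1.0, SPEED_OF_SOUND * delay / mic_distance))
    return math.degrees(math.asin(s))
```

The resulting angle could then be used to select, among the moving subjects in the image, the one lying in the direction of the voice.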
In accordance with another aspect of the invention, a robot is provided that incorporates the system discussed above. The robot is preferably a bipedal walking robot such as discussed in the above-mentioned U.S. Pat. No. 5,432,417, which is incorporated herein by reference.