1. Field of the Invention
The present invention generally relates to image processing and to machine vision-based human-machine interaction, and more particularly relates to an open-or-closed palm gesture recognition method and an open-or-closed palm gesture recognition device, as well as a human-machine interaction method and a human-machine interaction apparatus.
2. Description of the Related Art
A hand gesture is a natural and intuitive means of communication. It may be used to interact with an electronic apparatus without the assistance of any additional apparatus. Hand gesture recognition techniques based on computer vision have been widely utilized in human-machine interaction. Such a technique accepts a visual image as input and outputs the type of hand gesture or hand action. In this way, an apparatus controlled by a computer may interpret a hand gesture or hand action as an instruction (command) so as to achieve a human-machine interaction operation such as a turn-on/turn-off operation, a click operation, a touch operation, or a switch operation.
In patent reference No. 1 (U.S. Pat. No. 7,821,541B2), a method of recognizing two gestures of a hand is disclosed. The two hand gestures are a closed fist and an open palm. The recognition described in the reference is carried out with respect to the fingers of a static (still) open palm and a static closed fist. In this method, only a single static feature is utilized. For example, an “open” state is determined on the basis of whether or not there are three continuous extended fingers approaching another finger (a fourth finger). However, in an actual system, it is not easy to obtain a clear outline image of a hand (in general, the outline is influenced by distance, the accuracy of the apparatus, the lighting condition, etc.). As a result, this method is not robust. In addition, this method carries out the recognition only on the basis of a single image (frame).
In a non-patent reference (Zhou Ren, “Robust Hand Gesture Recognition Based on Finger-Earth Mover's Distance with a Commodity Depth Camera”, Proceedings of the 19th ACM International Conference on Multi-Media, MM'11, ACM, New York, N.Y., USA, 2011, pp. 1093-1096), a time-series curve is adopted for expressing the shape information of a hand. This time-series curve includes distances between points on the outline of the hand and the center of the hand. In addition, in this paper, a so-called “finger-earth mover's distance operator” is defined for calculating the similarity of two hand shapes. However, this method carries out the recognition by employing a template matching technique.
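The distance-based time-series curve described above can be illustrated with a minimal sketch. It is not the cited paper's exact formulation: the function name distance_curve, the use of the centroid of the outline points as the hand center, and the normalization by the largest distance are all assumptions made for illustration only.

```python
import math


def distance_curve(contour, center=None):
    """Return normalized distances from each outline point to the hand center.

    contour: list of (x, y) points ordered along the hand outline.
    center:  optional (x, y) hand center; hypothetical default is the centroid.
    """
    if not contour:
        raise ValueError("contour must contain at least one point")
    if center is None:
        # Assumption: approximate the hand center by the centroid of the outline.
        cx = sum(x for x, _ in contour) / len(contour)
        cy = sum(y for _, y in contour) / len(contour)
    else:
        cx, cy = center
    distances = [math.hypot(x - cx, y - cy) for x, y in contour]
    peak = max(distances)
    # Assumption: normalize by the largest distance so that curves from hands
    # of different sizes are comparable; skip this for a degenerate outline.
    return [d / peak for d in distances] if peak > 0 else distances


# Toy outline (a square); a real outline would come from hand segmentation.
outline = [(0, 0), (4, 0), (4, 4), (0, 4)]
curve = distance_curve(outline)
anchored = distance_curve(outline, center=(0, 0))
```

In such a curve, extended fingers would be expected to appear as peaks, which is why two hand shapes can be compared by comparing their curves.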
In the recognition of a palm gesture, aside from template matching and intuitive rule-based approaches, another widely used approach is a classifier technique based on machine learning. Such a classifier technique may provide a robust recognition result because it comprehensively considers the influence of various features on the classification. As a result, the classifier technique has been widely utilized in the recognition of both static gestures and dynamic gestures.
In patent reference No. 2 (Chinese Patent Application No. 201200147172), a hand gesture recognition method based on classifiers is disclosed. This method adopts a so-called “depth difference distribution operator” to extract a CDDD feature from a few adjacent images, for describing the depth difference distributions before and after a hand action. The CDDD feature is a multi-dimensional feature vector whose dimension depends on the number of images (frames) adopted in a hand gesture recognition unit. For example, if three images are used as one hand gesture recognition unit, then the dimension of the feature vector is 128, and if four images are used as one hand gesture recognition unit, then the dimension of the feature vector is 192. However, with this technique, on the one hand, it is necessary to apply a large number of samples to machine learning so as to obtain the multi-dimensional feature vector; on the other hand, if the number of frames changes, then in general the hand gesture recognition cannot be carried out well. In addition, this method can use only depth images; as a result, its applicability to conventional color images is limited.
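The dependence of the feature dimension on the frame count can be made concrete with a small sketch. The two figures quoted above (128 for three images, 192 for four) are consistent with 64 components per pair of adjacent images; that per-pair value is an inference from those two figures only, not something stated in the reference, and the function name cddd_dimension is hypothetical.

```python
def cddd_dimension(num_frames, per_pair=64):
    """Feature dimension for a recognition unit of num_frames images.

    Assumption: each pair of adjacent images contributes per_pair components.
    The default of 64 is inferred from the two quoted examples only.
    """
    if num_frames < 2:
        raise ValueError("at least two images are needed to form a difference")
    return (num_frames - 1) * per_pair


dim_three = cddd_dimension(3)
dim_four = cddd_dimension(4)
# A classifier trained on vectors of one length cannot take vectors of another.
compatible = dim_three == dim_four
```

Under this assumption, a classifier trained on three-image units expects a different input length from one trained on four-image units, which illustrates why a change in the number of frames generally calls for retraining.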