Electronic systems exist for using gestures, such as those created by the movement of a hand, as input. For example, there are handwriting recognition devices that interpret a user's gesture made through a stylus or pen as input. There are also systems that outfit users with wiring or other implements in order to track the user's hand or body movements.
There have been attempts to implement gesture recognition systems and techniques using optical sensors. For example, U.S. Pat. No. 6,252,598 describes the use of video images to identify hand gestures. A plurality of regions in the frame are defined and screened to locate an image of a hand in one of the regions. A hand image is processed to locate extreme curvature values, such as peaks and valleys, corresponding to predetermined hand positions and gestures. The number of peaks and valleys is then used to identify and correlate a predetermined hand gesture to the hand image for effectuating a particular computer operation or function. In order to find the curvature values on the hand, the boundaries of the hand must be reliably obtained. This can be problematic because the edges of an intensity image are closely related to the lighting and background properties of the scene. Furthermore, because the method relies on image intensity, the usability of the system depends on the lighting of the scene.
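The peak-and-valley counting idea can be illustrated with a minimal sketch. It is not the method of the cited patent: the signed turning angle at each contour vertex is used here as a simple stand-in for curvature, and the function names, the 120-degree threshold, and the seven-point "two-finger" contour are all hypothetical. The sketch assumes the hand boundary is already available as a counter-clockwise list of points, which is exactly the step that intensity images make unreliable.

```python
import math

def turning_angles(contour):
    """Signed turning angle (degrees) at each vertex of a closed contour.

    Assumes the contour is ordered counter-clockwise, so a positive angle
    is a convex turn and a negative angle is a concave turn.
    """
    n = len(contour)
    angles = []
    for i in range(n):
        px, py = contour[i - 1]          # previous vertex (wraps around)
        cx, cy = contour[i]              # current vertex
        nx, ny = contour[(i + 1) % n]    # next vertex (wraps around)
        in_x, in_y = cx - px, cy - py    # incoming edge direction
        out_x, out_y = nx - cx, ny - cy  # outgoing edge direction
        cross = in_x * out_y - in_y * out_x
        dot = in_x * out_x + in_y * out_y
        angles.append(math.degrees(math.atan2(cross, dot)))
    return angles

def count_peaks_and_valleys(contour, min_turn_deg=120.0):
    """Count sharp convex turns (peaks) and sharp concave turns (valleys)."""
    angles = turning_angles(contour)
    peaks = sum(1 for a in angles if a >= min_turn_deg)
    valleys = sum(1 for a in angles if a <= -min_turn_deg)
    return peaks, valleys

# Hypothetical contour: a palm with two raised fingers, listed counter-clockwise.
hand_contour = [(0, 0), (4, 0), (4, 2), (3, 6), (2, 3), (1, 6), (0, 2)]
peaks, valleys = count_peaks_and_valleys(hand_contour)
```

For this contour the two fingertips register as peaks and the gap between them as a valley, while the 90-degree palm corners fall below the threshold. If edge detection shifts or drops even a few boundary points, the counts change, which is why such an approach depends on reliably obtained boundaries.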
U.S. Pat. Nos. 6,256,033 and 6,072,494 provide for a computer-implemented gesture recognition system. These systems require that a background image model be created, by examining frames of an average background image, before the subject who will perform the gesture enters the image. The need for a background image reduces the practical applicability of the method, particularly because the background can change due to movement in the background or changes in lighting or shadows.
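The sensitivity of a background-model approach to lighting changes can be shown with a small sketch. This is a generic averaged-background subtraction written for illustration, not the algorithm of the cited patents; the function names, the threshold of 30 intensity levels, and the 4x4 test scene are assumptions.

```python
def average_background(frames):
    """Per-pixel mean of equally sized grayscale frames (lists of rows)."""
    n = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / n for x in range(width)]
            for y in range(height)]

def foreground_mask(frame, background, threshold=30):
    """True where a pixel differs from the background model by more than threshold."""
    return [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]

def count_foreground(mask):
    """Number of pixels flagged as foreground."""
    return sum(sum(row) for row in mask)

# Hypothetical 4x4 scene: a uniform background captured before the subject enters.
empty = [[100] * 4 for _ in range(4)]
background = average_background([empty, empty, empty])

# The subject enters: two bright "hand" pixels.
with_hand = [row[:] for row in empty]
with_hand[1][1] = with_hand[1][2] = 200

# The same scene after the overall lighting brightens by 50 levels.
brighter = [[p + 50 for p in row] for row in with_hand]

hand_pixels = count_foreground(foreground_mask(with_hand, background))
after_lighting_change = count_foreground(foreground_mask(brighter, background))
```

Under the original lighting only the two hand pixels are flagged. After the lighting change every pixel exceeds the threshold, so the stored background model no longer separates the subject from the scene.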
U.S. Pat. No. 6,222,465 describes a system and method for manipulating virtual objects in a virtual environment, for drawing curves and ribbons in the virtual environment, and for selecting and executing commands for creating, deleting, moving, changing, and resizing virtual objects in the virtual environment using intuitive hand gestures and motions. The system is provided with a display for displaying the virtual environment and with a conceptual description of a video gesture recognition subsystem for identifying motions and gestures of a user's hand.
U.S. Pat. No. 6,204,852 describes a video gesture-based three-dimensional computer interface system that uses images of hand gestures to control a computer. The system tracks motion of the user's hand or an elongated object or a portion thereof in a three-dimensional coordinate system with five degrees of freedom. The system contains multiple cameras. These cameras are not used to obtain a depth image of the scene. Instead, each camera image is processed independently, and the finger is located in the image from each camera. The position of the finger is then determined from the geometry between the cameras and the finger's location in each image. The orientation of the finger is determined in a similar manner. The method is intended for applications that use a pointing finger. Furthermore, if multiple fingers are used in the gesture, the method may not be able to unambiguously determine the corresponding fingers in each image.
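Recovering a fingertip position from its location in two camera images can be sketched as follows. The cited patent does not specify this camera arrangement; the sketch assumes two identical, parallel pinhole cameras separated along the x axis, with image coordinates measured from each image center. The function names, focal length, and baseline are hypothetical.

```python
def project(point, focal_px, camera_x=0.0):
    """Project a 3D point (x, y, z) into a pinhole camera located at (camera_x, 0, 0)."""
    x, y, z = point
    return (focal_px * (x - camera_x) / z, focal_px * y / z)

def triangulate(left_px, right_px, focal_px, baseline):
    """Recover (x, y, z) from matched image points in the left and right cameras.

    Assumes parallel cameras, so depth follows from the horizontal disparity.
    """
    disparity = left_px[0] - right_px[0]
    if disparity <= 0:
        raise ValueError("matched points must have positive disparity")
    z = focal_px * baseline / disparity
    x = left_px[0] * z / focal_px
    y = left_px[1] * z / focal_px
    return (x, y, z)

# Hypothetical setup: 500-pixel focal length, cameras 0.2 m apart.
FOCAL_PX = 500.0
BASELINE_M = 0.2
fingertip = (0.1, -0.05, 0.8)

left = project(fingertip, FOCAL_PX)
right = project(fingertip, FOCAL_PX, camera_x=BASELINE_M)
recovered = triangulate(left, right, FOCAL_PX, BASELINE_M)
```

The sketch takes for granted that the two image points belong to the same fingertip. When several fingers are visible, each detection in one image could be paired with more than one detection in the other, and different pairings generally produce different three-dimensional points.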
U.S. Pat. No. 5,781,663 describes an image recognition apparatus that operates in three modes. A gesture recognition mode is used to recognize an input locus as a command, a figure recognition mode is used to recognize a figure, and a character recognition mode is used to recognize a character.
U.S. Pat. Nos. 5,454,043, 6,002,808, and 5,594,469 each provide a gesture recognition framework using intensity images. The patents illustrate the use of moments and frequency histograms for image representation and recognition. The algorithms described therein rely on the edges of the hands in the intensity images, and therefore the described systems are strongly affected by ambient conditions of the environment, such as a background that has a color similar to that of the skin. For instance, the system might misinterpret the edges when there is not enough illumination on the foreground (hand), so that edges between the foreground and background disappear.