Hand movements and hand signals are natural forms of human expression and communication. Applying this knowledge to human-computer interaction has led to the development of vision-based computer techniques that accept human gestures as computer input. Computer vision allows human gesture input systems to be implemented with the goal of capturing the unencumbered motions of a person's hands or body. Many of the vision-based techniques currently developed, however, involve awkward exercises requiring unnatural hand gestures and added equipment. These techniques can be complicated and bulky, and they decrease efficiency because the user must repeatedly move the hands away from standard computer-use locations.
Current computer input methods generally involve both text entry using a keyboard and cursor manipulation via a mouse or stylus. Repetitive switching between the keyboard and mouse decreases efficiency for users over time. Computer vision techniques have attempted to reduce the inefficiencies of human-computer input tasks by using hand movements as input. This approach would be most effective if detection occurred at common hand locations during computer use, such as the keyboard. Many current vision-based computer techniques use a pointed or outstretched finger as the input gesture. This gesture is difficult to detect at or near the keyboard because the pointing gesture resembles natural hand positioning during typing.
Most current computer vision techniques use gesture detection and tracking paradigms for sensing hand gestures and movements. These paradigms are complex, relying on sophisticated pattern recognition techniques to recover the shape and position of the hands. Detection and tracking are limited by several factors, including difficulty in achieving reasonable computational complexity, problems with actual detection due to ambiguities in human hand movements and gesturing, and a lack of support for techniques allowing more than one user interaction.