Computing and communication devices, including mobile phones, have changed substantially in the last few decades. The emergence of personal computing in the late 1970s, including both personal software (productivity applications, such as text editors and spreadsheets, and interactive computer games) and personal computer platforms (operating systems, programming languages, and hardware), made everyone in the world a potential computer user. Human interaction with computers has come a long way, from keyboard and mouse to touch screen and now to hand gestures.
Hand gestures have always been a powerful human-to-human communication modality. The expressiveness of hand gestures also allows for the altering of perceptions in human-computer interaction. Gesture recognition allows users to perceive their bodies as an input mechanism, without having to rely on the limited input capabilities of their devices. Possible applications of gesture recognition as ubiquitous input on a mobile phone include interacting with large public displays or TVs (without requiring a separate workstation) as well as personal gaming with LCD video glasses.
The prior art relates to the ways a human can interact with a computer (such as a wearable or mobile device) using the hands. Hand gestures are a natural way to communicate, and in fact some information can be conveyed by hand signs faster and more simply than in any other way. As an example, major auction houses use hand gestures for bidding in multi-million auctions. Thus it seems natural that, when you see information in front of you, you can manipulate it with your hands.
Many gesture recognition algorithms have been implemented, such as algorithms based on the color of the hand in the HSV (hue, saturation, value) color space: Dadgostar, Farhad, and Abdolhossein Sarrafzadeh. “An adaptive real-time skin detector based on Hue thresholding: A comparison on two motion tracking methods.” Pattern Recognition Letters 27, no. 12 (2006): 1342-1352. Mittal, Arpit, Andrew Zisserman, and Philip Torr. “Hand detection using multiple proposals.” (2011).
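As a rough illustration of color-based detection, the following minimal sketch classifies pixels as skin by thresholding their hue in HSV space. It is not the method of either cited paper: the function names `is_skin` and `skin_mask` and the threshold constants are hypothetical, chosen only for illustration, and a practical detector would adapt the thresholds to the user and the lighting.

```python
import colorsys

# Assumed, illustrative thresholds (not taken from the cited papers).
# Hue is expressed as a fraction of the full hue circle, as colorsys returns it.
HUE_MIN, HUE_MAX = 0.0, 50.0 / 360.0
SAT_MIN, VAL_MIN = 0.15, 0.20


def is_skin(r, g, b):
    """Return True if an 8-bit RGB pixel falls inside the assumed skin band."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return HUE_MIN <= h <= HUE_MAX and s >= SAT_MIN and v >= VAL_MIN


def skin_mask(image):
    """Map a 2-D list of (r, g, b) tuples to a 2-D list of booleans."""
    return [[is_skin(*pixel) for pixel in row] for row in image]
```

A fixed hue band of this kind is sensitive to exposure, lighting, and skin color, and it fails outright when the user wears gloves.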
Others have also identified ways to extract hands from the background using the convex hull and convexity defects; with a static camera (such as one mounted on a robot), recognition of the hands is possible: Pulkit, Kathuria, and Yoshitaka Atsuo. “Hand Gesture Recognition by using Logical Heuristics.” HCI, 2012, no. 25 (2012): 1-7. Wang, Chieh-Chih, and Ko-Chih Wang. “Hand Posture recognition using Adaboost with SIFT for human robot interaction.” In Recent progress in robotics: viable robotic service to human, pp. 317-329. Springer Berlin Heidelberg, 2008.
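To make the hull-based idea concrete, the sketch below computes the convex hull of a hand outline and, for each hull edge, the contour point that lies deepest inside it (a convexity defect). Deep defects are commonly read as the valleys between extended fingers. This is a generic sketch, not the procedure of the cited works: it assumes the contour is already available as an ordered list of (x, y) points, and the helper names and the `min_depth` parameter are hypothetical.

```python
import math


def cross(o, a, b):
    """2-D cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in a consistent winding order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def convexity_defects(contour, min_depth=1.0):
    """Return (start, end, deepest_point, depth) for each sufficiently deep defect.

    `contour` must be an ordered list of (x, y) points along the outline.
    """
    hull = set(convex_hull(contour))
    hull_idx = [i for i, p in enumerate(contour) if p in hull]
    n = len(contour)
    defects = []
    for k in range(len(hull_idx)):
        i, j = hull_idx[k], hull_idx[(k + 1) % len(hull_idx)]
        a, b = contour[i], contour[j]
        length = math.hypot(b[0] - a[0], b[1] - a[1])
        if length == 0:
            continue
        best_depth, best_point = 0.0, None
        m = (i + 1) % n
        while m != j:  # walk the contour points between two hull vertices
            depth = abs(cross(a, b, contour[m])) / length
            if depth > best_depth:
                best_depth, best_point = depth, contour[m]
            m = (m + 1) % n
        if best_point is not None and best_depth >= min_depth:
            defects.append((a, b, best_point, best_depth))
    return defects
```

For example, the notched outline `[(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]` has a four-vertex hull and a single defect at `(2, 1)`. The approach presumes a clean hand outline, which is why it is usually paired with a static camera and a stable background.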
Another method relies on face detection (which is not useful when the camera sits on the user's body, for example on the shoulder, the top of the head, a pocket, or glasses): Dardas, Nasser H., and Nicolas D. Georganas. “Real-time hand gesture detection and recognition using bag-of-features and support vector machine techniques.” Instrumentation and Measurement, IEEE Transactions on 60, no. 11 (2011): 3592-3607.
Conventionally, commercial systems such as Microsoft Kinect™ use stereo vision combined with infrared light. A light-emitting diode (“LED”) emits invisible light at a specific frequency, and two cameras, set a small distance apart, capture the image at that exact frequency. Because an object close to the camera reflects significantly more light than the objects behind it, it is easy to separate foreground objects from the background and hence to recognize the hands. In addition, the two cameras capture two images, which are overlaid to give a precise distance for each point of an object, providing a 3D picture. Such a system offers superior recognition, but it has drawbacks: extra energy usage, larger size, and higher cost. Another approach seen in some systems is to use special sensors (such as proximity or movement sensors, or sensors that rely on a still background) that capture movement and translate it into commands. These sensors can be on the user, inside the clothes, or in the proximity of the user, for instance on a desk near the user. These systems are complex to set up and expensive in terms of both the cost of materials and energy usage.
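For the two-camera arrangement described above, distance is typically recovered by triangulation: for rectified cameras, depth equals focal length times baseline divided by disparity (the horizontal shift of a point between the two images). The sketch below illustrates that relation and a simple depth cutoff for foreground extraction. It is a generic illustration, not the internals of any commercial product; the function names and the example parameters are assumptions.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in meters of a point seen by two rectified cameras.

    disparity_px    : horizontal shift of the point between the images (pixels)
    focal_length_px : camera focal length (pixels)
    baseline_m      : distance between the two cameras (meters)
    """
    if disparity_px <= 0:
        return float("inf")  # no measurable shift: treat as infinitely far
    return focal_length_px * baseline_m / disparity_px


def foreground_mask(disparity_map, focal_length_px, baseline_m, max_depth_m):
    """Mark pixels closer than max_depth_m as foreground (True)."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) <= max_depth_m
         for d in row]
        for row in disparity_map
    ]
```

With an assumed focal length of 500 pixels and a baseline of 0.1 m, a disparity of 50 pixels corresponds to a depth of about 1 m, so a hand held in front of the cameras is separated from a distant background by a simple depth threshold.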
Hence, there exists a need for a system and method that detects where the user's hands are, interprets hand gestures in real time, and is inexpensive. There is also a need for a system that is robust to variations in the user's environment and appearance, such as exposure, lighting, background color, backlighting, differences between users' hands, skin color, and the wearing of gloves.