Traditionally, users have interacted with electronic devices (such as a computer or a television) or computing applications (such as computer games, multimedia applications, or office applications) via indirect input devices, including, for example, keyboards, joysticks, or remote controllers. The user manipulates the input device to perform a particular operation, such as selecting a specific entry from a menu of operations. Modern input devices, however, include multiple buttons, often in a complex configuration, to facilitate communication of user commands to the electronic devices or computing applications; operating these input devices correctly is often challenging for the user. Additionally, actions performed on an input device generally do not correspond in any intuitive sense to the resulting changes on, for example, a screen display controlled by the device. Input devices can also be lost, and the frequent experience of searching for misplaced devices has become a frustrating staple of modern life.
An alternative mode of interaction involves recognizing and tracking the intentional movement of a user's hand, body, or any other object as it performs a gesture, which the electronic device can interpret as user input or a command. For example, a motion-capture system can track the position of an object by acquiring one or more images of a spatial region that includes the object, and by panning or zooming the image-capture device as needed so that the object remains in the field of view.
Many sophisticated or nuanced gestures or motions, however, cannot easily be tracked, identified, or interpreted by these systems. A user can make large, broad gestures one moment and small, fine-tuning gestures the next. The capturing camera and/or its supporting system may not be able to react or reconfigure itself quickly enough to capture, or assign meaning to, both kinds of gestures in quick succession. If the camera is zoomed out, for example, it can miss the subtleties of small gestures, whereas if the camera is zoomed in, it can fail to capture larger motions that stray outside the field of view. A need therefore exists for systems and methods capable of responsively adjusting to gestures that rapidly change in scale.
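
The zoom trade-off described above can be illustrated with a rough calculation. The sketch below is not drawn from any particular motion-capture system; it assumes an idealized pinhole camera, and the function names (view_width_m, gesture_coverage) and the numeric values are hypothetical, chosen only for illustration. It estimates how many pixels a gesture of a given size spans, and whether the gesture fits within the field of view at all.

```python
import math

def view_width_m(distance_m, fov_deg):
    """Width of the scene visible at distance_m (idealized pinhole camera)."""
    return 2.0 * distance_m * math.tan(math.radians(fov_deg) / 2.0)

def gesture_coverage(amplitude_m, distance_m, fov_deg, image_width_px):
    """Estimate how a gesture of a given amplitude appears in the image.

    Returns whether the gesture fits inside the visible width and how many
    pixels it would span along the image's horizontal axis.
    """
    width = view_width_m(distance_m, fov_deg)
    span_px = amplitude_m * image_width_px / width
    return {"fits_in_view": amplitude_m <= width, "span_px": span_px}

# Hypothetical setup: hand 0.5 m from a 640-pixel-wide sensor.
SMALL_GESTURE_M = 0.01   # fine-tuning motion (assumed size)
LARGE_GESTURE_M = 0.60   # broad sweep (assumed size)

for label, fov in (("zoomed out", 90.0), ("zoomed in", 20.0)):
    small = gesture_coverage(SMALL_GESTURE_M, 0.5, fov, 640)
    large = gesture_coverage(LARGE_GESTURE_M, 0.5, fov, 640)
    print(label,
          "| small gesture spans %.1f px" % small["span_px"],
          "| large gesture fits:", large["fits_in_view"])
```

Under these assumed numbers, the wide field of view keeps the broad sweep in frame but leaves the small motion only a few pixels of displacement, while the narrow field of view resolves the small motion more finely but can no longer contain the sweep.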