I. Field of the Invention
This invention pertains to a method and apparatus for inputting commands to a computer using hand signals. More particularly, the present invention relates to a video gesture-based computer interface wherein images of hand gestures are used to control a computer and wherein motion of the user's hand or a portion thereof is tracked in a three-dimensional coordinate system with ten degrees of freedom.
II. Description of the Related Art
Various types of computer control and interface devices exist for inputting commands to a computer. Such devices may for example take the form of a computer mouse, joystick or trackball, wherein a user manipulates the interface device to perform a particular operation such as to select a specific entry from a menu of options, perform a "click" or "point" function, etc. A significant problem associated with such interface devices is that a surface area is needed for placement of the device and, in the case of a mouse, to accommodate device movement and manipulation. In addition, such interface devices are generally connected by a cable to a computer CPU with the cable typically draped across the user's desk, causing obstruction of the user's work area. Moreover, manipulation of such interface devices is not consistent with common communication movements; for example, a user must maneuver a mouse until the cursor rests on a desired menu entry rather than simply use a pointing finger hand gesture to select that entry. As a result, a user must become comfortable and familiar with the operation of the particular interface device before proficiency in use may be attained.
To address these drawbacks, a video interface system for enabling a user to utilize hand gestures to issue commands to a computer has been developed and is described in the commonly assigned U.S. patent application entitled "Video Hand Image Computer Interface", Ser. No. 08/887,765 of Segen, filed Jul. 3, 1997 (hereinafter "Segen"), which is hereby incorporated herein by reference in its entirety. The Segen system, by way of preferred example, utilizes a video camera or other video input device connected to an image processing computer, with the camera positioned to receive images of an object such as a user's hand. Hand images from the camera are converted to a digital format and input to the computer for processing. The image processing computer attempts to recognize predetermined hand gestures in each image and acts upon the recognized gestures as computer commands. The results of the processing and attempted recognition of each image are sent to an application or the like for performing various functions or operations.
However, the use of both traditional two-dimensional input devices and the Segen system is problematic in advanced computer-based three-dimensional object selection and manipulation applications. In such applications, a virtual three-dimensional environment is typically displayed to the user with one or more displayed virtual objects and command menus positioned within the virtual environment. The user may delete, move and otherwise change the objects in the virtual environment or create new objects. The user may also select various commands from the command menus. Other functions may be performed in the virtual environment such as, for example, drawing curves. Traditional input devices are extremely difficult to use in such a virtual environment because traditional devices control only two degrees of freedom, and thus a combination of several input devices is required to control three or more degrees of freedom as is necessary in three-dimensional applications. Such a combination control scheme is cumbersome, unintuitive and requires significant training on the user's part. The Segen system provides for three degrees of freedom, which is more than adequate for issuing commands, but not sufficient for use in some three-dimensional applications where interaction with three-dimensional objects is necessary. Advanced three-dimensional applications that utilize a virtual world environment displayed to the user require more degrees of freedom. In particular, an application may require the user to grasp, move, and otherwise manipulate three-dimensional virtual objects displayed to the user in the virtual world environment. To accomplish such complex tasks in a natural way, at least two of the user's fingers, such as the index finger and the thumb, must be independently tracked with five degrees of freedom for each finger.
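By way of illustration only, the degree-of-freedom count discussed above (five per finger, two fingers, ten in total) may be sketched as a simple data structure. The names FingerPose, HandState and degrees_of_freedom are hypothetical, and the particular choice of five quantities per finger (a three-dimensional position plus two orientation angles) is an assumption made for this sketch; the present section does not specify which five quantities are tracked.

```python
from dataclasses import dataclass, fields


@dataclass
class FingerPose:
    """One tracked finger with five degrees of freedom.

    Assumption for illustration: three position coordinates plus two
    orientation angles. The source text states only the count (five).
    """
    x: float
    y: float
    z: float
    azimuth: float    # hypothetical orientation angle, in radians
    elevation: float  # hypothetical orientation angle, in radians


@dataclass
class HandState:
    """Two independently tracked fingers, as described in the text."""
    index_finger: FingerPose
    thumb: FingerPose


def degrees_of_freedom(state: HandState) -> int:
    """Count the independent scalar quantities tracked across all fingers."""
    return sum(len(fields(getattr(state, f.name))) for f in fields(state))


if __name__ == "__main__":
    state = HandState(
        index_finger=FingerPose(0.10, 0.20, 0.50, 0.0, 0.3),
        thumb=FingerPose(0.08, 0.15, 0.52, 1.2, 0.1),
    )
    print(degrees_of_freedom(state))
```

Under these assumptions, each finger contributes five independent values, so a hand state holding the index finger and the thumb carries ten values in total, as compared with the two controlled by a traditional input device and the three provided by the Segen system.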
It would thus be desirable to provide a computer interface that enables common and intuitive hand gestures and hand motions to be used for interacting with a three-dimensional virtual environment. It would further be desirable to provide a system and method for tracking hand gestures and hand motions in a three-dimensional coordinate system with ten degrees of freedom.