1. Field of the Invention
The present invention relates to computer interfaces, and more particularly to a real-time gesture interface for use in medical visualization workstations.
2. Discussion of the Prior Art
In many environments, traditional hands-on user interfaces for interacting with a computer, for example, a mouse and keyboard, are not practical. One example of such an environment is an operating theater (OT), where there is a need for strict sterility. A surgeon, and everything coming into contact with his/her hands, must be sterile. Therefore, the mouse and keyboard may be excluded from consideration as an interface because they may not be sterilized.
A computer may be used in the OT for medical imaging. The interaction can include commands to display different images, scrolling through a set of two-dimensional (2D) images, changing imaging parameters (window/level), etc. With advances in technology, there is a growing demand for three-dimensional (3D) visualizations. Interacting with and manipulating 3D models is intrinsically more complicated than for 2D models, even if a mouse and keyboard can be used, because the commands may not be intuitive when working in 3D. Examples of commands in a 3D medical data visualization environment include rotations and translations, including zoom.
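By way of illustration only, the window/level imaging parameters mentioned above may be understood as a linear mapping from raw image intensities to display gray values. The following is a minimal sketch under stated assumptions: the function name `apply_window_level`, the 0 to 255 output range, and the representation of an image as nested lists are hypothetical choices made for this example and are not taken from any particular workstation.

```python
def apply_window_level(pixels, window, level, out_max=255):
    """Map raw intensities to display gray values with a linear ramp.

    `window` is the width of the displayed intensity range and `level`
    is its center. All names and the 0..out_max range are assumptions
    made for illustration.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    low = level - window / 2.0    # at or below this value, output is 0
    high = level + window / 2.0   # at or above this value, output is out_max
    result = []
    for row in pixels:
        out_row = []
        for value in row:
            if value <= low:
                out_row.append(0)
            elif value >= high:
                out_row.append(out_max)
            else:
                # Linear interpolation inside the window.
                out_row.append(round((value - low) / window * out_max))
        result.append(out_row)
    return result


display = apply_window_level([[-100, 25, 200]], window=100, level=50)
```

In this sketch, narrowing `window` increases displayed contrast within the selected range, and changing `level` shifts which intensities fall inside that range.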
Approaches to human-machine interaction in the OT include, for example, voice recognition and gesture recognition. There are several commercial voice recognition systems available. In the context of the OT, their advantage is that the surgeon can continue an activity, for example, a suture, while commanding the imaging system. However, the disadvantage is that the surgeon needs to mentally translate geometric information into language: e.g., “turn right”, “zoom in”, “stop”. These commands need to include some type of qualitative information. Therefore, it can be complicated and tiresome to achieve a specific 3D orientation. Other problems related to voice recognition are that it may fail in a noisy environment, and the system may need to be trained for each user.
Researchers have attempted to develop systems that can provide a natural, intuitive human-machine interface. Efforts have been focused on the development of interfaces without mouse or device based interactions. In the OT, the need for sterility warrants the use of novel schemes for human-machine interfaces for the doctor to issue commands to a medical imaging workstation.
Gesture recognition includes two sequential tasks: feature detection/extraction and pattern recognition/classification. A review of visual interpretation of hand gestures can be found in V. I. Pavlovic, R. Sharma, and T. S. Huang, “Visual interpretation of hand gestures for human-computer interaction: A review”, IEEE Transactions on Pattern Analysis and Machine Intelligence, 19(7):677–695, July 1997.
For feature detection/extraction, applications may use color to detect human skin. An advantage of a color-based technique is real-time performance. However, the variability of skin color in varying lighting conditions can lead to false detection. Some applications use motion to localize the gesture. A drawback of a motion cue approach is that assumptions may be needed to make the system operable, e.g., a stationary background and one active gesturer. Other methods, such as using data-gloves/sensors to collect 3D data, may not be suitable for a human-machine interface because they are not natural.
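By way of illustration only, a color-based skin detector of the kind discussed above may be sketched as a per-pixel rule on red, green, and blue values. The thresholds and the names `is_skin` and `skin_mask` below are assumptions chosen for this example; they are not taken from any of the applications referred to above. The sketch also shows the lighting sensitivity noted above: the same surface, when dimly lit, may no longer satisfy fixed thresholds.

```python
def is_skin(r, g, b):
    """Return True if an 8-bit RGB pixel satisfies a fixed skin-color rule.

    The thresholds are illustrative assumptions, not tuned values.
    """
    return (
        r > 95 and g > 40 and b > 20
        and max(r, g, b) - min(r, g, b) > 15   # reject near-gray pixels
        and abs(r - g) > 15
        and r > g and r > b                    # red channel dominates
    )


def skin_mask(image):
    """Apply the rule to every pixel of an image given as rows of (r, g, b)."""
    return [[is_skin(r, g, b) for (r, g, b) in row] for row in image]


bright_skin = (200, 140, 120)
dim_skin = (80, 56, 48)        # the same color scaled to 40% brightness
mask = skin_mask([[bright_skin, (30, 60, 200)],
                  [dim_skin, (255, 255, 255)]])
```

Because the rule is a handful of comparisons per pixel, it is inexpensive enough for real-time use. In this example the dimly lit pixel fails the red-channel threshold, so a missed detection results from the change in lighting alone.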
For pattern recognition and classification, several techniques have been proposed. The hidden Markov model (HMM) is one such method. HMMs can be used, for example, for the recognition of American Sign Language (ASL). Another approach uses motion-energy images (MEI) and motion-history images (MHI) to recognize gestural actions. Computational simplicity is the main advantage of such a temporal template approach. However, motion of unrelated objects may be present in the MHI.
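By way of illustration only, the temporal template approach referred to above may be sketched as follows. In a motion-history image, a pixel at which motion is currently detected is set to a duration value, here called `tau`, and all other pixels decay by one per frame; the motion-energy image marks every pixel with nonzero history. The use of simple frame differencing to detect motion, the `threshold` value, and all function names are assumptions made for this example.

```python
def motion_mask(prev, curr, threshold=20):
    """Mark pixels whose gray value changed by more than `threshold`."""
    return [[abs(c - p) > threshold for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev, curr)]


def update_mhi(mhi, mask, tau):
    """Set moving pixels to tau; let all other pixels decay toward zero."""
    return [[tau if moving else max(0, h - 1)
             for h, moving in zip(h_row, m_row)]
            for h_row, m_row in zip(mhi, mask)]


def motion_energy(mhi):
    """Binary image of every pixel with motion in the last tau frames."""
    return [[h > 0 for h in row] for row in mhi]


def build_mhi(frames, tau, threshold=20):
    """Accumulate a motion-history image over a list of gray-level frames."""
    height, width = len(frames[0]), len(frames[0][0])
    mhi = [[0] * width for _ in range(height)]
    for prev, curr in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, motion_mask(prev, curr, threshold), tau)
    return mhi


# A bright spot moving left to right across a one-row image.
frames = [[[255, 0, 0, 0]], [[0, 255, 0, 0]],
          [[0, 0, 255, 0]], [[0, 0, 0, 255]]]
mhi = build_mhi(frames, tau=3)
mei = motion_energy(mhi)
```

In this sketch, larger values in `mhi` correspond to more recent motion, so the direction of movement is encoded in the gradient of the image. The update involves only a comparison and a decrement per pixel, which reflects the computational simplicity noted above. Any pixel change is recorded regardless of its source, which is why motion of unrelated objects may appear in the result.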
Neural networks are another tool used for recognition. In particular, a time-delay neural network (TDNN) has demonstrated the capability to classify spatio-temporal signals. A TDNN can also be used for hand gesture recognition. However, a TDNN may not be suitable for some environments, such as an OT, wherein the background can include elements contributing to clutter.
Therefore, a need exists for a system and method for a real-time interface for medical workstations.