The desirability of hands free interaction with and/or control of a system or device arises in many contexts. For many individuals, this often arises when the individual is performing a task that requires or benefits from the use of both hands. A common example is operating a vehicle, such as driving a car. But another example is a task in which the potential for contamination makes switching between the primary task and manual interaction with the electronic device undesirable, such as in performing surgery or food preparation. Voice commands are often used in such situations, for example, to control a smartphone while driving, or to control a surgical robot. Voice commands have significant limitations, including limited bandwidth, degraded performance in noisy environments, and in some applications or circumstances, a loss of privacy. In some applications, such as law enforcement or military, speech may disclose the presence of the speaker.
For individuals with physical impairments and/or amputations, it may be impossible to use one's hands to interact with a system. In such cases, hands free interaction is required. For example, individuals with amyotrophic lateral sclerosis (ALS) may have little or no voluntary control of their hands, feet, or limbs. Physical impairments may also prevent an individual from effectively issuing voice commands. A communication system that does not rely on one's hands or voice would be an effective means of input to an electronic device or system in all these cases.
The desire to interact with an electronic system when a visual display is not in view also arises in many of these contexts. For example, it may be unsafe to look at a visual display while operating various kinds of vehicles or machinery. For an individual who is in bed, it may be impractical or undesirable to keep a visual display in sight at all times. Audio rather than visual output from the system may be desirable in such contexts.
The ALS patient would be a very compelling beneficiary of such a system and method. Although the ALS patient population is modest in size, the disease is eventually completely debilitating and patients rely heavily on technology solutions for many needs including communication. Currently available systems provide some communication capabilities, but they generally have significant limitations in terms of physical intrusiveness and the need for frequent action on the part of a caregiver to keep the systems available to the patient.
Each year approximately 5,000 people in the U.S. are diagnosed with ALS. ALS is a progressive neurological disease that is invariably fatal. It is one of the most common neuromuscular diseases, and eventually leaves patients without the ability to control voluntary movements. Ultimately, it affects the diaphragm and other muscles in the chest, causing patients to lose the ability to breathe. The loss of breath support, combined with weakness in the palate, lips, and tongue, typically causes ALS patients to lose the ability to speak. It also leads to loss of the use of limbs, and even an inability to move the head, which, combined with loss of speech ability, make communication extremely difficult. ALS does not typically affect a patient's intellect; the combination of mental awareness and the inability to communicate is a very difficult aspect of the disease for many patients.
A number of Augmentative and Alternative Communication (AAC) systems have been developed to help ALS patients communicate. These include simple, low-tech solutions, such as communication boards that might include common phrases as well as a depiction of the alphabet, and that are typically navigated with the help of an assistant who points to the various options and watches for a signal from the patient to indicate a desired selection. Mechanical switches can also be configured to allow a patient to signal for assistance using any remaining control of limbs, head, or even eyebrows or cheeks. More technologically-advanced solutions are frequently based around a computer, and provide the patient with alternate input devices for interacting with the computer. Such devices include trackballs, joysticks, touchscreens, and head mice, with the choice of appropriate device depending on the degree of physical impairment.
In the late stages of ALS, nearly all voluntary muscle control can be lost, but oculomotor control is frequently retained. This makes eye tracking and blink detection systems the leading input modalities for these patients. The appropriate input device is typically combined with specialized software to facilitate communication. Such software can employ a variety of strategies, including efficient access to commonly-used phrases through menus or abbreviations, suggesting likely word completions to reduce the number of keystrokes that are required, and scanning through selections so that a patient can simply activate a binary switch at an appropriate time to select the desired option.
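By way of illustration only, the scanning strategy described above can be sketched as follows. The function name `scan_select`, the fixed one-option-per-step highlight, and the representation of the binary switch as a sequence of boolean samples are assumptions made for this sketch and are not drawn from any particular AAC product.

```python
from typing import Iterable, Optional, Sequence


def scan_select(options: Sequence[str],
                switch_states: Iterable[bool]) -> Optional[str]:
    """Return the option highlighted when the binary switch is first activated.

    The highlight advances by one option per time step and wraps around to the
    first option after the last. `switch_states` holds one sample of the switch
    per time step (True means activated). Both are hypothetical simplifications.
    """
    if not options:
        return None
    for step, activated in enumerate(switch_states):
        if activated:
            # The option under the highlight at this step is the selection.
            return options[step % len(options)]
    # The switch was never activated, so nothing is selected.
    return None


# The patient waits through "yes" and "no", then activates on "help".
choice = scan_select(["yes", "no", "help"], [False, False, True])
```

In such a scheme the patient needs only one reliable binary signal, at the cost of waiting for the highlight to reach the desired option.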
Despite the relatively wide range of available technology, important capability gaps exist. Systems utilizing visual displays, and particularly those using eye tracking, only function when the patient is positioned in front of them. It is not practical for a patient to always be positioned in front of the device, yet a patient needs the ability to communicate at all times. Various types of mechanical and non-contact switches can be used, but they rely on a caregiver to consistently place them every time a patient is moved, and such devices may also be physically intrusive. Particularly when a patient has gone to bed to sleep, providing them with a robust, persistent communication capability is challenging with existing technology. Effective communication devices are particularly important in this scenario because: (1) a caregiver will likely not be close at hand to recognize the patient's needs; and (2) night-time shifts are often staffed by less experienced personnel, who may have more difficulty inferring a patient's needs.
A number of devices have been proposed for the purposes of monitoring eye movements and detecting blinks. Many of these are wearable devices, because wearable devices provide a well-defined geometry of sensors, and in some cases illuminators, relative to the eyes. Wearable devices have certain drawbacks, including the potential for discomfort and inconvenience from having to wear a device, the potential for damage to the device when the user is moved, the need for either a tether to provide power or a battery on board the wearable device that must be recharged, and, in healthcare applications, the need to rely on a caregiver to properly position and adjust the wearable device and to maintain the device (charging batteries, etc.).
A second class of devices that have been proposed, which may not be wearable, are designed to function with a user who is positioned or restricted within a fairly small range of allowable positions, and whose pose must conform to a fairly small range of allowable poses, typically directly facing the sensors with illuminators positioned in particular orientations with respect to the known pose of the user. Although some efforts have attempted to address the need for robustness to pose variations, the range of poses that such conventional devices accommodate is still limited and, because of their common reliance on hard cascades with sequential single stage criteria, such devices may respond inappropriately to a captured image. A captured image of features including a region of interest may be rejected by the hard cascade if it does not satisfy the single stage criteria. Conversely, images not including a region of interest that incidentally do meet the single stage criteria may be incorrectly accepted by a hard cascade. Conventional devices do not address the problem of determining whether or not an acceptable image of a facial region of interest is present in the field of view of any one of multiple imaging sensors covering a range of poses, and, if so, determining which subset of sensors provides the best information. Further, because such approaches have been directed to overcoming the challenges of increasing the range of poses, such approaches have narrowly addressed limited and prominent facial regions.
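By way of illustration only, the two failure modes of a hard cascade noted above can be sketched as follows. The stage functions, the feature names `contrast` and `symmetry`, the thresholds, and the numeric scores are all hypothetical values chosen for this sketch; they do not describe any particular conventional device.

```python
from typing import Callable, Dict, Sequence

Features = Dict[str, float]
Stage = Callable[[Features], float]


def hard_cascade(features: Features,
                 stages: Sequence[Stage],
                 thresholds: Sequence[float]) -> bool:
    """Accept only if every stage score meets its own threshold, in order."""
    for stage, threshold in zip(stages, thresholds):
        if stage(features) < threshold:
            # Rejected at this stage; later stages are never evaluated.
            return False
    return True


# Two hypothetical single stage criteria, each with its own threshold.
stages = [lambda f: f["contrast"], lambda f: f["symmetry"]]
thresholds = [0.5, 0.5]

# A genuine region of interest seen off-axis: weak on the first criterion,
# very strong on the second.
eye_off_axis = {"contrast": 0.45, "symmetry": 0.95}

# A background patch that barely clears both criteria.
background = {"contrast": 0.55, "symmetry": 0.52}

rejected_eye = not hard_cascade(eye_off_axis, stages, thresholds)
accepted_background = hard_cascade(background, stages, thresholds)
```

In this sketch the genuine region is rejected at the first stage even though its later evidence is strong, while the background patch is accepted because it marginally satisfies each criterion in turn.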
Existing approaches have commonly been closely proximate to the user, with the closest embodiments being wearable devices. As discussed, very close devices are intrusive and can obstruct or limit the poses of the user.
It would be desirable to implement a non-intrusive, stand-off assistive communication system having the capabilities for users to call for assistance and to communicate basic requests effectively using a sequence of facial expressions.
In various other applications, it would be desirable to provide a system and method in which facial expressions are used to control, and to provide input to, an electronic device. Apart from assistive technology, such a system and method may find ready use by operators of vehicles, medical devices, head mounted computer systems, etc., to mention just a few of the many possible application areas.