Currently, the preeminent user interface mechanism for control over user computing devices (such as smartphones, tablets, laptops and desktop systems) is the graphical user interface, which is often deployed together with a pointing-based or touch-based user interface. While the graphical user interface offers a convenient and understandable interface to the computing device's underlying functions, thanks to its desktop metaphor, the fact remains that the human-machine interface is distinctly different from natural interpersonal communication. Even the use of touch-based control requires some amount of user training so that the user learns to correlate touches and gestures with the commands controlling the device.
Interpersonal communication is largely speech- and gesture-based, with speech and gesture or context being received concurrently by the listener. To date, there has been limited research on concurrent speech and gesture processing, and generally, the approach has been focused on receiving concurrent input, but then combining the speech and gesture only after each of the speech and gesture inputs has been separately processed.
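The late-combination approach described above can be sketched as follows. This is a minimal illustration only; the function names, command vocabulary, and confidence values are hypothetical placeholders, not taken from any particular system:

```python
from dataclasses import dataclass


@dataclass
class Interpretation:
    """Result of processing a single input modality in isolation."""
    command: str
    confidence: float


def process_speech(utterance: str) -> Interpretation:
    # Hypothetical stand-in for a full speech recognition pipeline.
    vocabulary = {
        "open that": Interpretation("OPEN", 0.9),
        "delete it": Interpretation("DELETE", 0.8),
    }
    return vocabulary.get(utterance, Interpretation("UNKNOWN", 0.0))


def process_gesture(gesture: str) -> Interpretation:
    # Hypothetical stand-in for a gesture classification pipeline.
    gestures = {"point_at_icon": Interpretation("SELECT_TARGET", 0.85)}
    return gestures.get(gesture, Interpretation("UNKNOWN", 0.0))


def combine(speech: Interpretation, gesture: Interpretation) -> str:
    # The two modalities are merged only here, after each has been
    # independently and fully processed (the "late combination" pattern).
    if speech.command == "OPEN" and gesture.command == "SELECT_TARGET":
        return "OPEN the object the user pointed at"
    return "no combined interpretation"


result = combine(process_speech("open that"), process_gesture("point_at_icon"))
```

Note that in this pattern neither recognizer can use the other modality to resolve ambiguity during recognition; the speech and gesture streams meet only at the final combination step, which is the limitation the passage above alludes to.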