Natural user interfaces (NUIs) have captured the imagination of many because they shift the paradigm of human-computer interaction away from the traditional mouse and keyboard and towards more expressive input modalities. Whilst the term is broad and can encompass touch, gesture, gaze, voice, and tangible input, NUI often implies exploiting the dexterity afforded by the many degrees of freedom (DoF) of the human hand.
With the advent of consumer depth cameras, many new systems that couple in-air interactions with surface-based interactions have appeared. Some depth cameras estimate depth by projecting dynamic patterns onto a scene and capturing images of the projected patterns with a stereo camera. Using pattern recognition, the camera system estimates depth from discrepancies (often called disparities) between the positions of recognized patterns in each pair of left and right input images captured by the stereo camera.
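The relationship between a positional discrepancy and depth can be illustrated with a minimal sketch. It assumes an idealised, rectified stereo pair and a pinhole camera model, neither of which is stated in the text above, in which case depth is focal length times baseline divided by the discrepancy in pixels. The function name, parameter names, and numeric values are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres of a scene point seen by a rectified stereo pair.

    Assumes a pinhole model: depth = focal_length * baseline / disparity.
    disparity_px    horizontal offset of the point between left and right images
    focal_length_px focal length expressed in pixels (hypothetical calibration value)
    baseline_m      distance between the two camera centres in metres
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to give a finite depth")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical calibration values, not taken from any particular camera.
near = depth_from_disparity(disparity_px=32.0, focal_length_px=512.0, baseline_m=0.0625)
far = depth_from_disparity(disparity_px=16.0, focal_length_px=512.0, baseline_m=0.0625)
print(near, far)  # a larger discrepancy corresponds to a closer point
```

Because depth is inversely proportional to the discrepancy, a one-pixel matching error matters far more for distant points than for near ones, which is one reason precision is a concern for such systems.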
To obtain such discrepancies, the camera system uses stereo image processing or stereo-matching algorithms to identify corresponding points in each pair of left and right input images captured by the stereo camera, that is, points in the two images that are projections of the same scene point.
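One simple form of stereo matching is window-based block matching, sketched below for a single image row. This is an illustrative assumption rather than the algorithm used by any particular camera: it assumes rectified images, so that a point at column x in the left image appears at column x - d in the right image, and it scores candidates with the sum of absolute differences (SAD). The function name, window size, search range, and pixel values are all hypothetical.

```python
def match_along_scanline(left_row, right_row, x_left, half_window=1, max_disparity=8):
    """Return the offset d (in pixels) whose right-image window best matches
    the left-image window centred on x_left, using SAD as the matching cost."""
    lo, hi = x_left - half_window, x_left + half_window + 1
    if lo < 0 or hi > len(left_row):
        raise ValueError("window falls outside the left row")
    patch = left_row[lo:hi]

    best_d, best_cost = None, float("inf")
    for d in range(max_disparity + 1):
        if lo - d < 0:
            break  # the candidate window would leave the right image
        candidate = right_row[lo - d:hi - d]
        cost = sum(abs(a - b) for a, b in zip(patch, candidate))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Toy intensity rows: the feature (9, 5, 7) sits 3 pixels further right in the left image.
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 0, 9, 5, 7, 0, 0]
print(match_along_scanline(left_row, right_row, x_left=6))  # 3
```

Even this toy version evaluates a full window for every candidate offset of every pixel, so the work grows with image size, window size, and search range.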
Stereo-matching algorithms, along with the subsequent computation of depth, may incur significant computational cost. Furthermore, to cope with movement in the scene, patterns need to be projected and imaged at high frame rates, which requires expensive hardware.
Researchers continue to look for new ways of reducing computational and procurement costs, whilst retaining or increasing precision.
The examples described below are not limited to implementations which solve any or all of the disadvantages of known natural user interface (NUI) technologies.