Conventional human-to-computer interfaces include hardware control system interfaces, such as keyboards, mice, remote controls, pads, touch screens and pointing devices. With such interfaces, a physical action needs to be performed on the hardware device itself, for example, touching, moving, holding, pointing, pressing or clicking, or even a plurality of these actions together, sequentially or simultaneously, in a way enabled by these device interfaces, so that control commands, such as triggered binary events or continuous values, can be sent to the computer system with which the interface is intended to interact.
The computer system often comprises a graphical user interface (GUI) having windows, buttons and other items or elements, together termed the parameters, which are displayed on screens to provide visual feedback to a user as a function of the control commands triggered and executed. Such GUIs are designed in accordance with the usability and ergonomics of conventional human-to-computer hardware interfaces and with respect to the two-dimensional capabilities of mainstream display systems. For instance, operating systems have basically two-dimensional GUI windows which often comprise scroll bars for enabling navigation within media content, such as a map, an image or a text box, the size of which may be larger than the display area delimited by the display screen itself. Interaction with the scroll bars is optimized for using a wheel on a mouse hardware device, or for combining motion of the mouse cursor with a holding click action. In addition, conventional GUIs often comprise two-dimensional buttons on which a user clicks with mouse buttons for zooming into and out of the content of the GUI when the mouse cursor representation is pointing at the specifically determined button area.
Moreover, conventional two-dimensional GUIs may also comprise map navigation GUI interactions which usually require a click combined with a continuous mouse movement to make the map scroll as a function of the mouse movement or to change from one map area to another.
More recently, conventional two-dimensional GUIs have been developed in order to be operated by touch and/or multi-touch control interfaces, such as multi-touch enabled surfaces and display screens. The control commands of these second-generation, touch-gesture based interfaces have been designed to enable a user to interact, click, scroll or zoom in and out using at least one portion of at least one hand, for example, a finger, and may be based on different kinds of hardware technologies, such as capacitive, resistive, infra-red grid, optical imaging, dispersive signal or acoustic wave based technologies.
Even more recently, a third generation of control system interfaces has become available. This generation comprises contactless interaction systems. These systems may also be based on a capacitive motion-tracking sensor and comprise a system including electrodes and interface electronics. The main advantages of such capacitive sensors over existing control systems are their low power consumption, seamless integration and low cost. However, capacitive sensors only enable very close range contactless interactions, for example, within a distance of between 0 cm and 10 cm from the plane of the electrodes, and are only capable of distinguishing and tracking a very limited number of points of interest or extremities at the same time, such as human fingers, typically only one or two. These capacitive motion-tracking sensors are commonly associated with another interaction system from the first or second generation of control interfaces, such as a touch screen system, in order to enable both touch and touch-less or contact-less gesture interactions. However, such sensors are not complementary enough to be used efficiently for combining touch and touch-less three-dimensional gesture recognition where control gestures are performed in the air by a user, for example, using both hands and a plurality of fingers, for example six, at varying distances of between 0 cm and 150 cm from an interaction surface.
These third-generation contactless interaction systems may also be based on an imaging system, for example, two-dimensional or three-dimensional camera devices, for sequentially capturing images of a scene with respect to time, together with a method for determining three-dimensional gestures performed by a user within the captured scene. Such contactless interaction systems are suitable for use in combination with existing conventional hardware interfaces, such as touch-screen displays, or, optionally, alone, by triggering the same control commands as said conventional hardware interfaces but from a set of three-dimensional gestures, namely static poses or dynamic gestures, recognized within the sequentially captured images of the scene.
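As a purely illustrative sketch of how such an imaging-based system might map recognized static poses onto the control commands of a conventional interface, the fragment below assumes a hypothetical per-frame hand descriptor (number of extended fingers, pinch distance) produced by an upstream tracker, and requires a pose to be stable over several consecutive frames before triggering a command; the pose names, thresholds and command mapping are invented for illustration and are not taken from the cited systems.

```python
from collections import deque

# Hypothetical mapping from static poses to control commands
POSE_COMMANDS = {"open_palm": "STOP", "pinch": "SELECT", "fist": "GRAB"}

def classify_pose(extended_fingers, pinch_distance=None):
    """Classify one frame's hand descriptor into a static pose."""
    if pinch_distance is not None and pinch_distance < 20.0:  # assumed mm threshold
        return "pinch"
    if extended_fingers >= 4:
        return "open_palm"
    if extended_fingers == 0:
        return "fist"
    return "unknown"

class StaticPoseRecognizer:
    """Trigger a command only when the same pose persists for `window` frames."""
    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, extended_fingers, pinch_distance=None):
        pose = classify_pose(extended_fingers, pinch_distance)
        self.history.append(pose)
        if (len(self.history) == self.history.maxlen
                and len(set(self.history)) == 1
                and pose != "unknown"):
            return POSE_COMMANDS.get(pose)
        return None
```

Debouncing over a short frame window is one simple way to avoid spurious triggers from a single misclassified frame; a real system would also handle dynamic gestures and multiple hands.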
One such multi-modal interaction system, utilizing a 3D camera based touch-less gesture recognition system combined with another hardware device interactive system, is described in WO-A-2013/104681. In WO-A-2013/104681, a novel hand-held wireless remote control device system is described. It can be used to provide conventional hardware-based remote control signals for interacting with a computer system in association with three-dimensional gesture-based control signals provided by the gesture recognition system. The hand-held wireless remote control device comprises a housing having a sensing unit, and at least one control button which is capable of generating or triggering a control signal for the associated computerized system. The computerized system uses information obtained from the control device together with information obtained from a gesture recognition system in a multi-modal way to resolve ambiguities due to, for example, occlusion of the hand performing the gesture or the hand being outside the field of view of the imaging system associated with the computerized system, and to trigger interactions within the gesture based interaction system. Operated in a multi-modal way, the two different interaction systems are used efficiently in combination, each delivering signals which enhance the signals from the other, thereby enabling an enhanced human-to-computer interaction which cannot be provided using only one of the two interaction systems.
Another contactless interaction system, which uses a video camera and a computer screen, is described in WO-A-99/40562. The system comprises a touch-screen-like data entry system determined from video images comprising data relating to objects approaching the computer screen. The video camera is mounted above the computer screen for monitoring the area immediately in front of the screen. Processing of the images enables detection and tracking of a user's hand or of a pen within the foreground of the screen using common background removal techniques. A calibration process is used in which calibration points are located so that they cover most of the screen, the calibration process generating screen spatial coordinates by transforming virtual space coordinates of the tracked hand position by means such as linear interpolation and linear extrapolation.
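The calibration step described above can be sketched in a minimal, per-axis form: given calibration samples pairing tracked (virtual space) coordinates with known screen coordinates, fit a linear map by least squares and apply it to new tracked positions, which naturally extrapolates outside the calibrated range. This is an illustrative assumption about the transform, not the exact procedure of WO-A-99/40562.

```python
def fit_linear(tracked, screen):
    """Least-squares fit of screen = a * tracked + b from calibration samples.

    Returns a mapping function that interpolates within, and linearly
    extrapolates beyond, the calibrated range.
    """
    n = len(tracked)
    mt = sum(tracked) / n
    ms = sum(screen) / n
    a = (sum((t - mt) * (s - ms) for t, s in zip(tracked, screen))
         / sum((t - mt) ** 2 for t in tracked))
    b = ms - a * mt
    return lambda t: a * t + b

def make_screen_mapper(cal_tracked_xy, cal_screen_xy):
    """Build independent x and y maps from (tracked, screen) 2D calibration pairs."""
    map_x = fit_linear([p[0] for p in cal_tracked_xy], [q[0] for q in cal_screen_xy])
    map_y = fit_linear([p[1] for p in cal_tracked_xy], [q[1] for q in cal_screen_xy])
    return lambda p: (map_x(p[0]), map_y(p[1]))
```

Treating the two axes independently keeps the sketch simple; a camera mounted at an angle to the screen would in practice require at least an affine or projective transform rather than two separable 1D fits.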
In WO-A-02/03316, a passive capacitive touch screen is associated with at least one stereo vision camera based contactless interaction system. The low resolution, the temperature and humidity dependence, and the poor sealability of the capacitive touch system are compensated for by information retrieved by the cameras. The stereo vision camera based contactless interaction system comprises at least two cameras with overlapping fields of view which encompass the capacitive touch screen surface. The cameras acquire images of the touch surface from different locations and determine the exact location of a pointer relative to the touch surface when that pointer is captured in images acquired by the cameras. A calibration routine is used to facilitate object position determination using triangulation, taking into account the offset angles of the cameras with respect to the touch surface. This enables enhanced determination of whether a pointer is in contact with the touch surface at a given point or hovering above it.
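The triangulation geometry underlying such a two-camera arrangement can be sketched as follows: two cameras separated by a known baseline each measure a bearing angle to the pointer (corrected by a per-camera mounting offset angle), from which the planar position is recovered; a simple threshold on the pointer's height above the surface then distinguishes contact from hovering. The offset-correction form and the contact threshold are illustrative assumptions, not the exact routine of WO-A-02/03316.

```python
import math

def triangulate(baseline, alpha, beta, offset_a=0.0, offset_b=0.0):
    """Planar triangulation from two bearing angles.

    Camera A sits at (0, 0) and camera B at (baseline, 0); alpha and beta are
    the bearings measured from the baseline towards the pointer, each corrected
    by the camera's mounting offset angle (assumed additive here).
    """
    ta = math.tan(alpha + offset_a)
    tb = math.tan(beta + offset_b)
    x = baseline * tb / (ta + tb)   # y = x*tan(alpha) = (baseline - x)*tan(beta)
    y = x * ta
    return x, y

def pointer_state(height_above_surface, contact_threshold=2.0):
    """Classify contact vs. hover; the mm threshold is a hypothetical value."""
    return "contact" if height_above_surface < contact_threshold else "hover"
```

The closed form follows from equating the two line-of-sight equations y = x tan(alpha) and y = (baseline - x) tan(beta); a real calibration routine would estimate the offset angles themselves from known reference touches.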
Whilst existing human-to-computer interactive systems enable multi-modal interactions based on touch interfaces and touch-less three-dimensional gesture interfaces by associating at least two sensing systems having different technologies, for example, a capacitive touch screen associated with a three-dimensional touch-less gesture recognition system operated using depth information from a three-dimensional camera, there is still no solution enabling an accurate, reliable, efficient and cost-effective multi-modal touch and touch-less three-dimensional gesture based interface for controlling a computerized system in the same manner as a system utilizing a combination of different existing technologies.
Furthermore, integration of two sensing systems having different technologies with a graphical user interface is always constrained by one of the technologies. For instance, when using a capacitive display screen to enable touch-gesture interaction, that screen hosts the main graphical user interface, and adding another graphical user interface which, for instance, may have scalability properties, such as a projection system, adds complexity and cost to the existing system. In the same way, associating a plurality of display screens with a plurality of sensing systems does not provide a complete system which is versatile and embeddable, and thereby capable of being operated anywhere and on any surface.
Last but not least, as integrating a plurality of sensing systems is constrained and made complex by the display system required by one of the sensing systems, integration of natural interaction using a combination of both touch and touch-less three-dimensional gestures for operating the multi-modal human-to-machine (or computer) interface in a natural way tends to be rather limited in application, in usability and in the ergonomics of the interaction process.