Field of the Invention
The present invention relates generally to computer-based eye-tracking systems. More particularly, the invention relates to an arrangement for controlling a computer apparatus according to the claims and a corresponding method according to the claims. The invention also relates to a computer program according to the claims.
Description of Related Art
Human-computer interaction was revolutionized by the introduction of the graphical user interface (GUI). The GUI provided an efficient means of presenting information to a user with a bandwidth that far exceeded any prior channel. Over the years, the speed at which information can be presented has increased further through color screens, enlarged displays, intelligent graphical objects (e.g. pop-up windows), window tabs, menus, toolbars and sounds. During this time, however, the input devices have remained essentially unchanged, i.e. the keyboard and the pointing device (e.g. mouse, track ball or touch pad). In recent years, handwriting devices have been introduced (e.g. in the form of a stylus or graphical pen). Nevertheless, while the output bandwidth has multiplied several times, the input bandwidth has remained substantially unaltered. Consequently, a severe asymmetry in the communication bandwidth of human-computer interaction has arisen.
In order to decrease this bandwidth gap, various attempts have been made to use eye-tracking devices. However, in many cases these devices miss the mark in one or several respects. One problem is that the prior-art solutions fail to take a holistic view of the input interfaces to the computer. As a result, comparatively heavy motor tasks may be imposed on the eyes, which in fact are strictly perceptive organs. Typically, this leads to fatigue symptoms and a degree of discomfort experienced by the user. This is particularly true if an eye tracker is used to control a cursor on a graphical display and, for various reasons, the eye tracker fails to track the user's point of regard sufficiently well, so that there is a mismatch between the user's actual point of regard and the position toward which the cursor is controlled.
Instead of controlling the cursor directly, an eye gaze signal may be used to select an appropriate initial cursor position. The document U.S. Pat. No. 6,204,828 discloses an integrated gaze/manual cursor positioning system, which aids an operator in positioning a cursor by integrating an eye-gaze signal and a manual input. When a mechanical activation of an operator device is detected, the cursor is placed at an initial position which is predetermined with respect to the operator's current gaze area. Thus, a user-friendly cursor function is accomplished.
The document U.S. Pat. No. 6,401,050 describes a visual interaction system for a shipboard watch station. Here, an eye-tracking camera monitors an operator's visual scan, gaze location, dwell time, blink rate and pupil size to determine whether additional cueing of the operator should be made to direct the operator's attention to an important object on the screen.
The document U.S. Pat. No. 5,649,061 discloses a device for estimating a mental decision to select a visual cue from a viewer's eye fixation and corresponding event-evoked cerebral potential. An eye tracker registers a viewing direction, and based thereon fixation properties may be determined in terms of duration, start and end pupil sizes, saccades and blinks. A corresponding single event-evoked cerebral potential is extracted, and an artificial neural network estimates a selection interest in the gaze point of regard. After the artificial neural network has been trained, the device may be used to control a computer, such that icons on a display are activated according to a user's estimated intentions without requiring any manipulation by means of the user's hands.
A few attempts have also been made to abstract user-generated input data into high-level information for controlling a computer. For example, the document US 2004/0001100 describes a multimode user interface in which flexible processing of user input is made possible without having to switch manually between different input modes. Instead, different information categories are distinguished within the data streams depending on the context in which the data streams are generated.
Although this strategy may indeed enhance the efficiency of the man-machine interaction, no multimodal solution has yet been presented according to which eye-tracking data is processed optimally. On the contrary, with only very few exceptions, today's eye-tracking interfaces are each tailored for one specific task only. Thus, any processing of eye-tracking data in respect of a first application cannot be reused by a second application, and vice versa. Hence, if multiple eye-controlled applications are used on a single computer, one particular eye-tracking data processing unit is typically required for each application. Naturally, in such a case, there is a high risk that the different applications perform a substantial amount of overlapping eye-tracking data processing. Moreover, each designer of eye-controllable applications needs to have expertise in both eye-tracking technology and the interpretation of eye-tracking data in order to extract the data required by the application.