A computer user typically retrieves processed information visually by watching a screen or other display device. In recent years the graphical complexity of displayed information has increased significantly, allowing the user to observe a multitude of images, text, graphics, interaction areas, animations and videos simultaneously in a single displayed image. This diversity is used in particular in web pages, which have become a significant communication medium through the growing global influence of the internet.
Web pages and other visual compositions or scenarios designed for computer-assisted display tend to exceed the viewable area of screens and display devices. As a result, scrolling features are added to virtually move the viewable area over a larger display scenario.
Visual compositions are created for many purposes and have to fulfill expected functions such as informing, advertising or entertaining. The multitude of available design elements and their possible combinations makes it necessary to analyze the display scenarios for their quality and efficiency. A common technique to provide the necessary information for this analysis is to track eye movements.
A number of eye-tracking devices are available that track eye movement and other elementary eye behaviors. Their precision is such that dot-like target points corresponding to the center of an observer's gazing area can be allocated on the display device. The eye tracker generates a continuous stream of spatio-temporal data representative of eye gaze positions at sequential moments in time. Analysis of this raw data typically reveals a series of eye fixations separated by sudden jumps between fixations, called saccades.
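The separation of raw gaze samples into fixations and saccades can be illustrated with a dispersion-threshold approach, a commonly used technique that is not attributed to any particular eye tracker here. The following is a minimal sketch: the names `Fixation` and `detect_fixations`, the sample layout `(time, x, y)`, and the threshold values are assumptions chosen for illustration only.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

# Assumed sample layout: (time in seconds, x in pixels, y in pixels).
Sample = Tuple[float, float, float]


@dataclass
class Fixation:
    start: float  # time of the first sample in the fixation
    end: float    # time of the last sample in the fixation
    x: float      # centroid of the gaze positions
    y: float


def _dispersion(window: Sequence[Sample]) -> float:
    """Horizontal plus vertical extent of the gaze positions in a window."""
    xs = [s[1] for s in window]
    ys = [s[2] for s in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples: Sequence[Sample],
                     max_dispersion: float = 30.0,   # illustrative value
                     min_duration: float = 0.1       # illustrative value
                     ) -> List[Fixation]:
    """Group consecutive samples that stay within max_dispersion for at
    least min_duration; samples between fixations are treated as saccades."""
    fixations: List[Fixation] = []
    i, n = 0, len(samples)
    while i < n:
        # Find the first sample that makes the window span min_duration.
        j = i
        while j < n and samples[j][0] - samples[i][0] < min_duration:
            j += 1
        if j >= n:
            break  # not enough data left for another fixation
        if _dispersion(samples[i:j + 1]) <= max_dispersion:
            # Extend the window while the gaze stays within the threshold.
            while j + 1 < n and _dispersion(samples[i:j + 2]) <= max_dispersion:
                j += 1
            window = samples[i:j + 1]
            cx = sum(s[1] for s in window) / len(window)
            cy = sum(s[2] for s in window) / len(window)
            fixations.append(Fixation(window[0][0], window[-1][0], cx, cy))
            i = j + 1
        else:
            i += 1  # gaze is moving: skip this sample and try again
    return fixations
```

Calling `detect_fixations` on a recorded sample list returns the fixations in chronological order; the gaps between the end of one fixation and the start of the next correspond to saccades.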
The human eye recognizes larger objects, such as a virtual page, by scanning them in a number of fixations. The scanning rate typically ranges between 2 and 5 fixations per second. The time an observer needs to view a virtual page, and consequently the number of fixations, depends mainly on the number of details and the complexity of information and text in the virtual page or the display scenario.
A plot of all fixations that are tracked and correlated to a displayed virtual page typically shows arrhythmically placed dots with highly differing densities. An informative survey of the current state of the art in the eye-tracking field is given in Jacob, R. J. K., “Eye tracking in advanced interface design”, in W. Barfield and T. Furness (eds.), Advanced interface design and virtual environments, Oxford University Press, Oxford, 1995. In this article, Jacob describes techniques for recognizing fixations and saccades from the raw eye tracker data.
An interpretation engine developed by the current inventor identifies elementary features of eye tracker data, such as fixations, saccades, and smooth pursuit motion. The interpretation engine also recognizes the elementary features of a plurality of eye-movement patterns, i.e., specific spatio-temporal patterns of fixations, saccades, and/or other elementary features derived from eye tracker data. Each eye-movement pattern is recognized by comparing the elementary features with a predetermined eye-movement pattern template. A given eye-movement pattern is recognized if the features satisfy a set of criteria associated with the template for that eye-movement pattern. The method further includes the step of recognizing from the eye-movement patterns a plurality of eye-behavior patterns corresponding to the mental states of the observer.
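The template comparison described above can be sketched as a set of criteria that a sequence of elementary features must all satisfy. This is only an illustrative sketch: the `Feature` and `PatternTemplate` types, the `recognize` function, and the "reading" template with its numeric limits are hypothetical and are not taken from the interpretation engine itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Feature:
    kind: str         # "fixation" or "saccade" (assumed feature labels)
    duration: float   # seconds
    dx: float = 0.0   # horizontal displacement in pixels (saccades only)
    dy: float = 0.0   # vertical displacement in pixels (saccades only)


# A criterion inspects the whole feature sequence and accepts or rejects it.
Criterion = Callable[[Sequence[Feature]], bool]


@dataclass
class PatternTemplate:
    name: str
    criteria: List[Criterion]

    def matches(self, features: Sequence[Feature]) -> bool:
        # The pattern is recognized only if every criterion is satisfied.
        return all(criterion(features) for criterion in self.criteria)


def _fixations(features: Sequence[Feature]) -> List[Feature]:
    return [f for f in features if f.kind == "fixation"]


def _saccades(features: Sequence[Feature]) -> List[Feature]:
    return [f for f in features if f.kind == "saccade"]


# Hypothetical template: several fixations joined by short, roughly
# horizontal, rightward saccades. The numeric limits are illustrative only.
READING = PatternTemplate(
    name="reading",
    criteria=[
        lambda fs: len(_fixations(fs)) >= 3,
        lambda fs: len(_saccades(fs)) > 0
        and all(0 < s.dx <= 150 for s in _saccades(fs)),
        lambda fs: all(abs(s.dy) <= 20 for s in _saccades(fs)),
    ],
)


def recognize(features: Sequence[Feature],
              templates: Sequence[PatternTemplate]) -> List[str]:
    """Return the names of all templates whose criteria the features satisfy."""
    return [t.name for t in templates if t.matches(features)]
```

Keeping each criterion as a separate predicate means a new eye-movement pattern can be added by defining another template, without changing the matching logic.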
The eye interpretation engine provides numerous pieces of information about eye behavior patterns and mental states that need to be graphically presented together with the correlated screen, display scenario, or a virtual page. The current invention addresses this need.
Eye-tracking analysis programs need to refer to or reconstruct the original display scenario in order to assign the stored eye-tracking data correctly. Two general approaches are known in the prior art to address this need:
1. Video-based eye-tracking output: A videotape is taken during a recording session in which the test person is confronted with the display event or virtual pages that need to be analyzed. The videotape is usually taken from the test person's view by using a head-mounted scene camera that records the display events simultaneously with an eye-tracking camera that records eye movements. Typical eye-analysis software programs analyze the raw eye-tracking data in a subsequent processing operation and superimpose an indicator corresponding to the test person's gaze location on the image taken by the scene camera. As a result, the videotape shows the display events during the recording session with a superimposed indicator. The researcher can then watch the videotape in order to see the objects the test person looked at during the recording session. The problem with a video movie of the display events with a dancing indicator is that the visual analysis process is very time-consuming, so that eye-tracking studies are typically constrained to testing sessions lasting only a few minutes. For demographically or statistically representative studies with a number of test persons this technique is highly impractical.
2. Reconstruction of the original environment: A second approach to associating the eye-movement data with a displayed scenario is to reconstruct the display event of the recording session and display it with superimposed graphical vocabulary that is associated with the eye-tracking data. Reconstructing the display event is only possible for simple static scenarios. Virtual pages such as web pages that involve scrolling, or other window-based application scenarios, cannot be reconstructed with the correct timing, and the recorded eye-tracking data cannot be associated properly. Web pages in general have a highly unpredictable dynamic behavior, which is caused by their use of kinetic elements like videos or animation. Their unpredictability is also caused by downloading discrepancies that depend on the quality of the modem connection and on the web page contents.
Therefore, there exists a need for a method to capture a dynamic display event in real-time correlation with recorded eye-tracking data. The current invention addresses this need.
To view web pages a user has to operate other communication devices such as a keyboard or a mouse to perform zooming or scrolling of the virtual page. For window-based application scenarios, mouse and keyboard are used to open, close and manipulate windows and pop-up menus and to perform other functions known from computer operation. In order to associate the display events in real time with the correlated eye-tracking data, it is necessary to simultaneously record all communication device interactions of the test person during the recording session. The current invention addresses this need.
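Why the interaction record matters can be shown with a small example: once scroll events are time-stamped on the same clock as the gaze samples, a gaze position on the screen can be mapped to a position on the virtual page. The sketch below is an assumption made for illustration, not the method of the invention; the function names and the event layouts `(time, scroll offset)` and `(time, x, y)` are hypothetical.

```python
from bisect import bisect_right
from typing import List, Sequence, Tuple

ScrollEvent = Tuple[float, float]        # (time in s, vertical scroll offset in px)
GazeSample = Tuple[float, float, float]  # (time in s, screen x in px, screen y in px)


def scroll_offset_at(scroll_events: Sequence[ScrollEvent], t: float) -> float:
    """Return the scroll offset in effect at time t.

    scroll_events must be sorted by time; before the first event the
    page is assumed to be unscrolled (offset 0).
    """
    times = [event[0] for event in scroll_events]
    index = bisect_right(times, t) - 1  # last event at or before t
    return scroll_events[index][1] if index >= 0 else 0.0


def to_page_coordinates(gaze_samples: Sequence[GazeSample],
                        scroll_events: Sequence[ScrollEvent]
                        ) -> List[GazeSample]:
    """Convert screen-relative gaze samples into page-relative ones."""
    return [
        (t, x, y + scroll_offset_at(scroll_events, t))
        for t, x, y in gaze_samples
    ]
```

The same lookup could be applied to other recorded interactions, such as window moves or zoom changes, provided every event carries a timestamp comparable to the gaze timestamps.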
U.S. Pat. No. 5,831,594 discloses a method and apparatus for eyetrack-derived backtrack to assist a computer user in finding the last gaze position prior to an interruption of the eye contact. The invention scrolls a virtual page and highlights the entity of the virtual page that received the last fixation immediately prior to the interruption. The invention does not interpret eye-tracking data; it only takes one piece of fixation information to trigger the highlighting function, which assigns a virtual mark to the last entity. The invention does not present any qualitative information or comparative interpretations.
U.S. Pat. No. 5,898,423 discloses a method and apparatus for eyetrack-driven captioning, whereby a singular mental state of interest is determined to trigger a simultaneous presentation of additional information. The invention does not present any qualitative information or comparative interpretation.
The Web page www.eyetracking.com describes a method to allocate areas of interest of an observer by either superimposing fixations and saccades onto the analyzed display scenario (ADP) or by placing the ADP opposite a corresponding spectral colored area graph. The density of the superimposed fixations, or respectively the colors of the area graph, are thought to represent attention levels. The described method does not present any qualitative information or comparative interpretations and can be applied only to reproducible display events consisting of a number of static scenarios.
The Web page www.smi.de describes a method to allocate areas of interest of an observer by superimposing graphical symbols onto the ADP. The graphical symbols are assigned to fixations and are scaled according to the density or duration of the fixations. The individual graphical symbols are connected with each other to visualize the fixation chronology. The described method does not present any qualitative information or comparative interpretation about the utilized eye-tracking data and can be applied only to reproducible display events consisting of a number of static scenarios.
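Both web-based methods above derive their graphics from the density or duration of fixations. How such a density measure might be computed can be sketched by accumulating fixation durations in a coarse grid over the display scenario. This is a generic illustration and not the implementation of either cited method; the function name, the fixation layout `(x, y, duration in ms)`, and the cell size are assumptions.

```python
from collections import Counter
from typing import Iterable, Tuple

FixationRecord = Tuple[float, float, int]  # (x in px, y in px, duration in ms)


def fixation_density(fixations: Iterable[FixationRecord],
                     cell_size: int = 100) -> Counter:
    """Sum fixation durations per grid cell.

    The result maps (column, row) cell indices to the total fixation
    time spent in that cell; cells without fixations are absent.
    """
    grid: Counter = Counter()
    for x, y, duration in fixations:
        cell = (int(x // cell_size), int(y // cell_size))
        grid[cell] += duration
    return grid
```

A color scale or symbol size could then be derived from the per-cell totals, which is also why such plots only make sense when every fixation refers to the same static scenario.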