In current eye tracking systems, camera images are processed to determine the position and/or gaze direction of an eye or the eyes. This can be done by detecting features in the image, for example pupils and corneal reflexes.
For each feature, typically size, contour and position are extracted. Subsequent processing calculates gaze vector and gaze point from these features.
Camera based remote eye trackers provide a large working range for head movement. With a fixed focal length of the camera lens and without joints to change camera orientation, the working range has to be covered completely by the camera sensor.
Due to the limited bandwidth of the camera bus, the frame rate of the system depends on the size and spatial resolution of the grabbed image. Covering the full working range within a single image at full spatial resolution permits only a low frame rate, because of the large amount of data such an image contains.
However, various eye tracking applications require high sampling rates that cannot be provided when images are permanently acquired at full spatial resolution.
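The dependence of frame rate on image size can be illustrated with a rough estimate. The sketch below assumes that the camera bus is the only bottleneck and ignores exposure time and readout overhead. The function name, the bus bandwidth, the sensor size and the pixel depth are hypothetical values chosen for illustration, not figures from any particular camera.

```python
def max_frame_rate(bus_bytes_per_s: float, width: int, height: int,
                   bytes_per_pixel: float = 1.0) -> float:
    """Upper bound on frames per second when the bus is the only limit."""
    frame_bytes = width * height * bytes_per_pixel
    return bus_bytes_per_s / frame_bytes


# Hypothetical figures: 40 MB/s usable bus bandwidth, 8-bit monochrome pixels.
BUS_BYTES_PER_S = 40e6

full_frame_fps = max_frame_rate(BUS_BYTES_PER_S, 1280, 1024)  # whole sensor
small_image_fps = max_frame_rate(BUS_BYTES_PER_S, 320, 128)   # small sub-image

# The frame rate scales inversely with the number of pixels transferred.
speed_up = small_image_fps / full_frame_fps
```

Under these assumptions, an image with 1/32 of the pixels can be transferred 32 times as often over the same bus.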
This problem is addressed with ROI (region of interest) based image acquisition, which allows higher frame rates than full spatial resolution image acquisition because the ROI covers only a fraction of the sensor. The ROI size is set such that the ROI image covers all features needed for eye tracking and the sampling rate during ROI image acquisition fulfils the system requirements regarding temporal resolution.
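The coverage requirement on the ROI can be sketched as follows. This is a minimal illustration, assuming a fixed-size rectangular ROI that is centred on the detected feature positions and clamped to the sensor area. The names `place_roi` and `covers`, and all dimensions in the usage note, are hypothetical.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int]          # (x, y) in sensor pixels
Box = Tuple[int, int, int, int]  # (x, y, width, height) in sensor pixels


def place_roi(points: Iterable[Point], roi_w: int, roi_h: int,
              sensor_w: int, sensor_h: int) -> Box:
    """Centre a fixed-size ROI on the features, clamped to the sensor."""
    pts = list(points)
    xs = [p[0] for p in pts]
    ys = [p[1] for p in pts]
    # Centre of the bounding box of all feature positions.
    cx = (min(xs) + max(xs)) / 2
    cy = (min(ys) + max(ys)) / 2
    x = int(round(cx - roi_w / 2))
    y = int(round(cy - roi_h / 2))
    # Keep the ROI entirely on the sensor.
    x = max(0, min(x, sensor_w - roi_w))
    y = max(0, min(y, sensor_h - roi_h))
    return (x, y, roi_w, roi_h)


def covers(roi: Box, points: Iterable[Point]) -> bool:
    """True if every feature position lies inside the ROI."""
    x, y, w, h = roi
    return all(x <= px < x + w and y <= py < y + h for px, py in points)
```

For example, with features at (600, 500) and (700, 510) on a 1280 x 1024 sensor, `place_roi(features, 320, 128, 1280, 1024)` yields an ROI that `covers` both features. If `covers` returned false for a chosen ROI size, a larger ROI, and hence a lower frame rate, would be needed.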
Patent EP 1 562 469 B1 describes a solution that uses full frame based image acquisition to initially detect the position of the eye or eyes on the camera sensor. It then creates an ROI around the eye or eyes to track the eye within this region only. This results in reduced bandwidth requirements and can thus be used to increase camera readout frame rate. However, this speed-up does not apply to the initial read out of the full frame. Therefore, the minimum time that is required to find the eyes is determined largely by the time it takes to read out the full frame.
To find the initial ROI position, a system starts in Head Position Search Mode, in which full spatial resolution based image acquisition is generally used. “Head Position Search Mode” refers to a mode in which the acquired image is searched for one or more features which represent the eyes or are indicative of the head position or the eye position. As soon as the head or the eyes are detected, the system continues by acquiring ROI images of an ROI around the eye or eyes. The ROI is positioned such that it covers the detected eye or eyes.
The system then continues in Tracking Mode, in which features are extracted from the ROI image and used for further processing. The ROI position is moved according to head movements, depending on the feature positions. If tracking of visible eye features fails, the system returns to Head Position Search Mode.
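The alternation between the two modes described above can be sketched as a small state machine. This is an illustrative sketch only, not the implementation of any particular system. Feature detection is left outside the code, ROI clamping to the sensor is omitted for brevity, and the class name, method names and dimensions are hypothetical.

```python
from enum import Enum, auto
from typing import List, Optional, Tuple

Point = Tuple[int, int]          # (x, y) in sensor pixels
Box = Tuple[int, int, int, int]  # (x, y, width, height) in sensor pixels

SENSOR_W, SENSOR_H = 1280, 1024  # hypothetical sensor size
ROI_W, ROI_H = 320, 128          # hypothetical ROI size


class Mode(Enum):
    HEAD_POSITION_SEARCH = auto()
    TRACKING = auto()


class RoiTracker:
    def __init__(self) -> None:
        self.mode = Mode.HEAD_POSITION_SEARCH
        self.roi: Optional[Box] = None

    def acquisition_window(self) -> Box:
        """Sensor area to read out for the next image."""
        if self.mode is Mode.TRACKING and self.roi is not None:
            return self.roi
        return (0, 0, SENSOR_W, SENSOR_H)  # full frame while searching

    def process(self, features: List[Point]) -> None:
        """Update mode and ROI from features found in the latest image.

        `features` holds eye feature positions in sensor coordinates;
        an empty list means that detection or tracking failed.
        """
        if not features:
            # Lost the eyes: fall back to searching the full frame.
            self.mode = Mode.HEAD_POSITION_SEARCH
            self.roi = None
            return
        # Eyes found: (re)centre the ROI on the mean feature position.
        cx = sum(x for x, _ in features) // len(features)
        cy = sum(y for _, y in features) // len(features)
        self.roi = (cx - ROI_W // 2, cy - ROI_H // 2, ROI_W, ROI_H)
        self.mode = Mode.TRACKING
```

In use, a control loop would read out the area returned by `acquisition_window()`, run feature detection on that image, and pass the result to `process()`. Each return to Head Position Search Mode switches the readout back to the full frame.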
However, various problems continue to exist. Using full spatial resolution based image acquisition for head position search forces the system to operate temporarily at a lower frame rate than required by the system. In addition, the low full-frame rate of the camera determines the time between the subject being in front of the system (ready to track) and the actual start of the tracking. Increasing this pick-up speed and reducing the system latency is highly desirable.
Moreover, with many common camera models, switching between full spatial resolution based image acquisition and ROI based image acquisition causes a delay in camera operation, which results in a decreased data rate.
A final issue is that changes in the data frame rate complicate the subsequent data analysis, because some analysis parameters depend on the frame rate.
It is therefore an object of the present invention to avoid these problems related to full spatial resolution based image acquisition, and in particular to reduce the time to find the initial ROI position.