The present invention relates to a contactless method for finding and subsequently tracking the 3D coordinates of a pair of eyes in at least one face in real time.
In contrast to contact methods, for example, contactless methods for finding and tracking faces do not require any additional means, such as head-mounted cameras or spots. The advantage of these contactless methods is that the freedom of movement of the subjects to be tracked is not restricted in any way by physical means and that the subjects are not bothered by the use of such means.
Contactless detection and tracking methods are known in the prior art. U.S. Pat. No. 6,539,100 B1 and EP 0 350 957 B1, for example, disclose how the viewing direction of an observer is detected with the help of certain face and eye characteristics which are extracted from the recorded images. While U.S. Pat. No. 6,539,100 B1 describes a method that serves to find out which object is being viewed by an observer, EP 0 350 957 B1 additionally aims to track the movement of the eyes over a certain period of time.
DE 197 31 303 A1 discloses a method and device for contactless, headgear-less measurement of the viewing direction of eyes, even where head and eye movements take place at a fast pace and over a large range. The eye is illuminated with infrared light, imaged by an optical system and recorded by at least one image sensor. The image thus generated is subsequently processed by a viewing direction processor, which can be configured by a main processor, to determine the viewing direction by finding the position of the centre of the eye pupil and by determining the corneal reflections, and is then displayed on a monitor.
WO 03/079 902 A1 also describes a method of contactless detection and tracking of eyes under various lighting conditions in real time. The eyes are detected by executing the following steps: recording two actively illuminated images, where one image represents the ‘bright pupil effect’ and the other image represents the ‘dark pupil effect’ of the eyes; creating a differential image from these two images, where the resulting differential image only shows contrast at those positions where the contrast of the two images differs; marking the contrast points in the differential image as possible eyes; and comparing the possible eyes with pre-recorded images of eyes and non-eyes, which serve as reference images, in order to be able to distinguish eyes from non-eyes in the differential image with a high probability. The eyes are then tracked in an image that follows the detection by applying a Kalman filter and comparing the expected eye positions with the eye positions that are actually detected in the differential image. If the comparison does not produce a result, the position of the eyes is determined in a further step with the help of a clustering algorithm, which clusters the possible eye positions based on their intensities in the image and compares these clusters with the expected position.
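By way of illustration only, the differencing and marking steps described above might be sketched as follows. This is a minimal sketch and not the implementation of WO 03/079 902 A1: the function names `difference_image` and `eye_candidates`, the threshold value and the pixel values are assumptions made for the example, and the comparison with reference images and the Kalman filter tracking are omitted.

```python
# Illustrative sketch only: names, threshold and pixel values are hypothetical.

def difference_image(bright, dark):
    """Per-pixel absolute difference of two equally sized grayscale images."""
    return [[abs(b - d) for b, d in zip(row_b, row_d)]
            for row_b, row_d in zip(bright, dark)]


def eye_candidates(diff, threshold):
    """Return (row, column) positions whose contrast exceeds the threshold."""
    return [(r, c)
            for r, row in enumerate(diff)
            for c, value in enumerate(row)
            if value > threshold]


# Image with 'bright pupil effect': the pupils appear as bright spots.
bright = [[10, 10, 10, 10],
          [10, 200, 10, 10],
          [10, 10, 10, 180]]

# Image with 'dark pupil effect': the same scene with dark pupils.
dark = [[10, 10, 10, 10],
        [10, 20, 10, 10],
        [10, 10, 10, 15]]

diff = difference_image(bright, dark)
candidates = eye_candidates(diff, threshold=100)  # assumed threshold
print(candidates)
```

In such a sketch, every position returned by `eye_candidates` is merely a possible eye; distinguishing eyes from non-eyes would still require the comparison with reference images described above.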
This prior art method exhibits a number of disadvantages. First, the process of detecting and tracking the eyes relies on an image which is created from one image with the ‘bright pupil effect’ and one with the ‘dark pupil effect’ using an interlaced scanning method, where the two images are not recorded simultaneously by one image sensor, but one after another. Such temporally non-coincident image recording, in conjunction with the superposition of the images by the interlaced scanning method, which serves to reduce the amount of image data for transmission, does not allow reliable detection and tracking of the eyes in real time. Second, this method only allows the detection and tracking of eyes which are spatially very close to the image sensor, because the effects caused by the active illumination diminish as the distance between the eyes and the illumination source grows, with the result that the eyes to be detected can no longer be distinguished from other objects or from noise in the differential image.
WO2007/019842 attempts to counteract these disadvantages by finding the eye positions with a hierarchically organised routine, in which the amount of data to be processed is gradually trimmed down, starting with the amount of data of the total video frame (VF) and proceeding to a target face region (GZ) and finally to a target eye region (AZ). In addition, each instance or group of instances is always executed on a dedicated computing unit, so that they run in parallel. However, WO2007/019842 does not explain how the eyes are found and tracked.
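Purely as an illustration of the gradual data reduction described above, the trimming from video frame (VF) to face region (GZ) to eye region (AZ) might be sketched as follows. The helper names `crop` and `pixel_count`, the frame size and all region coordinates are assumptions made for the example; since WO2007/019842 does not explain how the regions are found, they are simply fixed here.

```python
# Illustrative sketch only: region coordinates are fixed, hypothetical values.

def crop(image, top, left, height, width):
    """Return the sub-image of the given size starting at (top, left)."""
    return [row[left:left + width] for row in image[top:top + height]]


def pixel_count(image):
    """Number of pixels that would have to be processed for this image."""
    return sum(len(row) for row in image)


# Hypothetical 8 x 12 video frame (VF); each value encodes its own position.
frame = [[r * 12 + c for c in range(12)] for r in range(8)]

# Target face region (GZ), given in frame coordinates.
face_region = crop(frame, top=1, left=2, height=6, width=8)

# Target eye region (AZ), given in face-region coordinates.
eye_region = crop(face_region, top=1, left=1, height=2, width=6)

# Amount of data at each level of the hierarchy: VF, GZ, AZ.
sizes = [pixel_count(frame), pixel_count(face_region), pixel_count(eye_region)]
print(sizes)

# Origin of the eye region expressed in frame coordinates (offsets add up).
eye_origin_in_frame = (1 + 1, 2 + 1)
```

Each level only ever processes the pixels handed down by the level above it, which is what reduces the amount of data in such a hierarchy.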
Real-time detection and tracking of eyes is, however, a decisive factor in human-machine interaction. It is thus particularly desirable to provide methods for detecting and tracking eyes which make precise real-time finding and tracking of eyes possible.
Precise and efficient determination of the position, including in the Z direction, is necessary in particular in the context of dynamic applications, where large and fast movements of the faces in all spatial directions are possible. Such dynamic applications include, for example, autostereoscopic or holographic displays, where the desired image impression will only occur if the eye positions of the observers are determined precisely both spatially and temporally, so that the autostereoscopic or holographic image information can be directed at the actual eye position. In contrast, in the stationary applications which are known in the prior art, such as devices for monitoring drivers and pilots, the detection and tracking range is rather small, since in those applications the range of movement of the subjects is typically restricted to a minimum in all spatial directions.
The methods known in the prior art further exhibit the problem that the position information of the eyes cannot be delivered in real time, in particular not where more than one face is to be identified and tracked.