1. Field of the Invention
The present invention relates to a method and a circuit arrangement for the contactless detection and tracking of the eye positions or pupils of multiple observers in real-time mode. The input data comprise image material in the form of a sequence of digital video frames which are acquired by one or more image sensors.
Reference points of the eyes of multiple observers can be determined without the need for any additional auxiliary means such as glasses, headgear or spots.
In contrast to stationary applications, for example the monitoring of drivers or pilots, where the range of motion, and in particular the depth range, is very limited and the observer is thus almost stationary, this invention serves to detect the eye positions in a large target region. It copes with quick observer movements and determines the depth coordinate over a relatively large range, e.g. between 0.5 and 3.5 m.
Efficient and precise real-time eye detection is an important human-machine interface. A major field of application of the invention is a device for detecting and tracking the eye positions of users of autostereoscopic displays. Such displays provide the observers with a stereoscopic image impression without the need for any auxiliary means, such as polarisation glasses. Further applications of the invention comprise, for example, video holography and implementations in the area of the detection of persons, faces or viewing directions.
Autostereoscopic displays, where the presentation is tracked by means of a so-called tracking device, provide multiple observers with great mobility in a large visibility region. The error-free detection and tracking of eyes, eye positions or pupils is an important human-machine interface in these fields of image representation, too.
A tracking device which works reliably and without errors is usually not noticed by an observer. In many applications, however, errors of the tracking system cause undesired side effects, which, for example in the field of 3D applications, lead to faulty reconstruction or crosstalk. A tracking device is required to have great precision, reliability and accuracy. The system must also be sufficiently efficient and precise to track all major movements correctly, so that the observer can move as freely as possible in all three dimensions.
2. Description of Related Art
Several types of contactless tracking systems are commercially available. Simple models usually feature basic application software for standard operating systems and have standardised hardware and software interfaces.
Document WO 03/079 902 A1, “Real-time eye detection and tracking under various light conditions”, Zhiwei Zhu, Qiang Ji, describes a method for contactless real-time eye detection which mainly comprises an eye position detection step and an eye tracking step. The eye position detection step combines active illumination with pattern recognition. After the eyes of an observer have been detected for the first time, the eyes are tracked; this step combines and synthesises several algorithms and techniques. Despite this combination of several means, the problem remains that major and abrupt movements of the head in all three dimensions cannot be tracked in real time, and that real-time processing may be prevented by the delay between the image acquisition and the provision of the position data. This applies in particular to the detection of the eye position in the depth dimension under unfavourable ambient conditions.
In a vehicle, for example, the driver's face is always situated at a predictable distance from the instrument panel. Moreover, the movements vary only slightly in the vertical and horizontal directions. In particular, the real range of motion in the depth dimension is very small, so that the depth position can usually be extrapolated with sufficient precision even if only one camera is used.
The object of the present invention is to provide a large range of motion in all three dimensions of a viewing space while offering short computing times. In contrast to the mentioned prior art, it is necessary to detect the eyes in all three dimensions, that is, including the depth dimension. The depth range shall preferably be large, extending from 0.5 to at least 3.5 metres. On the one hand, determining the depth requires a multitude of independently arranged cameras, so that images of the target region can be taken from several perspectives. On the other hand, detecting the eyes at a distance of up to several metres requires the cameras to have a high resolution, which results in a large amount of data per camera and per video frame.
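The two requirements named above can be illustrated with a short calculation. The following is only a minimal sketch under stated assumptions, not the method of the invention: it assumes two rectified, parallel cameras, for which the standard triangulation relation Z = f · B / d holds, and all numerical values (focal length in pixels, baseline, sensor resolution, bit depth, frame rate) as well as the function names are hypothetical.

```python
def depth_from_disparity(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Depth Z = f * B / d for two rectified, parallel cameras (assumed setup)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def bytes_per_second(width_px: int, height_px: int, bytes_per_pixel: int, fps: int) -> int:
    """Raw, uncompressed data rate of one camera."""
    return width_px * height_px * bytes_per_pixel * fps


# Hypothetical camera pair: focal length 1400 px, baseline 0.2 m.
near = depth_from_disparity(1400.0, 0.2, 560.0)  # large disparity, observer is close
far = depth_from_disparity(1400.0, 0.2, 80.0)    # small disparity, observer is far away

# Hypothetical sensor: 1920 x 1080 pixels, 8-bit monochrome, 60 frames per second.
rate = bytes_per_second(1920, 1080, 1, 60)

print(f"near: {near:.2f} m, far: {far:.2f} m, data rate: {rate / 1e6:.1f} MB/s per camera")
```

Because the disparity shrinks in inverse proportion to the distance, a fixed error of one pixel in the disparity has a much larger effect on the depth estimate at 3.5 m than at 0.5 m, which is one reason why a high sensor resolution is needed for distant observers.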
The problem of processing a large amount of data in real time becomes more severe when several observers are to be detected. In particular, very computation-intensive process steps are required in order to detect observers who are difficult to distinguish due to illumination effects, reflections or eyeglass lenses. Experience shows that the detection of a third or fourth person who is partly concealed or who stands a little aside can often only be achieved with extensive, time-consuming computational effort. However, the computational effort required for the observer who is momentarily the most difficult to detect must not adversely affect the real-time tracking of the other observers.
Problems with the detection of eye positions can mean that the input video frames can no longer be processed continuously in real-time mode. A maximum acceptable computing time per person and per frame may be exceeded if eyeglass lenses or earpieces cover the eyes, or if an observer turns away from the cameras abruptly, even if only for a moment.
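The notion of a maximum acceptable computing time per person and per frame can be made concrete with a small sketch. It is purely illustrative and is not taken from the prior art or from the invention: the function `track_frame`, the detector callback and its deadline convention, the `Position` layout and the fallback to the last known position are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Iterable, Optional, Tuple

# Hypothetical layout: x, y and depth of an eye reference point in metres.
Position = Tuple[float, float, float]


def track_frame(
    observer_ids: Iterable[str],
    detect: Callable[[str, float], Optional[Position]],
    last_known: Dict[str, Position],
    budget_s: float,
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, Position]:
    """Process one video frame with a fixed time budget per observer.

    `detect(observer_id, deadline)` is an assumed cooperative detector: it must
    return None as soon as it cannot finish before `deadline`.
    """
    results: Dict[str, Position] = {}
    for observer_id in observer_ids:
        deadline = clock() + budget_s
        position = detect(observer_id, deadline)
        if position is not None:
            results[observer_id] = position
            last_known[observer_id] = position
        elif observer_id in last_known:
            # Budget exceeded: reuse the previous result so that the
            # remaining observers are still served within this frame.
            results[observer_id] = last_known[observer_id]
    return results
```

With such a budget, an observer whose eyes are momentarily covered costs at most `budget_s` per frame, so the time available for the other observers is bounded rather than dependent on the most difficult case.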
In view of the disadvantages of the prior art, it is an object of the present invention to provide a method which allows the eye positions of multiple observers to be detected in real time even if the observers move their heads significantly, abruptly and in all three dimensions. The method shall detect the eye positions in a large target region, shall compensate for abrupt movements of the observers and shall determine the depth coordinate over a large range. Moreover, while minimising the number of errors, the response time between the image acquisition, that is the reading of a video frame, and the output of a result, that is the provision of the eye positions, shall be lastingly reduced. Furthermore, the method shall make it possible to achieve error-free results in real-time mode even if high-resolution cameras are used.