Vision systems frequently entail detecting and tracking a person's eyes in a stream of images generated by a video camera. In the motor vehicle environment, for example, a camera can be used to generate an image of the driver's face, and portions of the image corresponding to the driver's eyes can be analyzed to assess driver gaze or drowsiness. See, for example, the U.S. Pat. Nos. 5,795,306; 5,878,156; 5,926,251; 6,097,295; 6,130,617; 6,243,015; 6,304,187; and 6,571,002, incorporated herein by reference.
While eye detection and tracking algorithms can work reasonably well in a controlled environment, they tend to perform poorly under real-world imaging conditions, where the lighting produces shadows and the person's eyes can be occluded by eyeglasses, sunglasses or makeup. As a result, pixel clusters associated with the eyes tend to be grouped together with non-eye features and discarded when subjected to appearance-based testing. This problem occurs both in eye detection routines that initially locate the eyes and in eye tracking routines that track the eyes from one image frame to the next. Problems that especially plague eye tracking include head movement and eye blinking, both of which can cause previously detected eyes to suddenly disappear. The usual approach in such cases is to abandon the tracking routine and re-initialize the eye detection routine, which of course places a heavy processing burden on the system and slows the system response. Accordingly, what is needed is an efficient method of reliably tracking a person's eyes between successively produced video image frames, even in situations where the person's head turns or the eyes momentarily close.
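The conventional detect-then-track loop criticized above can be sketched as follows. This is an illustrative sketch only, not a routine from any of the cited patents: the functions detect_eyes and track_eyes, the dict-based frames, and the 20-pixel search window are all hypothetical stand-ins for real detection and tracking stages. The point it demonstrates is the fallback behavior: any tracking loss (blink, large head motion) forces a full, expensive re-detection.

```python
def detect_eyes(frame):
    """Hypothetical full-image eye detection (expensive).
    Returns an eye position, or None if no eyes are found."""
    return frame.get("eyes")  # stand-in: a frame is a dict in this sketch


def track_eyes(frame, prev_eyes):
    """Hypothetical frame-to-frame tracking (cheap): searches only
    near the previous eye position.  Returns None when the eyes have
    closed or moved outside the assumed 20-pixel search window."""
    eyes = frame.get("eyes")
    if eyes is None or abs(eyes[0] - prev_eyes[0]) > 20:
        return None  # blink or large head movement: tracking lost
    return eyes


def process_stream(frames):
    """Run the conventional loop; return how many times the expensive
    detector had to be invoked (re-initializations included)."""
    eyes = None
    detections = 0
    for frame in frames:
        if eyes is not None:
            eyes = track_eyes(frame, eyes)  # light, local search
        if eyes is None:
            eyes = detect_eyes(frame)  # heavy fallback re-detection
            detections += 1
    return detections
```

In this toy stream a blink (a frame with no eyes) immediately triggers two extra detector runs, one on the blink frame and one more because the eyes reappear far from their last tracked position, which illustrates the processing burden the passage describes.

```python
frames = [
    {"eyes": (100, 50)},  # initial detection
    {"eyes": (103, 50)},  # small motion: tracked cheaply
    {"eyes": None},       # blink: tracking lost, detector re-run
    {"eyes": (150, 60)},  # reappears: detector runs again
]
print(process_stream(frames))  # 3 expensive detector invocations
```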