Face tracking is a recent innovation in digital cameras and related consumer imaging devices such as camera phones and handheld video cameras. Face tracking technologies have improved to the point where they can detect and track faces at up to 60 fps (see, e.g., U.S. Pat. Nos. 7,403,643, 7,460,695, 7,315,631, 7,460,694, and 7,469,055, US publications 2009/0208056 and 2008/0267461, and U.S. Ser. No. 12/063,089, which are all assigned to the same assignee and are incorporated by reference). Users have come to expect high levels of performance from such in-camera technology.
Faces are initially detected using a face detector, which may use a technique such as that described by Viola and Jones, which uses rectangular Haar classifiers (see, e.g., US2002/0102024 and Jones, M. and Viola, P., “Fast multi-view face detection,” Mitsubishi Electric Research Laboratories, 2003).
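By way of illustration only, the rectangular Haar features used by such a detector can be evaluated in constant time from an integral image (summed-area table). The following Python sketch shows one two-rectangle feature. The function names `integral_image`, `rect_sum` and `two_rect_feature` and the small test image are hypothetical, chosen for this example, and are not taken from the cited references. A full detector would combine many such features in a trained classifier cascade, which is not shown.

```python
def integral_image(img):
    """Return a summed-area table with one extra leading row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            # Sum above this pixel plus the running sum of the current row.
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of pixels in the rectangle with top-left corner (x, y), width w, height h."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle Haar-like feature: left half minus right half."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)


# Hypothetical 4x4 image: bright left half, dark right half.
img = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(img)
print(rect_sum(ii, 0, 0, 4, 4))          # 80
print(two_rect_feature(ii, 0, 0, 4, 4))  # 64
```

Because each rectangle sum needs only four table lookups, the cost of a feature does not depend on the rectangle size, which is one reason such detectors can run at camera frame rates.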
Once a face is detected, its location is recorded and a localized region around that face is scanned by a face detector in the next frame. Thus, once a face is initially detected, it can be accurately tracked from frame to frame without a need to run a face detector across the entire image. The “located” face is said to be “locked” by the face tracker. Note that it is still desirable to scan the entire image, or at least selected portions of the image, with a face detector as a background task in order to locate new faces entering the field of view of the camera. However, even when a “face-lock” has been achieved, the localized search with a face detector may return a negative result even though the face is still within the detection region. This can happen because the face has turned to a non-frontal or profile pose, facing too far up, down, left or right to be detected. That is, a typical face detector can only accurately detect faces in a semi-frontal pose. Face lock may also be lost due to sudden changes in illumination conditions, e.g., backlighting of a face as it passes in front of a source of illumination, among other possibilities such as facial distortions and occlusions by other persons or objects in a scene.
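The localized search described above can be sketched as follows, purely as an illustration. The helper names `search_window` and `track_step`, the 50% margin, and the stub detector are assumptions made for this example and are not specified in the text. Any real face detector taking a frame and a region could stand in for the stub. The background full-frame scan for new faces is omitted.

```python
def search_window(box, frame_w, frame_h, margin=0.5):
    """Expand a face box (x, y, w, h) by a margin and clamp it to the frame."""
    x, y, w, h = box
    dx, dy = int(w * margin), int(h * margin)
    x0, y0 = max(0, x - dx), max(0, y - dy)
    x1, y1 = min(frame_w, x + w + dx), min(frame_h, y + h + dy)
    return (x0, y0, x1 - x0, y1 - y0)


def track_step(detect, frame, frame_w, frame_h, last_box, margin=0.5):
    """Scan only the region around the last known face.

    Returns the new face box, or None when the localized search fails,
    i.e. when face lock is lost.
    """
    region = search_window(last_box, frame_w, frame_h, margin)
    return detect(frame, region)


def make_stub_detector(true_box):
    """Hypothetical detector: finds the face only if it lies fully inside the region."""
    def detect(frame, region):
        rx, ry, rw, rh = region
        x, y, w, h = true_box
        inside = rx <= x and ry <= y and x + w <= rx + rw and y + h <= ry + rh
        return true_box if inside else None
    return detect


last_box = (100, 100, 40, 40)

# Face moved slightly between frames: the localized search still finds it.
near = track_step(make_stub_detector((110, 105, 40, 40)), None, 640, 480, last_box)

# Face no longer detectable in the window: lock is lost.
lost = track_step(make_stub_detector((300, 300, 40, 40)), None, 640, 480, last_box)

print(near, lost)
```

In this sketch a `None` result only signals that the detector returned a negative result inside the search window. As noted above, the face may still be present there, for example in a profile pose or under changed illumination.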