Real-time face detection and tracking are well known in image processing, for example as described in European Patent No. EP2052347 (Ref: FN-143). These techniques enable one or more face regions within a scene being imaged to be readily delineated, and allow subsequent image processing to be based on this information.
Such image processing can include face recognition which attempts to identify individuals being imaged; auto-focussing by bringing a detected and/or selected face region into focus; or defect detection and/or correction of the face region(s).
Concerning individual identification based on face features, A. K. Jain, A. Ross, and S. Prabhakar, “An introduction to biometric recognition,” IEEE Trans. Circuits Syst. Video Technol., vol. 14, 2004 discloses that the iris of the eye is a near-ideal biometric. An image of an iris is typically best acquired in a dedicated imaging system that uses infra-red (IR) illumination, usually near infra-red (NIR) at wavelengths above 700 nm.
The iris regions are typically extracted from identified eye regions, and a more detailed analysis may be performed to confirm whether a valid iris pattern is detectable. For example, J. Daugman, “New methods in iris recognition,” IEEE Trans. Syst. Man. Cybern. B. Cybern., vol. 37, pp. 1167-1175, 2007 discloses a range of additional refinements which can be utilized to determine the exact shape of the iris and the eye-pupil. It is also common practice to transform the iris region from a polar to a rectangular co-ordinate system, although this is not necessary.
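By way of illustration only, the polar-to-rectangular transformation mentioned above can be sketched as follows. This is a minimal sketch and not the method of the cited reference: it assumes that the pupil and iris boundaries are concentric circles, uses nearest-neighbour sampling, and the function name `unwrap_iris`, its parameters and the synthetic test image are all hypothetical.

```python
import math

def unwrap_iris(image, cx, cy, r_pupil, r_iris, n_radial=8, n_angular=32):
    """Sample the annulus between the pupil and iris boundaries into a
    rectangular grid (rows = radial position, columns = angle).

    `image` is a 2D list indexed as image[row][col]. Concentric circular
    boundaries are an assumption made for this sketch only.
    """
    height, width = len(image), len(image[0])
    strip = []
    for i in range(n_radial):
        # Radius of this row, stepping from the pupil edge to the iris edge.
        r = r_pupil + (r_iris - r_pupil) * (i + 0.5) / n_radial
        row = []
        for j in range(n_angular):
            theta = 2.0 * math.pi * j / n_angular
            # Nearest-neighbour sample, clamped to the image bounds.
            x = min(max(int(round(cx + r * math.cos(theta))), 0), width - 1)
            y = min(max(int(round(cy + r * math.sin(theta))), 0), height - 1)
            row.append(image[y][x])
        strip.append(row)
    return strip

# Synthetic image whose pixel value equals its distance from the centre,
# so each row of the unwrapped strip should be roughly constant.
SIZE, CENTRE = 41, 20
synthetic = [[math.hypot(x - CENTRE, y - CENTRE) for x in range(SIZE)]
             for y in range(SIZE)]
strip = unwrap_iris(synthetic, CENTRE, CENTRE, r_pupil=5, r_iris=15)
```

Each row of `strip` corresponds to one radius within the iris annulus and each column to one angle, so a ring in the source image becomes a horizontal line in the rectangular representation.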
Detecting and tracking eyes or iris regions can also be used for determining gaze or a person's condition, such as fatigue or another health condition, which is especially useful in driver monitoring systems (DMS) integrated in vehicles.
Separately, most cameras and smartphones can identify specific patterns, such as ‘eye-blink’ and ‘smile’, in real-time tracked faces, and the timing of main image acquisition can be adjusted to ensure that subjects within a scene are in focus, not blinking, or smiling, as disclosed in WO2007/106117 (Ref: FN-149).
A common problem when capturing images within a scene is limited system dynamic range when acquiring differently illuminated subjects. In particular, regions of acquired images corresponding to bright regions of a scene tend to be overexposed, while regions of acquired images corresponding to dark regions of a scene tend to be underexposed.
This problem can particularly affect the acquisition, with active illumination, of faces within a scene extending over a significant depth of field, such as faces of occupants disposed at different rows within a vehicle being imaged from a camera located towards the front of the vehicle, for example, near a rear-view mirror. In particular, if the exposure is set for acquiring properly exposed images of faces near to the camera (which are more strongly illuminated by a light source), the acquired images of faces distant from the camera (which are less strongly illuminated by the light source) tend to be underexposed. Conversely, if the exposure is set for acquiring properly exposed images of the distant faces, the images of the nearer faces tend to be overexposed.
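The scale of this illumination difference can be estimated with a short calculation. The following is a rough sketch only: it assumes an idealized point light source co-located with the camera, so that irradiance falls off with the square of distance, and the two face distances are hypothetical values chosen for illustration rather than figures taken from any particular vehicle.

```python
import math

def relative_irradiance(distance_m, reference_m=1.0):
    """Irradiance relative to that at `reference_m`, under an idealized
    point-source (inverse-square) model assumed for this sketch."""
    if distance_m <= 0 or reference_m <= 0:
        raise ValueError("distances must be positive")
    return (reference_m / distance_m) ** 2

# Hypothetical camera-to-face distances for front and rear row occupants.
NEAR_FACE_M = 0.7
FAR_FACE_M = 1.6

# How many times more light reaches the nearer face than the farther one,
# and the same ratio expressed in photographic stops (powers of two).
ratio = relative_irradiance(NEAR_FACE_M) / relative_irradiance(FAR_FACE_M)
stops = math.log2(ratio)
print(f"near/far irradiance ratio: {ratio:.2f} ({stops:.2f} stops)")
```

Under these assumptions, a single exposure setting has to accommodate a difference of more than two stops between the two faces before any variation in the scene itself is taken into account.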
A known solution for acquiring an image with high dynamic range (HDR) is to capture a sequence of consecutive images of the same scene at different exposure levels, for example, by varying the exposure time at which each image is acquired. Shorter exposure times are used to properly capture bright scene regions and longer exposure times are used to properly capture dark scene regions. The acquired images can then be combined to create a single image in which the various regions within the scene are properly exposed.
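The combination of differently exposed images can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular known implementation: it assumes a linear sensor response, a perfectly static scene, and 8-bit pixel values, and the names `hat_weight`, `merge_exposures` and `simulate_capture` are hypothetical.

```python
def hat_weight(value, max_value=255):
    """Triangular weight: highest for mid-range values, near zero for
    values close to black or to saturation. A small floor keeps the
    total weight strictly positive."""
    mid = max_value / 2.0
    return 1.0 - abs(value - mid) / mid + 1e-3

def merge_exposures(frames, exposure_times, max_value=255):
    """Estimate per-pixel scene radiance from bracketed frames.

    `frames` is a list of equally sized flat lists of pixel values, one
    per exposure; `exposure_times` holds the matching times in seconds.
    Each sample is divided by its exposure time (linear response is
    assumed) and the results are averaged with hat_weight().
    """
    merged = []
    for i in range(len(frames[0])):
        num = den = 0.0
        for frame, t in zip(frames, exposure_times):
            w = hat_weight(frame[i], max_value)
            num += w * frame[i] / t
            den += w
        merged.append(num / den)
    return merged

def simulate_capture(radiance, t, max_value=255):
    """Hypothetical ideal sensor: linear response, clipped at max_value."""
    return [min(max_value, round(r * t)) for r in radiance]

# Three pixels spanning a 100:1 radiance range, captured three times.
scene = [10.0, 100.0, 1000.0]
times = [0.01, 0.1, 1.0]
frames = [simulate_capture(scene, t) for t in times]
merged = merge_exposures(frames, times)
```

In this sketch the brightest pixel saturates in the longest exposure and the darkest pixel reads zero in the shortest one, yet the weighted merge recovers values close to the original radiances because each pixel is dominated by the exposure in which it is best exposed.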
It can be readily appreciated how this solution can be quite satisfactorily applied to scenes with static subjects, such as landscapes, while being impractical for capturing faces which are relatively close to the camera and which can move during consecutive image acquisitions, thus causing artefacts when attempting to construct an image of the scene. It should also be noted that it is not possible to acquire such sequences of variably exposed images using rolling shutter techniques.
From “High Dynamic Range Image Sensors,” by Abbas El Gamal, Stanford University, ISSCC'02 (http://cafe.stanford.edu/~abbas/group/papers and pub/isscc02_tutorial.pdf) it is further known to use an HDR CMOS image sensor with spatially varying pixel sensitivity. In particular, an array of neutral density (ND) filters is deposited on the image sensor so that, in a single captured image of a scene, sensor pixels associated with darker filters can be used to acquire bright regions of the scene and sensor pixels associated with lighter filters can be used to acquire dark regions of the scene. However, this document is not concerned with face acquisition and detection across an extensive depth of field within a scene, especially when using active IR illumination.
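The principle of spatially varying pixel sensitivity can be illustrated with a simplified sketch. It is not taken from the cited tutorial: the 2x2 filter tile, its transmittance values, the linear clipped sensor model and the names `simulate_sensor` and `reconstruct_sve` are all assumptions made for illustration, and the reconstruction simply picks the best-exposed pixel in each tile at the cost of spatial resolution.

```python
# Hypothetical repeating 2x2 tile of ND filter transmittances
# (1.0 = lightest filter, 0.125 = darkest filter).
TILE = [[1.0, 0.5],
        [0.25, 0.125]]

def simulate_sensor(radiance, tile=TILE, max_value=255):
    """Hypothetical linear sensor behind the filter array, clipped at
    max_value. `radiance` is a 2D list with one value per pixel."""
    th, tw = len(tile), len(tile[0])
    return [[min(max_value, round(r * tile[y % th][x % tw]))
             for x, r in enumerate(row)]
            for y, row in enumerate(radiance)]

def reconstruct_sve(raw, tile=TILE, max_value=255):
    """Return one radiance estimate per tile, taken from the brightest
    unclipped pixel in the tile and divided by its transmittance."""
    th, tw = len(tile), len(tile[0])
    out = []
    for ty in range(0, len(raw) - th + 1, th):
        row = []
        for tx in range(0, len(raw[0]) - tw + 1, tw):
            best = None  # (pixel value, transmittance) of best sample
            for dy in range(th):
                for dx in range(tw):
                    v = raw[ty + dy][tx + dx]
                    if v >= max_value:
                        continue  # saturated sample carries no detail
                    if best is None or v > best[0]:
                        best = (v, tile[dy][dx])
            if best is None:
                # Every pixel clipped: report a lower bound obtained
                # from the darkest filter in the tile.
                row.append(max_value / min(min(r) for r in tile))
            else:
                row.append(best[0] / best[1])
        out.append(row)
    return out

# One dark tile (radiance 40) next to one bright tile (radiance 800).
scene = [[40.0, 40.0, 800.0, 800.0],
         [40.0, 40.0, 800.0, 800.0]]
raw = simulate_sensor(scene)
estimate = reconstruct_sve(raw)
```

In the bright tile the pixels behind the two lightest filters saturate and the estimate comes from a pixel behind a darker filter, whereas in the dark tile the pixel behind the lightest filter gives the strongest usable signal; both regions are therefore recovered from a single capture.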