Face recognition includes the process of recognizing an individual by comparing the captured face image against one or more stored face images to identify a match. The stored images are usually called the gallery or watch list databases. The captured image or video is usually called the probe.
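As an illustration only, the probe-against-gallery comparison described above can be sketched as a nearest-match search over feature vectors. The feature vectors, the cosine-similarity score, the 0.8 threshold, and the names `identify`, `subject_a`, and `subject_b` are assumptions made for this sketch; the text does not prescribe a particular feature or similarity measure.

```python
import numpy as np

def identify(probe, gallery, threshold=0.8):
    """Return (best_id, score) for the gallery entry most similar to the probe.

    best_id is None when the best score is below the (hypothetical) threshold.
    """
    ids = list(gallery)
    G = np.stack([np.asarray(gallery[i], dtype=float) for i in ids])
    p = np.asarray(probe, dtype=float)
    # Cosine similarity: normalize every vector to unit length, then dot.
    G_unit = G / np.linalg.norm(G, axis=1, keepdims=True)
    p_unit = p / np.linalg.norm(p)
    scores = G_unit @ p_unit
    best = int(np.argmax(scores))
    score = float(scores[best])
    return (ids[best] if score >= threshold else None), score

# Toy gallery of hypothetical feature vectors, one per enrolled subject.
gallery = {
    "subject_a": np.array([1.0, 0.0, 0.0]),
    "subject_b": np.array([0.0, 1.0, 0.0]),
}
match_id, match_score = identify(np.array([0.9, 0.1, 0.0]), gallery)
no_match_id, _ = identify(np.array([1.0, 1.0, 1.0]), gallery)
```

In this toy setting the first probe is closest to `subject_a`, while the second probe is rejected because no gallery entry reaches the threshold.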
In national security or military applications, there is a need for nighttime personnel target identification. At night or in darkness (without illumination), a visible face image (produced using visible light) is of limited use for identification. Thermal face images are found to be more useful at night since the thermal face images can be acquired by thermal sensors without external illumination. The thermal camera measures the heat (temperature) that is emitted (radiated) from the human face. However, most watch list databases contain visible imagery. For a given picture, visible sensors measure the light reflected from the facial surface at a single observation angle. The thermal face images and the visible face images are therefore very different in appearance, yet there is a need for cross-modality face recognition, i.e., thermal-to-visible face recognition. Thermal-to-visible face recognition would include the identification of a person in a thermal image by comparing the person's thermal facial image to many visible facial images in a database or watch list.
Another challenge in facial recognition occurs during adverse conditions, such as nighttime surveillance, when the subject is frequently not captured in a still or frontal position and the images instead exhibit multiple 3-D head rotations (pose angles). Different pose angles and different illumination conditions can result in different images. For example, the face image of a person in a side view is very different from the image of the same person in a frontal view. It is therefore beneficial to account for pose changes in face recognition.
The research on thermal face recognition is summarized in (1) C. S. Martin et al., “Recent advances on face recognition using thermal infrared images,” Reviews, Refinements and New Ideas in Face Recognition, InTech Open Access Publisher, edited by Peter M. Corcoran, Chapter 5, pp. 95-112, July 2011; (2) M. K. Bhowmik et al., “Thermal infrared face recognition—a biometric identification technique for robust security system,” Reviews, Refinements and New Ideas in Face Recognition, InTech Open Access Publisher, edited by Peter M. Corcoran, Chapter 6, pp. 114-138, July 2011; and (3) L. B. Wolff, D. A. Socolinsky, and C. K. Eveland, “Face recognition in the thermal infrared,” Computer Vision Beyond the Visible Spectrum Book, pp. 167-191, Springer London, 2006. Thermal imagery has less texture information than visible imagery. However, thermal imagery is less sensitive to illumination variation and expression changes. It is therefore suited to nighttime personnel identification, which is crucial to national security and military operations. U.S. Pat. No. 7,406,184, to L. B. Wolff, entitled “Method and apparatus for using thermal infrared for face recognition,” Jul. 29, 2008, discloses a method of incorporating the use of thermal face imagery into an end-to-end face recognition system to mitigate the influence of varying ambient illumination on systems using visible imagery. In this method, both visible face images and thermal face images are used to create a face representation template that is matched or compared with a stored database or gallery of face templates. The system requires both visible imagery and thermal imagery of an individual (wanted individual) in the probe. For a nighttime operation, especially one without external illumination, this condition is not easy to meet.
Cross-modality face recognition research, in which the gallery contains images in one modality (e.g., visible) and the probe contains images in another modality (e.g., thermal), generally relies on existing techniques that were designed for visible face recognition (such as using a variation of edge information). U.S. Pat. No. 7,512,255, entitled “Multi-modal face recognition,” discloses a method to identify an individual from visible and infrared images using a sequence of multimodal data (3D from multiple visible light cameras, 2D infrared) and employing an Annotated Face Model (AFM), which is a 3D segmented facial model.
Generally, there are three common approaches to the problem of pose changes in facial recognition using two dimensional data. In the first approach, the training set contains many face images with different poses, which requires multiple datasets. An extension of this type of approach is to train each pose-specific face classifier with multiple examples of that pose. The outputs of these classifiers are then fused to give a face recognizer that can process a wider range of facial poses (for example, see U.S. Pat. No. 7,542,592 to Huang, “Pose-invariant Face Recognition System and Process”). In the second approach, the pose of each input image is estimated and the input image is then normalized to a virtual frontal view pose before it is matched against the gallery (see, for example, U.S. Pub. Pat. App. No. 2010/0284577, “Pose-variant face recognition using multiscale local descriptors,” Nov. 11, 2010).
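For illustration, the fusion of pose-specific classifier outputs mentioned above can be sketched as a weighted sum of per-pose score vectors. The weighted-sum rule, the function name `fuse_scores`, and the toy scores are assumptions for this sketch; they are not taken from the cited patent, which may fuse the outputs differently.

```python
import numpy as np

def fuse_scores(per_pose_scores, pose_weights=None):
    """Fuse match scores from several pose-specific classifiers.

    per_pose_scores: array-like of shape (n_poses, n_identities), one row of
        match scores per pose-specific classifier.
    pose_weights: optional per-classifier weights (e.g., how likely the probe
        is to be in that pose); uniform weights are used when omitted.
    """
    S = np.asarray(per_pose_scores, dtype=float)
    if pose_weights is None:
        w = np.full(S.shape[0], 1.0 / S.shape[0])
    else:
        w = np.asarray(pose_weights, dtype=float)
        w = w / w.sum()  # normalize so the weights sum to one
    return w @ S  # one fused score per identity

# Hypothetical scores from frontal, left-profile, and right-profile classifiers
# for two enrolled identities.
scores = [[0.2, 0.7],
          [0.6, 0.3],
          [0.1, 0.8]]
fused = fuse_scores(scores)
best_identity = int(np.argmax(fused))
profile_only = fuse_scores(scores, pose_weights=[0.0, 1.0, 0.0])
```

Supplying pose weights lets an external pose estimate emphasize the classifier trained for the pose that the probe most likely shows.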
A third approach is to use multiple 2D images to estimate a 3D face model. The 3D geometry information can be used for pose correction (see U.S. Pub. Pat. Application No. 2010/0149177, “Generation of normalized 2D imagery and IS systems via 2D to 3D lifting of multifeatured objects,” Jun. 17, 2010, M. I. Miller).
When using 3D sensor data, a 3D face model can be established, which requires a high-resolution 3D sensor (see U.S. Pub. Pat. Application No. 2006/0120571 A1, “System and method for passive face recognition,” Jun. 8, 2006, P. H. Tu, et al.).
A visible sensor measures the light reflected from the object. An infrared sensor measures the heat emitted from the object. Face images acquired from a visible sensor and from an infrared sensor therefore represent different facial phenomena. In addition, pose rotations of the human head can produce significant changes and distortions in the facial appearance of a person in the face image. Such effects can degrade face recognition performance.
Sensors of different modalities measure different properties of the object. Visible light and thermal infrared occupy different parts of the spectrum. Therefore, visible sensors and thermal sensors capture different physical properties of the human face, and these physical differences translate into different image features.
One example is that edges are not aligned with each other in thermal and visible images. The human visual system and man-made algorithms use edges as the primary information for recognition purposes. However, if edge information is used to match thermal and visible face images, the edges do not match well.
Another example is that when imagery of a person is acquired from different sensors at different times, the images do not have the same 3D orientation angles. This is a very common phenomenon. In a practical situation, the camera might not be looking straight at the face when the image is acquired. If one wants to match face images with different poses, one needs to address the 3D transformation among them first.
There exists a need to identify a personnel target when the probe image and gallery image are in different modalities. There also exists a need to utilize common structures (biometric landmarks) to characterize both the gallery images and the probe images, e.g., visible and thermal images.
It is known to use biometric landmarks such as the eyes and mouth to capture the predominant identifying features of an image. For example, Intel® Perceptual Computing SDK—How to use the Face Detection Software, https://software.intel.com/en-us, Oct. 30, 2012, discloses the Intel® Perceptual Computing SDK, which is a library of pattern detection and recognition algorithm implementations exposed through standardized interfaces. The SDK provides a suite of face analysis algorithms including face location detection, landmark detection, face recognition and face attribute detection. The Intel face detection algorithm locates the rectangle position of a face or multiple faces from an image or a video sequence in real-time capture or playback mode. The detection algorithm locates the 6-point or 7-point landmarks, namely, the outer and inner corners of the eyes, the tip of the nose, and the outer corners of the mouth.
Since the face images do not have the same 3D orientation angles when they are acquired from different sensors and/or at different times, there is a need to develop a new 3D registration method via a single frame for face images having 3D pose angles.
It is difficult for a machine to search facial photos for a match in part because when a person's photo appears in a non-standard position, the orientation of the face is different; i.e., the faces are generally pointed at an angle to the vertical plane of the camera. Because the orientation of the face is different, the coordinates of the facial features do not match those of a pose position in which the subject's head is upright and aligned with the camera (commonly referred to as a mug shot).
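To make the coordinate mismatch concrete, the sketch below rotates a small set of 3D facial landmarks about the vertical axis and projects them onto the image plane. The landmark coordinates (nominally in millimetres), the 30-degree yaw angle, the orthographic projection, and the function names `yaw_rotation` and `project` are all hypothetical choices made for this illustration.

```python
import numpy as np

def yaw_rotation(deg):
    """Rotation matrix for a head turn of `deg` degrees about the vertical (y) axis."""
    t = np.radians(deg)
    return np.array([[np.cos(t), 0.0, np.sin(t)],
                     [0.0, 1.0, 0.0],
                     [-np.sin(t), 0.0, np.cos(t)]])

def project(points_3d, yaw_deg):
    """Rotate 3D landmarks by the yaw angle, then drop z (orthographic projection)."""
    rotated = points_3d @ yaw_rotation(yaw_deg).T
    return rotated[:, :2]

# Hypothetical landmark positions (x, y, z); z points toward the camera.
landmarks = np.array([[-30.0, 20.0, 0.0],   # one eye
                      [30.0, 20.0, 0.0],    # other eye
                      [0.0, 0.0, 25.0],     # nose tip
                      [0.0, -30.0, 5.0]])   # mouth center

frontal = project(landmarks, 0.0)    # upright pose aligned with the camera
turned = project(landmarks, 30.0)    # same face with the head turned 30 degrees
shift = np.linalg.norm(turned - frontal, axis=1)  # per-landmark displacement
```

Every landmark moves by a different amount, and the nose tip moves the most because it lies farthest from the rotation axis, so a direct coordinate comparison between the two views fails unless the 3D transformation between them is accounted for first.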