Humans have a remarkable ability to identify faces rapidly and with seemingly little effort. This ability develops over several years of childhood and allows us to recognize thousands of faces throughout our lifetime. The skill is quite robust, and allows us to correctly identify others despite changes in appearance, such as aging, hairstyle, facial hair, and expression. It is also largely unaffected by face orientation and lighting conditions.
For decades, building an automatic electronic system that duplicates the human capability for face identification has been a fascinating goal for many academic researchers and commercial companies around the world. Past attempts were hampered by a lack of appropriate image acquisition means, of efficient face identification algorithms with the required accuracy, and of computing power sufficient to run these algorithms in real time. To date, existing face identification systems have not been as successful or as widely applied as would be desired.
Fundamentally, the human face is a three-dimensional object, and each face has its own unique three-dimensional geometric profile. Almost all existing face identification systems, however, use only two-dimensional face images as their input. Two-dimensional facial images are inherently vulnerable to changes in lighting conditions and face orientation. Facial recognition techniques based on two-dimensional images are also not robust in dealing with varied facial expressions.
Thus, the limitations of existing two-dimensional face identification techniques include: (1) vulnerability to changes in face orientation (tolerance of less than ±15°); (2) vulnerability to changes in illumination conditions; (3) vulnerability to changes in facial expression; and (4) a need for cooperative subjects, without which the acquired face image may be off-angle. Each of these factors decreases the accuracy of matching an input face against a face database.
These fundamental restrictions prevent current face identification systems from performing face recognition effectively and reliably under field-deployable conditions. As a result, the successful match rate of existing face identification systems in real-world applications is typically low (below 90%).
The typical two-dimensional recognition systems include a suite of software that compares two-dimensional surveillance pictures with a database of two-dimensional facial images and ranks matches between surveillance pictures and database images based on a scoring system. The theory is that the higher the score of two-dimensional image matching, the greater the probability that there is a ‘match’ of the human subject.
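The score-and-rank step described above can be illustrated with a minimal sketch. It assumes that each face image has already been reduced to a numeric feature vector and that cosine similarity serves as the score; neither assumption comes from any particular system, and the names cosine_similarity, rank_matches, and the sample data are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector carries no information to compare
    return dot / (norm_a * norm_b)


def rank_matches(probe, database, top_k=3):
    """Score the probe against every database entry and return the best top_k.

    probe    -- feature vector of the surveillance image (assumed precomputed)
    database -- mapping of subject id to stored feature vector (hypothetical)
    """
    scored = [
        (subject_id, cosine_similarity(probe, features))
        for subject_id, features in database.items()
    ]
    # Higher score means a more probable match, so sort in descending order.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Illustrative data only: three enrolled subjects and one surveillance probe.
database = {
    "subject_a": [0.9, 0.1, 0.3],
    "subject_b": [0.2, 0.8, 0.5],
    "subject_c": [0.85, 0.15, 0.35],
}
probe = [0.88, 0.12, 0.32]

ranking = rank_matches(probe, database)
for subject_id, score in ranking:
    print(f"{subject_id}: {score:.4f}")
```

In this sketch the highest-scoring entry is reported as the most probable match. Any change in pose or lighting that shifts the probe's feature vector also shifts every score, which is why the ranking degrades under the conditions discussed in this section.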
Although such systems use different approaches to sorting faces and narrowing the possible matches, they all rely on being able to match key facial features with baseline images stored in a face image database. Moreover, although such systems can map and identify more than one hundred features on each face, a successful match is highly unlikely when fewer than 20 features match.
Traditional two-dimensional face recognition systems often claim relatively high accuracy rates (in excess of 95%), but these rates are achieved under very controlled conditions. Such accuracy is possible only if both the database and surveillance images are taken from the same straight-on angle and with consistent lighting and facial expression. If the image captured by a surveillance camera is taken at an angle from the side of, above, or below the subject, or if the lighting conditions differ significantly from those of the database pictures, accuracy rates drop dramatically.
These limitations on the orientation and illumination mean that the use of facial recognition must be limited to access control points where a cooperative subject is standing still, facing the camera, and lighting is controlled. Furthermore, the matching program is looking for known suspects. If an individual has not yet been identified as a suspected person or if the existing photos of their face are not straight on or under good lighting conditions, then the probability of finding a match drops significantly.
A series of recent studies carried out by the U.S. Army, the Department of Justice, and the National Institute of Standards and Technology (NIST) suggests that using three-dimensional face shape features in a face identification system could potentially increase matching accuracy and recognition speed. However, the approaches considered in these studies still could not prevent the deterioration of performance under changes in facial orientation, lighting conditions, and facial expression.
The facial images captured by real-world surveillance cameras are usually not in fore-frontal orientation (i.e., straight on) and are usually not captured under evenly illuminated conditions. Most of them have quite large side-view and/or top-view angles, and the light usually comes from the ceiling or from the side, so an evenly illuminated facial image is hard to obtain. Additionally, the expression of the human face varies constantly. Comparing facial images captured at an off-angle and in poor lighting with facial images taken fore-frontally in well-lit conditions (i.e., images in a database) would certainly result in a quite high recognition error rate.
Attempts have been made by researchers to store images of the same subject captured from multiple viewing perspectives. The practical dilemma of this approach is that collecting multiple images of the same subject is a lengthy and costly operation. Furthermore, it is difficult to collect multiple images to cover the possible range of side-view and top-view angles and various lighting conditions.