The present invention relates generally to pattern recognition systems, and more specifically to a machine capable of locating human faces in video scenes with random content and of recognizing the faces that it locates.
Personal access control systems have been implemented using visual recognition for identification of individuals. Visual recognition systems use characteristic portions of the human body for identification purposes. Typical of this type of access control are fingerprint recognition systems and facial feature recognition systems. One such system is described in U.S. Pat. No. 4,109,237, entitled "Apparatus and Method for Identifying individuals through the Retinal Vasculature Patterns", issued Aug. 22, 1978. This system uses a method of scanning the individual's eye with a light source arranged in a selected pattern and detecting that portion of the light source pattern which is reflected from the person's retina, thereby locating each intercept of the light source pattern with a blood vessel. The intercept pattern thus obtained is then compared with stored intercept patterns previously obtained from individuals who are cleared for access. Personal access control systems using visual recognition alone demand an even higher level of technical effort and sophistication than acoustical recognition systems.
The problem with identification systems like the one described above is that they require a cooperative individual who is willing to subject himself to a retinal scan. The task of providing a system which locates human faces in video scenes with random content, and then identifies the faces it locates, is alleviated to some extent by the systems disclosed in the following U.S. Patents, the disclosures of which are specifically incorporated herein by reference:
U.S. Pat. No. 4,449,189 issued to Feix;
U.S. Pat. No. 3,999,006 issued to Takeuchi et al;
U.S. Pat. No. 4,773,024 issued to Faggin et al; and
U.S. Pat. No. 4,625,329 issued to Ishikawa.
Perhaps the most significant of the above-cited references is the Feix patent. The Feix reference discloses a personal access control system which uses speech and face recognition. It relies on both a voice signature and the identification of the facial features of the speaker.
The face recognition system of the above-cited Feix patent resembles earlier personal identification systems in that it requires a cooperative person who participates in his identification through a combination of both speech and face recognition. The voice signature of an interrogated person uttering a key word into a microphone is compared in a pattern matcher with the previously stored voice signature of a known person uttering the same key word to obtain a first similarity score. At the same time, when a key event in the utterance of the key word by the interrogated person occurs, a momentary image of that person's mouth region, onto which a grid pattern has been projected, is optically recorded and compared with the previously stored corresponding momentary image of the same known person to obtain a second similarity score. The two similarity scores are analyzed to verify that the identity of the interrogated person is that of the known person.
A Face Recognition Machine (FRM) was developed at the Air Force Institute of Technology (AFIT) by Robert Russel in 1985. The FRM was based on Cortical Thought Theory (CTT), which proposes a new model of how a human brain processes information. Richard Routh developed and presented CTT as a doctoral dissertation at AFIT in 1985. CTT proposes that information is displayed as a two-dimensional image on the brain. The brain then extracts the essential information (the essence of the image) as a two-dimensional vector, called a "gestalt". The gestalt is the only information that is passed to the higher levels of the brain for processing, according to Mr. Routh in his dissertation entitled "Cortical Thought Theory: A Working Model of the Human Gestalt Mechanism," PhD Dissertation, AFIT/DS/EE/85-1, Air Force Institute of Technology, DTIC Document, July 1985. The FRM reduces facial images to gestalts and then compares each gestalt to a database in an attempt to recognize the face.
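The comparison of a gestalt against a database can be illustrated with a minimal sketch. The text above does not state how the FRM performs the comparison, so the nearest-neighbor rule, the Euclidean distance, and the names `recognize`, `database`, and `max_distance` are all assumptions made for illustration only.

```python
import math


def recognize(gestalt, database, max_distance=None):
    """Return (name, distance) of the stored gestalt closest to `gestalt`.

    gestalt      -- a two-dimensional vector (x, y), as described for CTT
    database     -- hypothetical mapping of person name -> stored gestalt
    max_distance -- hypothetical rejection threshold; if the best match is
                    farther away than this, the face is reported as unknown
    """
    best_name, best_dist = None, math.inf
    for name, stored in database.items():
        # Euclidean distance is an assumed similarity measure.
        dist = math.dist(gestalt, stored)
        if dist < best_dist:
            best_name, best_dist = name, dist
    if max_distance is not None and best_dist > max_distance:
        return None, best_dist
    return best_name, best_dist


# Illustrative data only; these are not measured gestalts.
known_faces = {"alice": (0.0, 0.0), "bob": (3.0, 4.0)}
match, distance = recognize((0.0, 1.0), known_faces)
```

A rejection threshold is included because a scene may contain a face that is not in the database; without one, the sketch would always return the closest stored person.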
In 1986 an AFIT student added automatic face location and windowing algorithms to the FRM to eliminate human influence on the recognition process (Smith, 1986). The face locator was slow and recognition was less accurate because only the internal features of the face (eyes, nose, mouth) were used, but the question this student was trying to answer was, "Can a machine, entirely on its own, determine whether or not a person's face is in a picture and, if so, can it determine to whom the face belongs?" The answer is "yes," and the result of the student's thesis effort became the Autonomous Face Recognition Machine (AFRM), as developed by Edward J. Smith and as documented in his thesis entitled "Development of an Autonomous Face Recognition Machine," Masters Thesis, AFIT/GE ENG/86D-36, Air Force Institute of Technology, DTIC Document, December 1986.
The present invention began after an evaluation of the AFRM location and windowing algorithms, with the goal of improving recognition score and speed. Both the score and the speed were reduced when the autonomous scene analysis (location and windowing) algorithms were added in 1986, although human influence was eliminated. The goal of this effort was to reduce the 5 to 30 minute scene analysis time as much as possible while bringing the recognition score back up to at least what was possible when human influence was allowed.
Improvement of the windowing algorithms should improve overall recognition accuracy. There are several windows on the facial scene that will be tested as possible replacements for the windows that have little effect on the recognition score. Going back to the whole-head approach used in 1985 should also improve accuracy over the internal feature approach now used. The only reason internal features are now used is the inability to separate the edges of the head from a random background. This invention investigates two possible solutions to this problem. The first is to apply an elliptical mask to a scene, centered on the location of the face, with a size proportional to the size of the internal features. This results in a larger area of the face being made available to the recognition algorithm. The second solution is to apply a Moving Target Indicator (MTI) algorithm to a series of input scenes prior to scene analysis. This may allow better detection of the edge of the head.
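The two approaches can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: scenes are taken to be lists of rows of grey levels, the ellipse is axis-aligned, the proportionality constant `scale` is hypothetical, and the MTI step is shown as simple two-frame differencing with a hypothetical `threshold`, since the text does not specify the MTI algorithm.

```python
def elliptical_mask(scene, center, semi_axes):
    """Zero every pixel that lies outside an axis-aligned ellipse.

    scene     -- list of rows of grey levels
    center    -- (row, col) of the located face
    semi_axes -- (vertical, horizontal) radii of the ellipse in pixels
    """
    cr, cc = center
    a, b = semi_axes
    return [
        [
            px if ((r - cr) / a) ** 2 + ((c - cc) / b) ** 2 <= 1.0 else 0
            for c, px in enumerate(row)
        ]
        for r, row in enumerate(scene)
    ]


def mask_from_features(scene, face_center, feature_height, feature_width,
                       scale=1.5):
    """Size the ellipse in proportion to the internal-feature window.

    `scale` is an assumed proportionality constant, not a value from the text.
    """
    semi_axes = (scale * feature_height / 2.0, scale * feature_width / 2.0)
    return elliptical_mask(scene, face_center, semi_axes)


def moving_target_indicator(previous, current, threshold=10):
    """Mark pixels whose grey level changed by more than `threshold`.

    Two-frame differencing is one simple form of MTI, assumed here.
    Returns a binary map: 1 where motion is detected, 0 elsewhere.
    """
    return [
        [1 if abs(cur - prev) > threshold else 0
         for prev, cur in zip(prev_row, cur_row)]
        for prev_row, cur_row in zip(previous, current)
    ]


# Illustrative 5x5 scene of uniform grey level 9, face centered at (2, 2).
scene = [[9] * 5 for _ in range(5)]
masked = elliptical_mask(scene, center=(2, 2), semi_axes=(2, 1))
```

In the masked scene, background pixels outside the ellipse are set to zero, so only the area around the face reaches the recognition stage. In the motion map, a moving head against a static background would appear as a region of ones whose outline approximates the edge of the head.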
Improvement of the location algorithm may speed up the scene analysis; however, the major improvement in speed will be gained by re-hosting the AFRM on a new Micro-VAX computer and its related equipment.
While identification of cooperative individuals who participate in the identification process represents a breakthrough in the art, a need remains to provide an improved system capable of locating human faces in video scenes with random content and then recognizing the faces that it locates. The present invention is intended to satisfy that need.