The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.
“Biometrics” refers to unique physiological and/or behavioral characteristics of a person that can be measured or identified. Example characteristics include height, weight, shape, fingerprints, retina patterns, skin and hair color, and voice patterns. Identification systems that use biometrics are becoming increasingly important security tools. Identification systems that recognize irises, voices or fingerprints have been developed and are in use. These systems provide highly reliable identification, but require special equipment to read the intended biometric (e.g., a fingerprint pad or an eye scanner). Because of the expense of providing special equipment for gathering these types of biometric data, facial recognition systems requiring only a simple video camera for capturing an image of a face have also been developed.
In terms of equipment costs and user-friendliness, facial recognition systems provide many advantages that other biometric identification systems cannot. For example, face recognition does not require direct contact with a user and is achievable from relatively long distances, unlike most other biometric techniques, such as fingerprint and retina pattern recognition. In addition, face recognition may be combined with other image identification methods that use the same input images. For example, height and weight estimation based on comparison to known reference objects within the visual field may use the same image as face recognition, thereby providing more identification data without any extra equipment.
However, facial recognition systems can have large error rates. In order to provide the most reliable and accurate results, current facial recognition systems typically require a person who is to be identified to stand in a certain position with a consistent facial expression, facing a particular direction, in front of a known background and under optimal lighting conditions. Only by eliminating variations in the environment is it possible for facial recognition systems to reliably identify a person. Without these types of constraints in place, the accuracy rate of a facial recognition system is poor, and therefore facial recognition systems in use today are dedicated systems that are only used for recognition purposes under strictly controlled conditions.
Video surveillance is a common security technology that has been used for many years. The equipment (i.e., a video camera) used to set up a video surveillance system is inexpensive and widely available. A video surveillance system operates in a naturalistic environment, however, where conditions are constantly changing. A surveillance system may use multiple cameras in a variety of locations, each camera fixed at a different angle, focusing on variable backgrounds and operating under different lighting conditions. Therefore, images from surveillance systems may be captured from various side-view and/or top-view angles and in widely varying lighting conditions. Additionally, the expression of the human face varies constantly. Comparing facial images captured at an off-angle and in poor lighting with facial images taken at a direct angle in well-lit conditions (i.e., typical images in a reference database) results in a high recognition error rate.
In a controlled environment, such as an entry vestibule with a dedicated facial recognition security camera, the comparison of a target face to a library of authorized faces is a relatively straightforward process. An image of each of the authorized individuals will have been collected using an appropriate pose in a well-lit area. The person requesting entry to the secured facility will be instructed to stand at a certain point relative to the camera, to most closely match the environment in which the images of the authorized people were collected.
For video surveillance systems, however, requiring the target individual to pose is an unrealistic restriction. Most security systems are designed to be unobtrusive, so as not to impede the normal course of business or travel, and would quickly become unusable if each person traveling through an area were required to stop and pose. Furthermore, video surveillance systems frequently use multiple cameras to cover multiple areas and especially multiple entry points to a secure area. Thus, the target image may be obtained under various conditions, and will generally not correspond directly to the pose and orientation of the images in a library of images.
A set of identifying information extracted from one or more images is called a “feature set.” A feature set of a target image allows the target image to be compared with some or all images stored in the surveillance system. The result of a comparison between two images is called a “similarity score.” However, storing feature sets and similarity scores for every image in the surveillance system requires a significant amount of resources, such as memory and CPU cycles. Therefore, many systems 1) store only a subset of all possible feature sets and 2) identify only a subset of all the events that may match a target event. Such systems are referred to as “limited knowledge systems.”
For example, a limited knowledge system may maintain a list of only the top five probable matches of a target event. If a user desired to view the top ten probable matches to the target event, the user would be unable to do so, because the system did not retain the sixth through tenth matches.
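The limitation described above can be illustrated with a minimal sketch. The sketch assumes that a feature set is represented as a list of numbers and that the similarity score is a cosine similarity; neither assumption is stated in this section, and the names `similarity_score` and `LimitedKnowledgeMatcher` are hypothetical and chosen only for illustration.

```python
import heapq
import math


def similarity_score(features_a, features_b):
    """Cosine similarity between two feature sets (an assumed metric)."""
    dot = sum(x * y for x, y in zip(features_a, features_b))
    norm = (math.sqrt(sum(x * x for x in features_a))
            * math.sqrt(sum(y * y for y in features_b)))
    return dot / norm if norm else 0.0


class LimitedKnowledgeMatcher:
    """Retains only the k highest-scoring candidate events (hypothetical)."""

    def __init__(self, target_features, k=5):
        self.target_features = target_features
        self.k = k
        self._heap = []  # min-heap of (score, event_id); lowest score at index 0

    def observe(self, event_id, features):
        """Score a candidate event and keep it only if it ranks in the top k."""
        entry = (similarity_score(self.target_features, features), event_id)
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, entry)
        elif entry > self._heap[0]:
            # Evict the current weakest match; its score is discarded for good.
            heapq.heapreplace(self._heap, entry)

    def top(self, n):
        """Return up to n event ids, best first; never more than k are stored."""
        return [event_id for _, event_id in heapq.nlargest(n, self._heap)]


matcher = LimitedKnowledgeMatcher(target_features=[1.0, 0.0], k=5)
for i in range(12):
    # Candidate events whose feature sets drift away from the target.
    matcher.observe(f"event-{i}", [1.0, float(i)])

top_ten = matcher.top(10)  # only five matches are available
```

In this sketch a request for ten matches can return at most five, because the scores of the evicted events were never stored. Retaining more matches would require more memory and more comparisons, which is the resource trade-off described above.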