The present invention provides apparatus and method for verifying the identity of a person by comparing that person's face with a facial image generated using data stored on an identification card, badge or tag carried by the person. The present invention also provides apparatus and method for verifying the identity of a person by extracting pattern signatures from an image of that person's face and comparing those signatures with data stored on an identification card, tag, or badge carried by the person. The system of the present invention is expected to find use in a wide variety of cases where a person's identity is to be established (e.g. by a retail customer using a credit card, by a traffic officer who needs to verify the identity noted on a driver's license, by an industrial security system that regulates the access by selected personnel into a secure area of a plant or business, or by a customer of an automatic teller machine).
The prior art is replete with verification systems that have attempted to store a digitized, encoded representation of an identifying image on a card, badge, etc. that could be easily carried by a person. A major problem that has been recognized implicitly or explicitly by many prior art inventors is that of securing adequate memory capacity for storing an encoded representation of a person's face on a medium that can be easily carried and that can be read out and displayed or analyzed at the point of identification. Notable among the prior art patents are the following:
U.S. Pat. No. 3,805,238, wherein Rothfjell teaches an identification system in which major features (e.g. the shape of a person's nose in profile) are extracted from an image and stored. The stored features are subsequently retrieved and overlaid on a current image of the person to verify identity. Rothfjell attacked the data capacity problem by using only parts of an image--i.e. a feature set that could be represented by a small number of pixels.
U.S. Pat. No. 4,449,189, wherein Feix et al provide an identification system in which a spoken phrase and a digitized image of the speaker's mouth at the time he or she is uttering the phrase are recorded on a card. When the person's identity is to be later verified, he/she speaks the same words and both the vocal signature and the mouth image are compared with the recorded data. Feix et al address the problem of video data reduction by using only a restricted portion of the available image. The disclosure of Feix et al is herein incorporated by reference.
U.S. Pat. No. 4,712,103, wherein Gotanda teaches, inter alia, storing a digitized facial image in non-volatile ROM on a key, and retrieving that image for comparison with a current image of the person at the time he/she requests access to a secured area. Gotanda describes the use of image compression, by as much as a factor of four, to reduce the amount of data storage capacity needed by the ROM that is located on the key.
U.S. Pat. No. 4,754,487, wherein Newmuis teaches a system for storing a facial image for subsequent display. In order to reduce the amount of data that needs to be stored, Newmuis uses a variable sampling rate. A high spatial frequency is employed in critical portions of the face (Newmuis teaches that a "T"-shaped region encompassing the eyes, nose and mouth is most important). Lower spatial sampling rates are used for other portions of the facial image.
U.S. Pat. No. 4,811,408, wherein Goldman teaches a document identification system that compares selected portions of a photographic image on an identification card with encoded representations of those selected portions and alerts an operator as to points of mis-match. Goldman's use of selected portions of the image serves to reduce the amount of data that needs to be stored.
U.S. Pat. No. 4,858,000, wherein Lu teaches an image recognition system and method for identifying ones of a predetermined set of individuals, each of whom has a digital representation of his or her face stored in a defined memory space. Lu's system also provides means of locating, within a monitored area, an individual face that is to be identified. The disclosure of U.S. Pat. No. 4,858,000 is herein incorporated by reference.
U.S. Pat. No. 4,991,205, wherein Lemelson teaches encoding physical characteristics (one of which is a scrambled video image of a face) on a magnetic stripe of the type that is commonly seen on credit cards. The encoded characteristic is used in identification systems that may be manual (e.g. the recorded image is reconstituted on a CRT so that an operator can compare the person carrying the card with the re-constituted picture) or automatic (e.g. a recorded voice print is compared with a special phrase spoken into a microphone at an access control point). Although Lemelson specifically recites encoding one or more full frames of video on a magnetic medium on a card, and then reading and displaying that video image, the subsequent discussion will show that such a scheme would not work if the magnetic medium were constrained to have the very low total memory capacity of a standard credit card magnetic stripe.
U.S. Pat. No. 4,972,476, wherein Nathan teaches scrambling and encoding a picture of a portion of a person, and storing that representation on an ID card. The reconstituted image is superimposed on a current image of the person so that a clerk or guard can check the degree of correspondence. Nathan's use of a restricted portion of a facial image (e.g. an ear) reduces the amount of data that needs to be stored within the limited capacity of a magnetic stripe on a card.
U.S. Pat. No. 4,975,969, wherein Tal teaches an image recognition system and method in which ratios of facial parameters (which Tal defines as distances between definable points on facial features such as a nose, mouth, eyebrow etc.) are measured from a facial image and are used to characterize the individual. Tal, like Lu in U.S. Pat. No. 4,858,000, uses a binary image to find facial features. The disclosure of Tal is herein incorporated by reference.
U.S. Pat. No. 4,993,068, wherein Piosenka and Chandos teach an automatic personal identification system in which biometric data specific to a person to be identified are carried by that person in an escort memory. An automatic comparison is later made between those stored biometric data and corresponding biometric data collected at the place and time at which the person is to establish his or her identity.
U.S. Pat. No. 4,995,086, wherein Lilley and Ridgeway teach a method of recording a feature set that characterizes a fingerprint on magnetic stripes on an ID card. Rather than store an entire fingerprint image on the card, Lilley et al analyze the fingerprint and store an encoded representation of the degree of correlation between the card bearer's fingerprint and a standard reference fingerprint. A readout machine has a fingerprint sensor and compares the stored data with corresponding data extracted from a current fingerprint.
U.S. Pat. No. 5,031,228, wherein Lu teaches an image recognition system and method for identifying ones of a predetermined set of individuals, each of whom has a digital representation of his or her face stored in a defined memory space. Face identification data for each of the predetermined individuals are also stored in a Universal Face Model block that includes all the individual pattern image or face signatures stored within the individual face library. The disclosure of U.S. Pat. No. 5,031,228 is herein incorporated by reference.
U.S. Pat. No. 5,063,603, wherein Burt teaches an image recognition system using differences in facial features to distinguish one individual from another. Burt's system uniquely identifies individuals whose facial images and selected facial feature images have been learned by the system. Burt's system also "generically recognizes" humans and thus distinguishes between unknown humans and non-human objects by using a generic body shape template. The disclosure of U.S. Pat. No. 5,063,603 is herein incorporated by reference.
U.S. Pat. No. 5,053,608, wherein Senanayake teaches the use of a personal ID card system that has a fingerprint encoded on it. The card also has a special space where the bearer can temporarily leave his fingerprint. A reading machine decodes the encoded fingerprint (e.g. stored on a magnetic stripe) and compares it with the current fingerprint made on the special space in order to verify identity.
U.S. Pat. No. 5,164,992, wherein Turk and Pentland teach the use of an Eigenface methodology for recognizing and identifying members of a television viewing audience. The disclosure of Turk et al is herein incorporated by reference.
Although many inventors have offered approaches to providing an encoded facial image that could be compared, automatically or manually, at some later time to verify that a card-bearer is indeed the properly authorized card-holder, none have succeeded in producing a viable system. Part of the reason for this lies in the severe constraints imposed on the image storage aspect of a system by commercially available read-out apparatus that is widely employed for reading data stored on magnetic stripes on credit cards and the like.
The reading equipment that is used for retrieving data stored on credit cards commonly calls for an operator to manually move the magnetic stripe on the card through a slot that contains a read-out head. The equipment must thus tolerate both a wide range of speeds used by different operators, and variations of speed during a single scan. Because of these constraints imposed by manual scanning, data are conventionally stored on magnetic stripes on credit cards at a very low density. Financial transaction cards that are in widespread use are defined in ISO Standard 7813 and conventionally have three such low density tracks. Track 1 (defined in Standard ISO 7811-4) was developed by the International Air Transport Association and can contain up to 79 alphanumeric characters, using 7 bits per character. Track 2 (also defined in Standard ISO 7811-4) was developed by the American Bankers Association and contains up to 40 numeric characters at 5 bits per character. Track 3 (defined in ISO 7811-5) was developed by the Thrift Industry and contains up to 107 characters at 5 bits per character.
Comparing the data storage space available on a card (25-69 bytes per track and a maximum of 3 tracks per stripe for a total of 160 bytes), with the data generated by digitizing a video frame (on the order of 0.25-1 million bytes) shows the scope of the problem that is to be solved if the credit card format is to be used. Small amounts of image compression, as taught e.g. by Gotanda, overflow the available memory a thousand-fold. Sophisticated data reduction methods, such as those provided by a recent JPEG standard, can reduce an image to about 10,000 to 50,000 bytes, which is still a factor of 60-300 more than is available on the entire card. Moreover, since much of the data storage space on a card is likely to be reserved for other purposes (e.g. Track 1 may store the cardholder's name, and Track 2 may have a personal identification number for use with an automatic teller machine) the amount of space left for image storage is even smaller (continuing the example started above, one would find a total space of 66 bytes on Track 3 available for storage of an image). Thus, the best known methods of image compression would require 180-1000 times more memory space than is available on a single track.
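The capacity figures quoted above can be checked with a short calculation. The following is a minimal sketch that assumes the character counts and bit widths given in the text and drops any partial byte at the end of a track; the names TRACKS and track_bytes are illustrative only.

```python
# Track layouts as quoted in the text: (maximum characters, bits per character).
TRACKS = {
    "Track 1": (79, 7),
    "Track 2": (40, 5),
    "Track 3": (107, 5),
}


def track_bytes(chars: int, bits_per_char: int) -> int:
    """Whole 8-bit bytes that fit in one track (a trailing partial byte is dropped)."""
    return (chars * bits_per_char) // 8


capacity = {name: track_bytes(*spec) for name, spec in TRACKS.items()}
total = sum(capacity.values())

for name, size in capacity.items():
    print(f"{name}: {size} bytes")
print(f"Whole stripe: {total} bytes")

# JPEG-compressed image sizes quoted in the text, compared with the whole stripe.
for image_bytes in (10_000, 50_000):
    ratio = image_bytes / total
    print(f"A {image_bytes} byte image is {ratio:.1f} times the stripe capacity")
```

Under these assumptions the three tracks hold 69, 25 and 66 bytes, which gives the 160 byte total and the 66 byte Track 3 figure used in the text.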
Other storage media, such as an optical memory card configured according to the de facto "DELA" standard (compatible with ANSI Draft Standard X3B10.4), can provide adequate memory capacity to store a facial image or digitized record of some other characteristic (e.g. a fingerprint), and have been considered for use in identity verification systems. Such systems have not been widely successful, partly because of the higher cost of optical cards vs magnetic stripe cards, but mostly because of the lack of a network of readers (which are far more expensive than are the comparable magnetic stripe readers).
Several authors, among them M. Turk and A. Pentland, ("Eigenfaces for Recognition", Journal of Cognitive Neuroscience, vol. 3, no. 1, pp 71-86, 1991) have taught the use of an "Eigenface" approach to face recognition. In this method a standard set of faces (or features) that span the gamut of faces (or features) that are to be encountered is initially defined, and all subsequent facial (feature) images are expressed as a weighted combination of the standard set. Turk and Pentland's Eigenface approach is a specific example of principal component methodologies that seek to express a variable (in this case, an image) as a combination of principal components. A more familiar principal component method is the use of latitude and longitude to specify a location on a map.
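The idea of expressing an image as a weighted combination of a standard set can be illustrated with a few lines of NumPy. This is a minimal sketch, not Turk and Pentland's exact procedure: it assumes random stand-in data in place of real face images, derives the standard set from a singular value decomposition of the centered training set, and uses the hypothetical helper names encode and decode.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 50 "faces" of 16x16 pixels, flattened to vectors.
# A real system would use aligned grayscale face images instead.
n_faces, n_pixels, n_components = 50, 16 * 16, 10
faces = rng.random((n_faces, n_pixels))

# Center the training set and take its principal components via SVD.
mean_face = faces.mean(axis=0)
centered = faces - mean_face
_, _, vt = np.linalg.svd(centered, full_matrices=False)
eigenfaces = vt[:n_components]  # rows are orthonormal basis images


def encode(image: np.ndarray) -> np.ndarray:
    """Weights that express an image in terms of the standard set."""
    return eigenfaces @ (image - mean_face)


def decode(weights: np.ndarray) -> np.ndarray:
    """Approximate image rebuilt as the mean face plus weighted eigenfaces."""
    return mean_face + weights @ eigenfaces


weights = encode(faces[0])
approx = decode(weights)
print("number of weights:", weights.shape[0])
print("reconstruction error:", float(np.linalg.norm(faces[0] - approx)))
```

In this sketch a 256 pixel image is summarized by ten weights. The standard set (mean_face and eigenfaces) is needed only when the image is to be rebuilt or compared.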
The use of principal component (or Eigenvector) mathematical theory in a variety of signal processing and signal reconstruction applications is well-known. Illustrative of the varying applications of this methodology are:
U.S. Pat. No. 5,009,143, wherein Knopp teaches the use of principal component methods to reconstruct preselected musical signals. The disclosure of Knopp is herein incorporated by reference.
U.S. Pat. No. 5,031,155, wherein Hsu teaches the use of both Karhunen-Loeve transformations and Hilbert transformations to form Eigenvector representations of well-logging data. Hsu teaches the use of already established principal component methods to: 1) provide a compressed representation of sonic signals (in terms of eigenvectors and other parameters); 2) process the sonic data (by deleting some eigenvectors and thereafter using a reduced set of eigenvectors to characterize the data) in order to remove noise and measurement artifacts; and 3) reconstruct the wave components by inverse transformations. The disclosure of Hsu is herein incorporated by reference.
U.S. Pat. No. 5,179,598, wherein DiFoggio and Burleigh teach the use of eigenvector methods to determine which portions of an image are of the same color. The disclosure of DiFoggio et al. is herein incorporated by reference.
a personal identity card having an image memory capacity of less than 120 bytes, wherein one may store encoded image parameters that are characteristic of an authorized card-bearer, and that are keyed to a standard reference system,
a card reader in which the identity card is manually moved past a read-out head,
and a verification apparatus that contains a standard reference signature set combinable with the encoded image parameters to provide an image of the card-bearer's face.
Lu et al, in U.S. patent application Ser. No. 07/872,881, teach the use of an "Eigenface" or "Eigenfeature" approach for recognizing persons who shop multiple times at a monitored retail store. The disclosure of Lu et al is herein incorporated by reference. The system of Ser. No. 07/872,881 employs Eigenface analysis as a means of speeding the process of recognizing (rather than verifying) ones of a number of known people. Lu et al did not teach a physical partitioning of computer memory so as to place the standard set in one physical location (e.g. on a magnetic disk drive associated with the recognition computer) and the set of weights that related a predetermined individual's face to faces in the standard set in another location (e.g. on a magnetic stripe on a credit card).