1. Field of the Invention
The present invention relates to a computer-based automated fingerprint classification and identification system.
2. Related Art
Since the turn of the century, fingerprint identification has been the most widely accepted method for positively establishing the identity of an individual. It is used by law enforcement agencies throughout the world to determine or verify the identity of criminals. It is also being used increasingly to verify the identity of individuals who have applied for passports, driver's licenses, or other official documents and to verify that persons applying for certain types of employment or for security clearances do not have a criminal history.
Automated Fingerprint Identification Systems (AFIS) were introduced commercially in the early 1980s. Studies by law enforcement agencies have since demonstrated that such systems dramatically increase the number of fingerprint comparisons a given agency can perform per day. For example, in California, where state law enforcement agencies use one of the world's most advanced AFIS networks to perform searches on a database containing fingerprints for 7.5 million people, state officials reported productivity gains of 300 to 400 percent shortly after the automated system was implemented. (Wilson and Woodard, Automated Fingerprint Identification Systems: Technology and Policy Issues, U.S. Department of Justice NCJ-104342, April 1987, page 12) (incorporated herein by reference).
Studies have also shown that automated fingerprint identification systems are more accurate than manual systems. (Accuracy, as applied to fingerprint searches, refers to the ability of the human or electronic searcher to correctly locate a match for an unknown fingerprint card when a match in fact exists in the file or database.) In two national surveys performed since 1979, manual fingerprint searches were found to be, at best, 74 percent accurate; in contrast, AFIS users typically report accuracy rates of 98 percent or higher. (Wilson and Woodard, page 12)
Despite the proven benefits of AFIS technology, the vast majority of the world's law enforcement agencies continue to perform fingerprint searches manually. Furthermore, those agencies that have AFISs are often unable to fully exploit the potential these systems provide. The reason in both cases is a design characteristic common to all commercially available AFISs: the use of parallel processors (e.g., high-powered supercomputers) to reduce the amount of time required to perform a single fingerprint comparison and thereby increase a system's production capacity. This characteristic drives up the cost of AFIS technology to a level that prevents many law enforcement agencies from automating at all and forces some that do automate to adopt cost-saving practices that reduce the benefits gained by automating.
The concept of using a classification system to limit the number of file fingerprints against which an unknown fingerprint must be compared to find a match is not new. Manual fingerprint classification systems have been used for more than 100 years to achieve precisely this goal. Neither is the concept of using an automated classification system new. The National Institute of Standards and Technology (U.S. Commerce Department, Technology Administration) detailed the economic advantages of combining automated classification components with existing AFIS technology in a 1992 critique of the U.S. Federal Bureau of Investigation's planned AFIS. (McCabe, et al., Research Considerations Regarding FBI-IAFIS Tasks and Requirements, NISTIR 4892, August 1992) (incorporated herein by reference). The report pointed out, however, that although work on fully automated fingerprint classification systems has been underway since the mid 1980s "no viable approaches were ever fully developed." (McCabe, page 12, emphasis added).
In a more recent NIST report, the authors identified four major approaches (structural, syntactic, rule-based, and artificial neural network) that have served as the basis for the automatic fingerprint classification systems reported in the literature to date. (G. T. Candela, R. Chellappa, Comparative Performance of Classification Methods for Fingerprints, NISTIR 5103, April 1993, page 5) (report incorporated herein by reference). According to the authors, all past attempts have focused on automating the Henry Classification System or newer one-tier classification systems which divide fingerprints into five or seven categories.
Conventional AFISs are described in greater detail in the following paragraphs.
A. How Automated Fingerprint Identification Systems Work
1. Comparison to Manual Systems
The automated fingerprint identification system is most easily understood by comparing it to a manual fingerprint identification system. In functional terms, the manual fingerprint system consists of these elements:
(1) A Classification System under which fingerprints are grouped according to some visually distinguishable set of features and filed by category.
(2) A Data Input System used to create the fingerprint cards. Fingerprint cards in a manual system are generally produced manually by inking the fingers and pressing each finger on a card in a prescribed location; however, commercially available inkless fingerprint scanning devices can also be used to produce cards.
(3) An Archive where fingerprint cards are filed in accordance with the classification method used by the organization.
(4) Technicians who visually examine unknown fingerprints submitted for comparison, determine the classification of each unknown fingerprint, compare the unknown fingerprint against file fingerprints in the same classification, and determine if a match exists.

The components of an AFIS, discussed under "What is an AFIS?" below, are these:

(1) Controller Subsystem. The controller is a central computer or central processor which receives commands from the user and directs the activities of other subsystems.
(2) User Interface Subsystem. This encompasses the hardware, firmware and software used to permit the user to enter commands and to see and use data displayed by the system.
(3) Fingerprint Input Subsystem. Fingerprints are input into an AFIS either by direct electronic transfer from a commercially available, inkless fingerprint input terminal or by using a commercially available scanner to scan in a manually produced fingerprint card.
(4) Template Creation (or Encoding) Subsystem. This subsystem selects two fingerprints from each fingerprint card input into the system (usually the left and right thumbs or index fingers) and determines the number and location of minutiae on each of the selected fingerprints. For each selected fingerprint, it then creates and stores a file, called a "template" or "minutiae map," containing the coordinates of the minutiae.
(5) Image Storage Subsystem. This is the "archive" where file fingerprints and fingerprint templates are stored. Because fingerprint files consume a large amount of memory, fingerprint files are stored on optical media in many systems and the templates used for comparison are stored in magnetic memory.
(6) Searcher (Matching Subsystem). This subsystem, which generally takes the form of one or more stand-alone computers equipped with special software, compares the template of an unknown fingerprint against templates in the system's database and generates the matching scores described below.
(7) Component Interface Subsystem. This subsystem, which usually takes the form of a local area or wide area network, provides the means for transferring data among the system components.

The three further levels of subdivision provided by the original Henry Classification System, discussed under "The Henry Classification System" below, are these:

(1) Secondary Classification. At this level, fingerprint cards within the same primary classification are subdivided according to the patterns of the index fingers. Index fingers are classified into one of five pattern types, meaning that each primary classification can be theoretically divided into 25 secondary classifications.
(2) Subsecondary Classification. Fingerprint cards within the same primary and secondary classification are classified according to the patterns (three) of the middle, ring, and index fingers, meaning that each secondary classification can be theoretically divided into 81 subsecondary classifications.
(3) Major Division. Fingerprint cards within the same primary, secondary, and subsecondary classifications are classified according to the patterns (three) of the thumbs, meaning that each subsecondary classification can be theoretically divided into 12 major divisions.

The drawbacks of adding further levels of subdivision to the Henry System, also discussed below, are these:

1. The application of increasingly complex rules adds time to the classification process, driving up labor costs and reducing productivity.
2. The application of increasingly complex rules increases the risk of misclassification which, in turn, increases the incidence of missed identifications.
Using the manual system described above, the process for positively establishing identity revolves around three critical steps: (1) the creation of a fingerprint card for the unidentified person; (2) the classification of each fingerprint and of the fingerprint card itself; and (3) a comparison of the unknown person's fingerprints to all fingerprint cards that are within the same classification as the unknown card. The search is effected by comparing the minutiae points (i.e., the points at which a ridge in the fingerprint pattern ends or at which two ridges meet) of one fingerprint on the unknown card to the minutiae points for the corresponding finger on each of the fingerprint cards from the file. When a match is found, the minutiae points for one or more of the remaining fingerprints on the fingerprint cards are compared to confirm the match. Where there is an exact correlation, or an acceptably high degree of correlation, between the minutiae points of the unknown fingerprints and a set of prints from the file, the identity of the unknown person is positively established.
Because the process of manually comparing fingerprints is time-consuming and therefore expensive, most law enforcement agencies actually begin the fingerprint comparison process with a "name search", meaning the technician first checks the name on the incoming fingerprint card against a master index containing the names and other identity data from all of the fingerprint cards in the file. If the technician finds a name match, he/she compares the incoming card to the fingerprint card on file under the same name. If statistical information from the State of California holds true in other jurisdictions, approximately 47 percent of all criminal fingerprint checks are completed on the basis of a name search alone. (Wilson and Woodard, page 3) When no match is found through a name search, the technician classifies the incoming card and proceeds with a "full search", meaning the unknown card is compared one by one to all cards in the same file classification until a match is found or it is determined that there is no match. In the case of fingerprints taken from crime suspects, the full search generally yields a "hit" rate of 8 percent, meaning that between the name search and the full search, roughly 55 percent of all criminal fingerprint searches result in a match being found in the file. (Wilson and Woodard, page 3)
In contrast, in non-criminal fingerprint comparisons, FBI data indicate that the name search results in a "hit" five percent of the time and the full search in a "hit" only 1.5 percent of the time. (Wilson and Woodard, page 3). Given this low hit rate and the high cost of performing a full search, most law enforcement agencies limit non-criminal fingerprint checks to a name search only. This means, of course, that a small number of people who would be identified as criminals through a full search (i.e., people using aliases) are erroneously identified as non-criminals.
2. What is an AFIS?
An AFIS is a specialized grouping of equipment used to electronically store fingerprints and, by applying pattern recognition techniques, to perform the same comparison of minutiae points that a fingerprint technician performs manually. Functionally, an AFIS encompasses all of the elements of a manual fingerprint identification system, with the exception of the classification system. Specifically, it consists of the seven subsystems listed above: controller, user interface, fingerprint input, template creation (or encoding), image storage, searcher (matching), and component interface.
Using an AFIS, the process for positively establishing identity revolves around three critical steps: (1) scanning the full set of fingerprints from an unknown person into the system, (2) creating the set of templates used in the comparison process, and (3) comparing the set of templates from the unknown person's fingerprints to a similar set of templates taken from every set of fingerprints stored in the database.
After comparing the first unknown template (e.g., the template of the right thumb), to all of the corresponding templates stored on the system (e.g., all right thumb templates), the system assigns a "matching score" which indicates the degree of similarity between the unknown template and each of the file templates: the higher the score, the stronger the likelihood that the unknown template and a file template are from the same fingerprint. The scores are sorted in descending order and the highest scores are generally presented to the system operator in the form of a short "candidate list." When a score is above a certain threshold (defined by the agency using the system), a match is considered to have occurred. When the system finds a match, the operator generally requests that the system display both the unknown template and the matching template and he or she visually confirms the match. The operator also decides, based on agency policies, whether it is necessary to compare the template from the second hand.
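The scoring and candidate-list step described above can be illustrated with a minimal sketch. The score scale, the threshold value, the list length, and the names `Candidate` and `candidate_list` are assumptions made for illustration only; the source does not specify how any particular AFIS computes or presents its scores.

```python
from typing import NamedTuple


class Candidate(NamedTuple):
    record_id: str
    score: float


def candidate_list(scores, threshold, max_candidates=10):
    """Rank file templates by matching score, highest first.

    scores: mapping of file record id -> matching score (hypothetical scale).
    threshold: agency-defined score above which a match is considered found.
    Returns (short candidate list, candidates scoring above the threshold).
    """
    ranked = sorted(
        (Candidate(record_id, score) for record_id, score in scores.items()),
        key=lambda c: c.score,
        reverse=True,
    )
    shortlist = ranked[:max_candidates]
    matches = [c for c in shortlist if c.score > threshold]
    return shortlist, matches


# Hypothetical scores for three file templates compared to one unknown template.
example_scores = {"A": 120.0, "B": 910.0, "C": 455.0}
shortlist, matches = candidate_list(example_scores, threshold=800.0, max_candidates=2)
print([c.record_id for c in shortlist])  # highest-scoring candidates first
print([c.record_id for c in matches])    # candidates the operator would confirm visually
```

In this sketch the operator's visual confirmation remains a separate, manual step, as in the description above.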
While the AFIS is extremely accurate, it has an Achilles Heel (i.e., it has significant drawbacks). As the discussion above has made clear, the AFIS, as currently designed, must compare every set of templates created for an unknown person to the templates for every fingerprint record in the database. Because the automated searcher can only compare about 1,000 templates per second, and because many agencies have millions of fingerprint records in their databases, agencies with large databases must use multiple searchers to simultaneously search segments of the template file. The table below shows the impact of this hardware-intensive method of increasing search capacity assuming a database size of 2 million records.
______________________________________
Number of    Number of Searches that Can    Estimated Cost
Searchers    be Processed per Hour          of Searchers
______________________________________
1            1.8                            $210,000
10           18.0                           $2,100,000
100          180.0                          $21,000,000
______________________________________
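The figures in the table follow from the comparison rate and database size stated above. The sketch below reproduces that arithmetic under two assumptions that the source implies but does not state outright: each search compares one template against every record, and the cost per searcher is the $210,000 shown in the first row. The function names are illustrative.

```python
TEMPLATES_PER_SECOND = 1_000   # comparison rate of one searcher, from the text
DATABASE_RECORDS = 2_000_000   # database size assumed by the table
COST_PER_SEARCHER = 210_000    # dollars; taken from the table's first row


def searches_per_hour(searchers, records=DATABASE_RECORDS, rate=TEMPLATES_PER_SECOND):
    """Searches completed per hour when the file is split evenly across searchers."""
    seconds_per_search = records / (rate * searchers)
    return 3600 / seconds_per_search


def searcher_cost(searchers, unit_cost=COST_PER_SEARCHER):
    """Estimated hardware cost, assuming cost scales linearly with searcher count."""
    return searchers * unit_cost


for n in (1, 10, 100):
    print(n, round(searches_per_hour(n), 1), f"${searcher_cost(n):,}")
```

Because both throughput and cost scale linearly with the number of searchers, the cost per search per hour stays constant, which is the direct correlation between productivity and cost that the text goes on to discuss.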
In practice, the direct correlation between productivity gains and cost manifests itself in two ways. First, despite the fact that automated fingerprint searches are known to be far more accurate than manual searches, only a small percentage of the law enforcement agencies in the U.S. use AFIS technology. Second, cost considerations force many agencies that do automate to settle for a less-than-optimum level of automation. Stated more specifically, due to the high cost of AFIS technology, most agencies that have automated continue to perform only a name search when presented with a request for a non-criminal fingerprint search. This means that criminals using false names continue to go undetected in routine fingerprint searches performed by AFIS-equipped law enforcement agencies.
Given the above, it is clear that what is needed is a method to reduce the amount of time required to perform a fingerprint search on an AFIS without proportionally increasing the number of searchers in the system.
B. Automated Fingerprint Classification Systems: A Less Costly Means of Increasing Search Capacity
For more than a century, law enforcement agencies have been confronted by the need to increase the search capacity of their manual fingerprint systems without increasing the number of searchers (technicians). Until the advent of the AFIS, the universal response to this challenge was to file fingerprints according to some classification scheme and, when seeking a match for an unknown person's fingerprint card, to search only the portion of the file containing fingerprints of the same classification.
It is obvious to those involved in the development of AFIS technology that a classification system can also be used to reduce search time in an AFIS. Described simply, the use of an automated classification system means that a single searcher, or a small number of searchers, could perform tasks that currently require tens or hundreds of searchers. What is not obvious is the specific classification system that should be used. At least one effort to automate the Henry Classification Method has been reported. (Chang, et al., Fingerprint Classification with Model-Based Neural Networks, abstract presented at National Institute of Standards and Technology on Criminal Justice Information systems, Washington, D.C., September 1993) (incorporated herein by reference). Other efforts focusing on the automation of a simple seven-category classification system recently developed by the FBI are also ongoing (McCabe; Candela). Both the Henry classification system and the newer classification systems that have been developed have shortcomings which are described below.
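The way a classification system reduces search time can be sketched as an index keyed by classification code, so that an unknown card is compared only against file records in the same class. This is a generic illustration and not the method of any cited system; the classification codes, record identifiers, and function names below are made up for the example.

```python
from collections import defaultdict


def build_index(file_records):
    """Group file records by classification code.

    file_records: iterable of (record_id, classification_code) pairs.
    The codes are opaque labels here; how they are derived is out of scope.
    """
    index = defaultdict(list)
    for record_id, code in file_records:
        index[code].append(record_id)
    return index


def records_to_compare(index, unknown_code):
    """Return only the file records sharing the unknown card's classification."""
    return list(index.get(unknown_code, []))


# Hypothetical six-record file with made-up classification codes.
FILE = [("r1", "A"), ("r2", "B"), ("r3", "A"), ("r4", "C"), ("r5", "B"), ("r6", "A")]
index = build_index(FILE)
subset = records_to_compare(index, "B")
fraction_searched = len(subset) / len(FILE)
print(subset, fraction_searched)
```

The fraction of the file that must be searched is bounded by the size of the largest class, which is why the discussion that follows concentrates on how evenly each classification scheme divides a file.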
1. The Henry Classification System
Despite the fact that the Henry Classification System is the most widely used fingerprint classification system in the world, experts agree that this classification system is unnecessarily complex and that it is particularly ill-suited as a classification system for large fingerprint files. Its shortcoming in this regard has its roots in the first step of the classification process. In the first step of the Henry classification process, each fingerprint on a 10-fingerprint card is classified and assigned an alpha-numeric code corresponding to one of two primary categories: the whorl or the non-whorl. The fingerprint card is then classified using a designator derived from the code assigned to each finger.
By applying the two primary categories identified above to all ten fingers, a fingerprint file can be theoretically broken into a total of 1,024 distinct classifications (2^10). In practice, however, fingerprints corresponding to some of the 1,024 classifications seldom if ever appear in a file and some classifications are extremely common. In fact, in the U.S., one primary classification in any file organized using the Henry System is likely to contain 25 percent of all of the cards in the file. In all but the smallest fingerprint files, it is therefore necessary to further subdivide the file. In recognition of this fact, the original Henry Classification System provides for the three further levels of subdivision listed earlier: secondary classification, subsecondary classification, and major division.
Various studies have shown that the largest Major Division in a fingerprint file organized using the classification scheme described above is likely to contain approximately 6 percent of the total number of fingerprint cards in the file. While this represents an adequate degree of division in a very small fingerprint file, it is positively inadequate for organizations such as the U.S. FBI or the State of California which have 23 million and 7.5 million cards on file respectively. Such organizations have been forced to add additional levels of subdivision to the original Henry System. The obvious drawbacks of using new subdivisions to remedy the inherent shortcomings of the Henry system are the two listed earlier: the added rules lengthen the classification process, and they increase the risk of misclassification.
While the above discussion focuses on the application of the Henry Classification System in a manual fingerprint system, it is pertinent to automated systems as well. Just as in a manual system, the extensive number of subdivisions required to achieve adequate segmentation of the database would impact the speed of the classification process and the potential for misclassification.
Thus what is needed is a classification system that results in adequate segmentation of the database without extensive use of subdivisions.
2. Seven-Category Classification System Being Automated by the FBI
Recognizing the above, the U.S. FBI is supporting research and development on automated classification systems that use primary categories only and no subcategories. The previously cited Candela report described ongoing attempts to automate a five-category classification system and the previously cited McCabe report describes a seven-category classification system being investigated by the U.S. FBI. Despite the fact that a 7-category classification system, when applied to all ten fingers on a fingerprint card, creates the theoretical possibility for more than 2 billion separate file classifications, the McCabe report stated that the level of segmentation achieved by the seven-category classification system is, in practice, inadequate for large systems such as that of the U.S. FBI. To illustrate, the author pointed out that one category, the ulnar loop, will typically contain 6% of the records in a database. Given the size of the FBI's database and a processing requirement of 225 searches per hour, the report stated that the FBI would need 483 searchers to process the ulnar loop classification alone if the seven-category classification system were used (page 9).
Given the above, one can see that the ideal classification system is one that uses more primary categories than the Henry System and a less complex system of subclassification.
3. The Vucetich Classification System
Such a classification system exists and is a basis for the present invention. Developed by Juan Vucetich in the 1880s and introduced in Argentina, the Vucetich system is used by various Latin American countries and is widely recognized by experts as superior to and simpler than the Henry method.
As Table A shows, the Vucetich classification method begins by assigning each fingerprint on a fingerprint card an alpha-numeric code corresponding to one of four primary categories. By applying four primary categories to all ten fingers (rather than the two applied by Henry), the Vucetich Classification System provides for a theoretical total of 1,048,576 distinct classifications (4^10) (rather than the 1,024 provided by the Henry System). In practice, of course, only a small portion of the classifications that are theoretically possible actually appear in a fingerprint file. The Federal Police of Argentina (PFA), which maintains one of the largest files based on the Vucetich Classification System, reported in 1984 that its 6 to 7-million-card file contains only 3.5% of the classifications that are theoretically possible. (Rosset and Lago, El ABC del Dactiloscopo, Editorial Policial, Policia Federal Argentina, Buenos Aires, Argentina, I.S.B.N. 950-9071-08-0, 1984, page 98) (incorporated herein by reference). The PFA also observed that certain classifications are far more prevalent than others; however, they indicated that the largest primary classification in their files contains about 200,000 records, or approximately 3.5% of the total records in their file--a level of subdivision that the Henry method achieves only after applying four levels of classification and subclassification. By applying a single level of subclassification to the primary Vucetich loop classifications, the PFA reported that most of the resulting subdivisions contain between 20 and 50 cards, while a few contain up to 150 cards, or 0.0025 percent of the cards in the file (page 100).
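The theoretical classification counts quoted for the Henry and Vucetich systems are powers of the number of primary categories, taken over ten fingers. The short sketch below works through that arithmetic and the PFA percentages. The 6-million-card file size is an assumption (the lower end of the range the PFA reported), and the function name is illustrative.

```python
FINGERS = 10


def theoretical_classifications(primary_categories, fingers=FINGERS):
    """Distinct card classifications when each finger takes one of k categories."""
    return primary_categories ** fingers


henry = theoretical_classifications(2)     # two primary categories per finger
vucetich = theoretical_classifications(4)  # four primary categories per finger
print(henry, vucetich)

# The PFA reported that only 3.5% of the possible classifications occur in its file.
observed_classes = round(0.035 * vucetich)
print(observed_classes)

# Share of the file held by a 150-card subdivision, ASSUMING a 6-million-card file.
ASSUMED_FILE_SIZE = 6_000_000
largest_share_percent = 150 / ASSUMED_FILE_SIZE * 100
print(f"{largest_share_percent:.4f}%")
```

Under these assumptions, roughly 36,700 of the theoretically possible Vucetich classifications would actually be populated.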
Table A also demonstrates that increasing the number of primary categories in a fingerprint classification scheme beyond the four categories used by Vucetich does not necessarily result in greater segmentation of the file. Although the seven-category system described by McCabe theoretically results in more than 2 billion separate file classifications, McCabe reported that the largest file classification could still be expected to contain 6 percent of the file's records--a share no smaller than the approximately 3.5 percent reported by the Federal Police of Argentina using Vucetich's four primary classifications.
TABLE A
__________________________________________________________________________
                                  Vucetich          Henry             7-Category
                                  System            System            System (McCabe)
__________________________________________________________________________
Primary Classification Categories
Number of primary categories      4                 2                 7
Theoretical number of resulting   1,048,576         1,024             >2 billion
file subdivisions
Approximate size of largest file  3.5% of           25% of            6% of
subdivision after primary         records in file   records in file   records in file
categories have been applied

Subcategories
Levels of subdivision             one               three             none
Approximate size of largest file  0.0025% of        6% of             not applicable
subdivision after all levels of   records in file   records in file
subdivision have been applied     (in loop
                                  subcategories)
__________________________________________________________________________