Field of Invention
This patent relates to a method of automatically classifying, authenticating, and extracting data from documents of known format. One important type of document is a personal Identification Document (ID) such as a driver's license, but the invention can be applied to many other types of fixed-format documents (e.g., currency, stamps, securities, certificates, permits, invoices, forms). Images of any type of subject that can be grouped into classes based on similar properties will also benefit from this invention. The Pairwise Comparison Nodal Network (PCNN) classification methods described are applicable to most pattern recognition tasks in which objects are assigned to classes. This part of the patent therefore has much broader application than documents alone.
Discussion of Related Art
Until recently, the examination and assessment of IDs was usually carried out by human examiners. With training, many examiners are very good at determining the authenticity of an ID or detecting alterations to it. However, the problem has become significantly more difficult as the number of government-issued IDs alone has grown to more than 2000 active types, plus many more that have simply passed an expiration date.
Document inspectors and security personnel cannot be expected to memorize the detailed features of the thousands of different identity document types. Humans are susceptible to fatigue, boredom, distraction, intimidation, job dissatisfaction, bribery, and blackmail. Time constraints on processing travelers at an airport, customers in a line, patrons in a queue outside a club, or people at other transaction points make it difficult to use reference material and inspection aids such as magnifiers, ultraviolet light sources, and measurement tools effectively. Manual inspection under these conditions is slow, tends to be inaccurate, and limits the level of security that can be achieved.
The motivation for adding machine-readable features to IDs was almost entirely a result of efforts to reduce throughput times. Design standards were developed for international documents such as passports, which led to the addition of machine-readable zones (MRZ) using the OCR-B font on passports and other types of IDs. Many U.S. driver's licenses originally adopted magnetic stripes, but more recently these have been displaced by 2D bar codes (PDF-417 format) under stronger ID security standards influenced by the REAL ID Act. OCR-B, bar code, and magnetic stripe readers became common means of automating the reading of IDs and passports.
However, the ability to read the data from an ID document does not equal the ability to authenticate it. An added complication has come from the very technology used to create the newer, very sophisticated IDs. The cost of the equipment and supplies needed has plummeted, and access to them, along with the knowledge of how to manufacture a reasonable facsimile of a genuine ID, is now as close as the Internet. The demand is so large that, through the Internet or via local entrepreneurs, one can simply order customized fake IDs containing one's own biometrics and whatever personal information one specifies. It has become commonplace for fake IDs to be so good that even trained personnel have difficulty distinguishing real IDs from fake ones.
A class of devices known as ID Reader-Authenticators came about to help address this problem. The current generation of document Reader-Authenticators automatically identifies the ID and examines overt and covert security features in combination with micro-examination of the inherent, and often unintended, details of the issuer's specific production process. As an assistant to a human inspector, these devices overcome human vulnerabilities and can even audit the process for intentional or unintentional human failures. They examine the ID under multiple light sources using many points of authentication. Some manufacturers' devices perform better than others; however, most are expensive and require extensive memory, storage, and processing capability.
Even in situations where these resources are not an issue, current systems usually require a human to train them on the properties used to identify a document class and on the regions of the ID and the measurements to use as authenticators. The high-quality forensic expertise required to train these systems to recognize and analyze a document limits the scalability and dependability of the document classification and the accuracy of the authentic/altered/fake decision. The problem is compounded by the time required for human training, given the variety and complexity of today's high-security IDs. The memory constraints, processing requirements, and training time per feature result in the use of only a few points of comparison, which reduces the number of determinants available for making a decision. New document types also incur a lag time for training and testing. With the automated approach described here, the lag time for training is considerably shortened.
As technology has advanced, new capabilities such as cloud computing, smartphones, and tablets offer the potential for dramatic changes in the way we approach identity verification. Mobile devices with integrated cameras, displays, and respectable processors open the possibility of identity verification at a much lower price point and in many applications that have been cost- and performance-sensitive. Adoption of this technology requires an ID classification and authentication approach that operates faster on lower-performance devices with less memory and storage.
With cloud or enterprise solutions relying on servers for the processing power, other factors come into play. These include network performance, reliability, and vulnerability for real-time processing applications, as well as concern over infrastructure vulnerabilities. There are many applications that can take full advantage of the trend and many for which there is no alternative. However, some are critical and almost totally reliant on network availability, and secure identity verification assumes the risk of broad failure if availability is lost due to acts of nature, infrastructure failure, or deliberate attack.
All applications can benefit from a “fallback” mode. This invention removes most of the current human limitations, provides more thorough and reliable authentication, and makes it faster and simpler to add new document types. The reduced requirement for processing power, memory, and storage enables solid ID authentication performance in a stand-alone mode on many mobile devices. It also enhances performance on PC platforms and enables dedicated network appliances built on either dedicated devices or commercial mobile devices.