1. Technical Field
This invention relates to the field of optical character recognition and, in particular, to enhancing optical character recognition processes through a modular approach including the development of an error detection and correction log in memory and updating a database of rules for correcting erroneous data through real-time learning from source validation or long-term learning from the error detection and correction log.
2. Description of the Relevant Art
Referring to FIG. 1A, there is shown a process flow chart for known optical character recognition systems. The example shown involves receipt of incoming facsimile data, for example, at a facsimile machine or a personal computer. Similar reference characters denote similar steps throughout FIG. 1. Optical Character Recognition (OCR) 105 is employed to translate electronic bit-maps, for example, representing incoming facsimile data received at step 100 into computer characters. The original may comprise machine generated characters or symbols or graphic elements such as picture elements of a photograph or may comprise human hand-written characters, symbols or graphic elements. Bit-maps are simply digital representations of such data appearing in an original document or other tangible medium of expression or comprise digital copies thereof. An electronic bit-map, for example, may result from the electronic scanning of an original for facsimile transmission over telecommunication lines received into a facsimile machine or personal computer or other device at step 100. The density of bits in a contiguous cluster of bits of a bit-map portion may represent an alphabetic character or other symbol that may be computer recognizable.
For example, U.S. Pat. No. 5,542,006 to Shustorovich et al. discloses a neural-network-based OCR apparatus for locating a center position of all desired characters within a field of characters such that the desired characters can be subsequently recognized using an appropriate classification process. Roughly, OCR 105 operates to translate each visual image of the bit-map into a symbol, character, graphic or, generally, a member of a computer's allowed alphabet which may comprise character or graphic symbol data. For the most part, optical character recognition operates very well. However, letters of a computer alphabet that look like numbers, and vice versa (for example, the letter O and the number 0, the letter I and the number 1, or the letter z and the number 2), may be confused during OCR processes. It is an object of the present invention to correctly interpret what is presented in an original document or, in other words, to increase the probability that an OCR device will accurately translate clusters of bits of a bit-map into computer intelligible characters, symbols or graphic elements.
It is known in OCR 105 how to recognize when the processes may be failing for one reason or another, and the OCR process 105 can raise a potential-error flag. The OCR device may not recognize a given bit-map portion at all, and improper recognition of a symbol, character or graphic element should be avoided. An OCR device may compare a recognition result against a given confidence level for character recognition and raise a flag when the confidence level is not met, signaling a potential error. The result of the flag may be an error display or the like such that an operator at step 110 may review the error flag and, perhaps, take remedial action, if the operator believes action is required. The corrected bit-map is then stored in a database (DB) at step 115.
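By way of illustration only, the confidence-level comparison described above may be sketched as follows. The Recognition record, the 0-to-1 confidence scale and the 0.80 threshold are hypothetical and are not taken from any particular OCR device.

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    char: str          # best-guess character for one bit-map cluster
    confidence: float  # recognizer score in [0, 1] (assumed scale)


def flag_potential_errors(results, threshold=0.80):
    """Return the positions whose confidence falls below the threshold."""
    return [i for i, r in enumerate(results) if r.confidence < threshold]


results = [Recognition("1", 0.97), Recognition("O", 0.55), Recognition("7", 0.91)]
flagged = flag_potential_errors(results)  # positions an operator might review
```

In this sketch the flagged positions would be the ones presented to the operator at step 110.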
Data rules for correcting erroneous data are known, for example, as described in the article "The Forms Processing Paradigm Shift," appearing in Imaging Magazine, March, 1995 at pages 84-106. These rules comprise a vast number of business rules that can be based, for example, on form content as described in this article or in U.S. Pat. No. 5,555,101 to Larson or U.S. Pat. No. 5,608,874 to Ogawa et al. Rules may also be based on business/data logic, for example, as described by R. G. Ross, The Business Rules Book, Classifying, Defining and Modeling Rules, Database Research Group, 1997. Ross provides a taxonomy of rules used in the business community.
Referring to FIG. 5, which comprises a sample order form of a make-believe corporation, XYZ Corporation, the taxonomy of some simple rules and types of rules will be illustrated by way of some specific examples from the depicted form. A first type of rule is an instance verifier. These rules require that the proper data fields be populated. For example, if the NAME field 500 on the facsimile form is not filled in, a rule of this sort is failed. As another example, if neither the "Requesting our newest catalog" field 501 nor the "Ordering merchandise" field 502 is checked, a rule of this sort is failed.
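A minimal sketch of the two instance verifier examples above follows. It assumes, purely for illustration, that the recognized form is held in a dictionary; the key names are hypothetical stand-ins for fields 500, 501 and 502.

```python
def instance_verifier(form):
    """Return failure messages; an empty list means every rule passed."""
    failures = []
    # NAME field 500 must be populated.
    if not (form.get("name") or "").strip():
        failures.append("NAME field 500 is empty")
    # At least one of fields 501 and 502 must be checked.
    if not (form.get("request_catalog") or form.get("order_merchandise")):
        failures.append("neither field 501 nor field 502 is checked")
    return failures
```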
A second type of rule is a type verifier. These rules cover a variety of relationships between or among data fields which may or may not be completed. For example, if two boxes are checked for given ranges of annual income 505, then a rule of this sort is failed. Likewise, if either or both of the "Requesting our newest catalog" field 501 and the "Ordering merchandise" field 502 are checked, but no address is provided in address field 510, then a rule of this sort is failed.
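The type verifier examples above may be sketched as follows, again assuming a hypothetical dictionary representation in which the annual income boxes 505 are stored as a mapping from range label to a checked/unchecked flag.

```python
def type_verifier(form):
    """Return failure messages; an empty list means every rule passed."""
    failures = []
    # At most one annual income range 505 may be checked.
    checked = [label for label, on in form.get("income_ranges", {}).items() if on]
    if len(checked) > 1:
        failures.append("more than one annual income range 505 is checked")
    # A catalog request or an order requires an address in field 510.
    needs_address = form.get("request_catalog") or form.get("order_merchandise")
    if needs_address and not (form.get("address") or "").strip():
        failures.append("field 501 or 502 is checked but address field 510 is empty")
    return failures
```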
A third type of rule is a position verifier. This type of rule requires that certain ordering logic inherent in the definition of the data fields be followed. The ordering may be time or alphabetic or other expected ordering. For example, if the "Today's date" field 515 has a value that is chronologically later than the value of the date in the "Date Merchandise Desired" field 520, then a rule of this sort is failed.
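The chronological example above reduces to a single ordering comparison. The sketch below assumes the two date fields have already been parsed into date values; the function name is hypothetical.

```python
from datetime import date


def position_verifier(todays_date: date, date_desired: date) -> bool:
    """Pass (True) unless field 515 is chronologically later than field 520."""
    return todays_date <= date_desired
```

An alphabetic or other ordering could be checked the same way by comparing values of the appropriate type.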
Yet another type of rule is a functional verifier. These rules specify functional relationships among data fields. For example, if the value of the "Catalog ID" field 525 is the same as the value of the "Catalog ID" field 530, then a rule of this sort is failed.
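As one possible reading of the example above, the sketch below fails the rule when the same Catalog ID appears on more than one order line. The list representation of the "Catalog ID" fields and the decision to ignore blank lines are assumptions.

```python
def functional_verifier(catalog_ids):
    """Pass (True) only if no filled-in Catalog ID value is repeated."""
    filled = [c.strip() for c in catalog_ids if c and c.strip()]
    return len(filled) == len(set(filled))
```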
Another type of rule is a comparative evaluator. These rules specify comparative relationships among data fields. For example, if the value of the "Qty" field 535 contains a letter (instead of a number), then a rule of this sort is failed.
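The "Qty" example above may be sketched literally as a check for any alphabetic character in the recognized field value. The function name is hypothetical, and a stricter digits-only test could equally be used.

```python
def qty_has_no_letters(qty: str) -> bool:
    """Pass (True) unless the "Qty" field 535 contains a letter."""
    return not any(ch.isalpha() for ch in qty)
```

For instance, a quantity misread as the digit 1 followed by the letter O would fail this rule.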
A mathematical evaluator is another type of rule. A known or predetermined mathematical relationship must hold for the rule to pass. For example, if the "Subtotal" field 540 has a value that is not equal to the sum of the values of each of the Item Totals, for example, field values 541 and 542, then a rule of this sort fails. Another example is when the Tax field value shown in field 550 is not equal to the product of the rate 551 and the Subtotal 540.
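The two arithmetic checks above may be sketched as follows. Decimal arithmetic is used to avoid binary floating-point rounding; rounding the expected tax to the nearest cent (half up) is an assumption, as are the function and parameter names.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def mathematical_evaluator(item_totals, subtotal, tax_rate, tax):
    """Return failure messages; an empty list means both relationships hold."""
    failures = []
    # Subtotal 540 must equal the sum of the Item Totals (e.g. 541 and 542).
    if sum(item_totals, Decimal("0")) != subtotal:
        failures.append("Subtotal 540 does not equal the sum of the Item Totals")
    # Tax 550 must equal rate 551 times Subtotal 540, rounded to the cent.
    expected_tax = (subtotal * tax_rate).quantize(CENT, rounding=ROUND_HALF_UP)
    if tax != expected_tax:
        failures.append("Tax 550 does not equal rate 551 times Subtotal 540")
    return failures
```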
Naturally, these atomic rules can be combined in an almost unlimited number of ways to create very complicated rules. It is an object of the present invention to apply such a set of rules for correcting erroneous data and combinations thereof in OCR processes.
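One way such combinations might be expressed is with small combinators that build a compound rule out of atomic ones. The sketch below is illustrative only; the combinator names, the atomic rules and the dictionary keys are all hypothetical.

```python
from typing import Callable, Dict

Rule = Callable[[Dict], bool]  # a rule maps a form to pass (True) / fail (False)


def all_of(*rules: Rule) -> Rule:
    return lambda form: all(rule(form) for rule in rules)


def any_of(*rules: Rule) -> Rule:
    return lambda form: any(rule(form) for rule in rules)


def implies(condition: Rule, consequence: Rule) -> Rule:
    # Passes when the condition does not apply, or when the consequence holds.
    return lambda form: (not condition(form)) or consequence(form)


def has_name(form): return bool(form.get("name"))
def wants_catalog(form): return bool(form.get("request_catalog"))
def is_ordering(form): return bool(form.get("order_merchandise"))
def has_address(form): return bool(form.get("address"))


# Compound rule: a name is present, one box is checked, and an address
# is present whenever either box is checked.
purpose = any_of(wants_catalog, is_ordering)
order_rule = all_of(has_name, purpose, implies(purpose, has_address))
```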
It is also known in the art to apply error correction logic, and many errors may be corrected automatically. For example, known word processing programs automatically correct an isolated small letter i appearing in a sentence to a capital I. Similarly, in a context-based system, in fields that must be numeric, the letter O can be changed to the number 0, the letter S can be changed to the number 5 and so on. Of course, in a field that must be alphabetic, numbers may be changed to letters automatically.
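The context-based substitution described above may be sketched with simple translation tables. The confusion pairs are those mentioned in this description (O/0, I/1, S/5, z/2); the field-type labels and the choice to map digits to capital letters are assumptions.

```python
# Look-alike letters to digits, for fields that must be numeric.
TO_DIGIT = str.maketrans({"O": "0", "I": "1", "S": "5", "z": "2", "Z": "2"})
# Look-alike digits to (capital) letters, for fields that must be alphabetic.
TO_LETTER = str.maketrans({"0": "O", "1": "I", "5": "S", "2": "Z"})


def correct_by_context(text: str, field_type: str) -> str:
    """Swap look-alike characters according to the field's required type."""
    if field_type == "numeric":
        return text.translate(TO_DIGIT)
    if field_type == "alphabetic":
        return text.translate(TO_LETTER)
    return text  # mixed or unknown fields are left untouched
```

A field of unknown or mixed type is returned unchanged, since the context gives no basis for choosing between a letter and its look-alike digit.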
There is also a well-established body of logic for correcting errors in street addresses. The logic uses the postal standards as a look-up table in memory to compare street address ranges, city, state and zip or postal code combinations.
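A look-up of this kind may be sketched as follows. The table contents are invented placeholders, not actual postal data, and the key layout (street, city, state, postal code mapped to a range of valid house numbers) is an assumption about how such a table might be organized in memory.

```python
# Hypothetical in-memory table: (street, city, state, zip) -> valid house numbers.
POSTAL_TABLE = {
    ("MAIN ST", "ANYTOWN", "XX", "00000"): range(100, 200),  # numbers 100-199
}


def address_is_valid(number, street, city, state, zip_code, table=POSTAL_TABLE):
    """Pass (True) only if the combination exists and the number is in range."""
    key = (street.strip().upper(), city.strip().upper(),
           state.strip().upper(), zip_code.strip())
    house_numbers = table.get(key)
    return house_numbers is not None and number in house_numbers
```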
In a telecommunications system, it is appropriate, if at all possible, to correct errors at a source by applying source correction routines. If an error occurs at a source, then, due to transmission losses and the like from source to destination, errors can perpetuate and multiply. The output of an OCR device at a final destination likely will not be acceptable. Consequently, it would be appropriate for the final destination to query the original source if possible.
Finally, it is known in some fields to apply learning based algorithms to update a database to reflect current data as data may change over time. It is an object of the present invention to permit OCR devices to learn from verified character recognition and verified context-based recognition of character or symbol strings.
Despite all the known processes and rules described above, there remains an opportunity in the art of optical character recognition to provide enhanced performance.