The invention generally relates to courtesy amount recognition of financial documents. More specifically, the invention concerns a recognition system utilizing multiple images and recognition engines.
Banks, credit unions and other financial institutions regularly process checks, deposit slips, remittance stubs and other types of documents in order to execute financial transactions efficiently. Automated document processing systems have therefore become quite prevalent in the industry. It is common for these document processing systems to generate electronic images of the documents being processed so that computerized user applications can make use of the information contained on the documents. In order for these user applications to make the most use of electronic images, some form of character recognition must typically be performed on the images.
A common example of the above need for character recognition can be found with the standard negotiable check. For example, it is highly desirable to be able to determine by computer the courtesy amount written numerically on the check. This capability would allow an institution to compare a remitted check with a known account balance in an extremely efficient manner. Numerous approaches to courtesy amount recognition (CAR) have been attempted using various types of images, such as JPEG, JPEG Snippets, and CCITT images, and are well known in the art. More complicated recognition engines have even made considerable headway in legal amount recognition (LAR) in conjunction with CAR to further improve recognition rates.
It is important to note that, generally, there are no well-defined standards for all of the world's financial documents. For example, the amount fields themselves have multiple formats, e.g., different leading monetary symbols such as $, *, or no symbol at all. In addition to the absence of standards, some documents have multiple amounts printed on them. Many business checks have preprinted notations on them with amount field-like information in the notations, and deposit tickets will typically have more than one amount field. In general, recognition engines perform two fundamental steps in the recognition process. They must first search the image to find the appropriate amount field; once the field is found, they perform the recognition on it. When a recognition engine indicates it cannot read the image, it is frequently the case that it could not find the field. Due to the absence of standards, typical recognition engines also need to be informed of image location information to assist the recognition process. This location information is typically obtained during the image capture process, and is frequently referred to as the document type.
While the above advancements have been made in CAR, recognition rates have failed to improve beyond certain thresholds. It is highly desirable to maximize read rates and minimize misread rates. The read rate is the percentage of recognition requests that return a successful read, whereas the misread rate is the percentage of read items that contain misreads or substitutions in the string. Thus, the read rate is based on the total number of items submitted for recognition, and the misread rate is based on the total number of read items. For example, in the case of one hundred checks, it is typical that a total of one hundred recognition requests will be made to identify the courtesy amounts. If eighty-five of the read results returned indicate a successful read, then the read rate equals 85 percent. Similarly, if two of the eighty-five are misread, then the misread rate will equal 2 divided by 85, or 2.35 percent. Misreads occur when a character is replaced with an incorrect character, dropped, or added to the actual information on the document. Under conventional CAR approaches, industry read rates have leveled off. Low read rates and high misread rates result in increased manual labor, reduced efficiency, and increased costs. It is therefore desirable to provide a mechanism for CAR which increases read rates and decreases misread rates.
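The arithmetic in the one-hundred-check example can be sketched in Python as follows. The function names `read_rate` and `misread_rate` are illustrative assumptions and do not appear in the original text.

```python
def read_rate(requests: int, successful_reads: int) -> float:
    """Percentage of recognition requests that returned a successful read."""
    if requests <= 0:
        raise ValueError("requests must be positive")
    return 100.0 * successful_reads / requests


def misread_rate(successful_reads: int, misreads: int) -> float:
    """Percentage of read items that contain a misread or substitution."""
    if successful_reads <= 0:
        raise ValueError("successful_reads must be positive")
    return 100.0 * misreads / successful_reads


# Figures from the example: 100 requests, 85 successful reads, 2 misreads.
print(f"read rate: {read_rate(100, 85):.2f}%")        # read rate: 85.00%
print(f"misread rate: {misread_rate(85, 2):.2f}%")    # misread rate: 2.35%
```

Note that the two rates use different denominators: the read rate is taken over all requests, while the misread rate is taken only over the items that were read.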
In a first aspect of the invention, a document processing system includes an image capture system, a multiple engine recognition system, and an application system. The image capture system generates a first electronic image of a first region of a document, where the first electronic image has a first image format. The image capture system generates a second electronic image of a second region of the document, where the second electronic image has a second image format. The multiple engine recognition system transmits the first electronic image to a first recognition engine and a second recognition engine. The first recognition engine generates a first recognition result, and the second recognition engine generates a second recognition result. The recognition system further combines the first recognition result and the second recognition result into a final recognition result. The application system transmits the first electronic image from the image capture system to the recognition system and retrieves the final recognition result from the recognition system. Utilizing multiple recognition engines allows greater customization of the final recognition result, and therefore improves both read rates and misread rates.
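The combining step can be illustrated with a minimal Python sketch. The text above does not specify how the two recognition results are combined, so the agreement-based voting rule, the `confidence` score, the `accept_threshold` parameter, and all names below are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecognitionResult:
    amount: Optional[str]  # recognized amount string; None if the engine rejected the image
    confidence: float      # hypothetical engine confidence score in [0, 1]


REJECT = RecognitionResult(None, 0.0)


def combine(first: RecognitionResult, second: RecognitionResult,
            accept_threshold: float = 0.9) -> RecognitionResult:
    """Combine two engine results into a final result (assumed voting rule)."""
    # Both engines read the field and agree: accept with the higher confidence.
    if first.amount is not None and first.amount == second.amount:
        return RecognitionResult(first.amount, max(first.confidence, second.confidence))
    # Both engines read the field but disagree: reject rather than risk a misread.
    if first.amount is not None and second.amount is not None:
        return REJECT
    # At most one engine read the field: accept it only if it is confident enough.
    single = first if first.amount is not None else second
    if single.amount is not None and single.confidence >= accept_threshold:
        return single
    return REJECT


final = combine(RecognitionResult("125.00", 0.80), RecognitionResult("125.00", 0.70))
print(final.amount)  # 125.00
```

Under this assumed rule, agreement between engines raises the read rate (a result can be accepted even when neither engine alone is highly confident), while disagreement lowers the misread rate by forcing a reject. Other combination policies are equally compatible with the description.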
In a second aspect of the invention, a multiple engine recognition system includes a data storage medium containing a first set of parameter data corresponding to a first recognition engine, and a second set of parameter data corresponding to a second recognition engine. The first recognition engine generates a first recognition result based on a first electronic image and the first set of parameter data. The second recognition engine generates a second recognition result based on the first electronic image, a second electronic image, and the second set of parameter data. The second electronic image is of a second region of the document. The multiple engine recognition system further includes a routing module for routing the first electronic image and the first set of parameter data to the first recognition engine. The routing module further routes the first electronic image, the second electronic image, and the second set of parameter data to the second recognition engine. The routing module also retrieves the first recognition result from the first recognition engine and the second recognition result from the second recognition engine. A runtime module retrieves the first set of parameter data and the second set of parameter data from the storage medium.
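The division of labor between the runtime module and the routing module might look roughly like the Python sketch below. The class names, the dictionary used as the data storage medium, the `leading_symbol` parameter, and the stub engines are all hypothetical stand-ins; real recognition engines and their parameter data are not described at this level of detail in the text.

```python
from typing import Callable, Dict, Optional, Tuple

Image = bytes             # stand-in for an electronic image
Params = Dict[str, str]   # stand-in for a set of parameter data
FirstEngine = Callable[[Image, Params], Optional[str]]
SecondEngine = Callable[[Image, Image, Params], Optional[str]]


class RuntimeModule:
    """Retrieves per-engine parameter data from a storage medium (here, a dict)."""

    def __init__(self, storage: Dict[str, Params]) -> None:
        self._storage = storage

    def parameters_for(self, engine_name: str) -> Params:
        return self._storage[engine_name]


class RoutingModule:
    """Routes images and parameter data to both engines and retrieves their results."""

    def __init__(self, runtime: RuntimeModule,
                 first_engine: FirstEngine, second_engine: SecondEngine) -> None:
        self._runtime = runtime
        self._first_engine = first_engine
        self._second_engine = second_engine

    def route(self, first_image: Image,
              second_image: Image) -> Tuple[Optional[str], Optional[str]]:
        # The first engine receives only the first image and its own parameters.
        first_result = self._first_engine(
            first_image, self._runtime.parameters_for("first"))
        # The second engine receives both images and its own parameters.
        second_result = self._second_engine(
            first_image, second_image, self._runtime.parameters_for("second"))
        return first_result, second_result


# Stub engines: they decode ASCII bytes instead of performing real recognition.
def stub_first_engine(image: Image, params: Params) -> Optional[str]:
    return image.decode("ascii").lstrip(params["leading_symbol"])


def stub_second_engine(image: Image, context: Image, params: Params) -> Optional[str]:
    if not context:  # this stub rejects when the second image is missing
        return None
    return image.decode("ascii").lstrip(params["leading_symbol"])


storage = {"first": {"leading_symbol": "$"}, "second": {"leading_symbol": "*"}}
router = RoutingModule(RuntimeModule(storage), stub_first_engine, stub_second_engine)
results = router.route(b"$125.00", b"full-document-image")
print(results)  # ('125.00', '$125.00')
```

The two stub engines return different strings for the same first image only because they were given different parameter data, which is the point of keeping a separate parameter set per engine.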
In a third aspect of the invention, a computerized method for recognizing information contained in a first region of a document includes the step of storing a first set of parameter data to a data storage medium, where the first set of parameter data corresponds to a first recognition engine. A second set of parameter data is also stored to the data storage medium, where the second set of parameter data corresponds to a second recognition engine. The method further provides for retrieving the first set of parameter data and the second set of parameter data from the storage medium, and routing a first electronic image and the first set of parameter data to the first recognition engine, where the first electronic image represents the first region of the document. The first electronic image, a second electronic image, and the second set of parameter data are routed to the second recognition engine. The method also includes the step of retrieving a first recognition result from the first recognition engine and a second recognition result from the second recognition engine. The first recognition result and the second recognition result are then combined into a final recognition result.