This invention relates to computerized document image processing systems in which data is automatically read from one or more images, and more particularly, to identifying which datum is the most likely source of error in the event that the automatically read data cannot be reconciled.
Document image processing systems have proven themselves as an efficient means for handling the ever-increasing mountain of paperwork confronting businesses today. It is well known that image processing systems optically scan a document and form a digitized image. The document image is then electronically stored and made available to the workers responsible for processing whatever information the document holds.
Two particular benefits are realized when using a document image processing system. First, by electronically processing a document, the shuffling of documents between workers can be accomplished by routing the image to workers' workstations via local area networks. This saves the expense of having to manually transfer documents from worker to worker, thereby reducing the labor necessary to process a document.
The second advantage is that multiple workers can process an image concurrently if the image is made available at each worker's workstation. Concurrent processing of a document would foreseeably increase the overall throughput rate for the document processing staff.
One of the many areas where document image processing has proven successful is in the processing of checks done by banks and other institutions. Generally, early check image processing systems first formed a digitized image by optically scanning the document, then routed the check image to an operator workstation. The operator, working from the check image, provided data entry to the system processing the checks. The overall check processing system then accordingly adjusted the appropriate accounts based on the data entered by the operator.
More recent check image processing systems have improved upon the early systems by providing the capability to automatically read the courtesy amount from a check. The newer systems first locate the courtesy amount on the check image, and then employ character recognition techniques to determine the amount of the check. Thus, for each amount that can be automatically recognized, manual entry of the check amount is unnecessary, and the benefits of a greater throughput rate and a reduction in labor expenses can be realized.
As good as the latest check image processing systems are, problems still arise which require human intervention. In particular, character recognition techniques employed by today's courtesy amount readers are not infallible. Sometimes the character reader is unable to identify a handwritten character; other times the character reader may incorrectly identify a character. Both cases require human intervention: either to enter the data from the courtesy amount shown on the check, or to correct the amount misidentified by the character reader.
In addition to character recognition errors, there may be other sources of errors where a check image processing system processes deposit transactions. A deposit transaction usually includes a deposit ticket enumerating the checks to be deposited along with a total of the amounts indicated on the checks, and the checks to be deposited. The checks and deposit ticket are typically referred to as "items" in the transaction. Where the deposit amount is different from the total of the checks recognized by the system, the transaction is said to be out-of-balance. The out-of-balance condition may be caused by an addition error made by the depositor, a check which is missing from the transaction, or a misrecognized amount. If the deposit includes a large number of checks, identifying the source of the error may prove especially difficult and time consuming.
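The out-of-balance condition described above can be illustrated with a minimal sketch. The function names and the use of integer cents are assumptions made for illustration only; they are not drawn from any of the systems discussed.

```python
def imbalance_cents(deposit_total_cents, check_amounts_cents):
    """Return the deposit ticket total minus the sum of the recognized check amounts."""
    return deposit_total_cents - sum(check_amounts_cents)


def is_out_of_balance(deposit_total_cents, check_amounts_cents):
    """A transaction is out-of-balance when the two totals differ."""
    return imbalance_cents(deposit_total_cents, check_amounts_cents) != 0


# Hypothetical deposit: the ticket lists $150.00, the checks were read as $100.00 and $40.00.
deposit_ticket = 15000
recognized_checks = [10000, 4000]
difference = imbalance_cents(deposit_ticket, recognized_checks)
```

Note that the difference alone does not reveal whether the cause is an addition error by the depositor, a missing check, or a misrecognized amount.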
For out-of-balance transactions, there are at least three possible courses of action. First, the system could force an operator to manually rekey the courtesy amount for each check image; second, the system could attempt to automatically balance the transaction; or third, the system could present all the information involved in the transaction to a balancing operator for further analysis.
Where an election is made to have an operator rekey the courtesy amount for each check, each check image is presented to the operator and the courtesy amount for that check is manually entered. The system then checks to see if the amount rekeyed brings the transaction into balance. If the transaction balances with the new amount, then further manual entry of data for the remaining check images is unnecessary.
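The rekeying procedure just described might be sketched as follows. The function `rekey_until_balanced` and its `rekey` callback, which stands in for the operator's manual entry, are hypothetical names introduced for illustration; amounts are assumed to be held in integer cents.

```python
def rekey_until_balanced(deposit_total_cents, recognized_cents, rekey):
    """Rekey check amounts one at a time, stopping as soon as the transaction balances.

    rekey(index) is assumed to return the operator-entered amount for the check
    at that position. Returns (amounts, number_rekeyed, balanced).
    """
    amounts = list(recognized_cents)
    rekeyed = 0
    for index in range(len(amounts)):
        amounts[index] = rekey(index)  # operator enters the amount for this image
        rekeyed += 1
        if sum(amounts) == deposit_total_cents:
            # Balanced: the remaining check images need not be rekeyed.
            return amounts, rekeyed, True
    return amounts, rekeyed, False


# Hypothetical transaction in which the second check was misread.
actual = [1000, 2500, 700]
recognized = [1000, 2900, 700]
result = rekey_until_balanced(4200, recognized, lambda index: actual[index])
```

In this sketch the cost of balancing grows with the position of the erroneous item, which is why the order of presentation matters.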
If the system has an automatic balancing feature, the entire transaction may be presented to an automatic balancer in an attempt to balance the transaction without operator intervention. The automatic balancer may employ expert system techniques and be driven by information associated with the transaction.
The third way to balance the transaction is to present all the information surrounding a transaction to a balancing operator in hopes that the balancing operator will be able to recognize the check or deposit amount which is causing the unbalanced transaction. The check images, the deposit ticket, the recognized check amounts, and the calculated debit and credit totals are available for viewing by the balancing operator.
It will be recognized that the major difficulty in balancing an out-of-balance transaction is identifying the source of the error. This may prove especially difficult if there are a large number of checks involved in the deposit transaction. Thus, it is desirable to identify for the operator those checks which are the most likely source of the error in hopes that the item in error can be identified more quickly and the transaction processing completed sooner. The following patents discuss various ways in which the foregoing problem has been addressed.
U.S. Pat. No. 5,040,226, entitled "Courtesy Amount Read and Transaction Balancing System", awarded to Elischer et al., discloses one technique for assisting in identification of the source of error in an out-of-balance transaction. The numeric fields in the courtesy amount "are subjected to character recognition analysis, and a confidence level is associated with each such numeric field reflecting the degree of confidence with which the apparatus has recognized the numeric dollar amounts." An overall confidence level is calculated for the courtesy amount based on the respective confidence levels for each numeric field in the courtesy amount. '226 suggests multiplying the respective numeric confidence levels to obtain the overall confidence level. Once the overall confidence level is calculated for the courtesy amount on each check image, courtesy amounts can be presented to a balancer in the order of lowest to highest overall confidence level. The system of '226 discloses determining an overall confidence level for a courtesy amount based on the confidence of each individual numeric character recognized by the character reader and using this as a basis for balancing the transaction.
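The multiplication and ordering attributed to '226 above can be sketched in a few lines. This is an illustrative reading of the description, not the implementation of '226; the function names and the dictionary of per-digit confidence levels are assumptions.

```python
from math import prod


def overall_confidence(digit_confidences):
    """Multiply the per-digit confidence levels to obtain an overall level."""
    return prod(digit_confidences)


def presentation_order(items):
    """Order item names from lowest to highest overall confidence level.

    items maps a hypothetical item name to its list of per-digit confidences.
    """
    return sorted(items, key=lambda name: overall_confidence(items[name]))


# Hypothetical recognition results for three checks.
checks = {
    "check_a": [0.9, 0.9],
    "check_b": [0.5, 0.9],
    "check_c": [0.99],
}
order = presentation_order(checks)
```

Because every factor is at most 1, the product can only shrink as digits are added, so longer amounts tend toward lower overall confidence levels under this scheme.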
U.S. Pat. No. 5,040,227, entitled "Image Balancing System and Method" and awarded to Lyke et al., discusses a method which "allows a balancing clerk or operator to efficiently find and correct various errors in deposits which are out-of-balance." In contrast to '226, '227 discusses highlighting documents having a probability of error meeting a "certain threshold." Similar to '226, '227 bases its probability calculation for each document on the level of certainty with which the character recognition apparatus was able to identify each character making up the courtesy amount.
Both '226 and '227 calculate the overall confidence level for a figure read from a document image based only upon the individual confidence levels for each of the characters recognized by the automatic reader. While this method is certainly more useful than giving no indication as to the item which may be suspect, in many instances the overall confidence level for a particular item may not be a reliable indicator that the item is the source of the out-of-balance condition.
For instance, if a first item in a transaction has three digits, each of which is recognized with a respective confidence level of 0.9, 0.7, and 0.9, the overall confidence level as computed by the method discussed above would be (0.9*0.7*0.9)=0.567. If a second item in the transaction has six digits, each of which is recognized with the respective confidence levels 0.9, 0.9, 0.9, 0.9, 0.9, and 0.9, the overall confidence level for the second item would be (0.9*0.9*0.9*0.9*0.9*0.9)=0.531441. Since 0.531441 is less than 0.567, the second item would be presented to the balancing operator before the first item, according to '226. Clearly, this is not the most desirable order because the first item has a digit with a lower confidence level (the digit with a 0.7 confidence level) than any of the digits in the second item. It would be more desirable to present the first item to the balancing operator first in light of the digit with the lowest confidence level.
Using '227's approach, the confidence level for each item would have to fall below a predetermined threshold before the item would be displayed to the balancing operator first. If the chosen threshold falls between 0.531441 and 0.567, say 0.55, the item with the confidence level of 0.567 would not be highlighted as a suspect item, even though it has a digit that was recognized with a lesser confidence level than any of those digits in the second item.
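The two-item example can be checked numerically. The sketch below is illustrative only: the variable names are assumptions, and the 0.55 threshold is the value chosen in the example rather than a figure taken from '227.

```python
from math import prod

# Per-digit confidence levels from the example above.
first_item = [0.9, 0.7, 0.9]
second_item = [0.9] * 6

overall_first = prod(first_item)    # approximately 0.567
overall_second = prod(second_item)  # approximately 0.531

# Ordering by overall confidence, lowest first, puts the second item ahead.
overall = {"first": overall_first, "second": overall_second}
by_overall = sorted(overall, key=overall.get)

# Highlighting by threshold flags only the second item.
threshold = 0.55
highlighted = [name for name, level in overall.items() if level < threshold]

# Yet the least certain single digit belongs to the first item.
weakest_digit = {"first": min(first_item), "second": min(second_item)}
item_with_weakest_digit = min(weakest_digit, key=weakest_digit.get)
```

Both product-based approaches single out the second item, while the digit recognized with the least certainty sits in the first.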
In following either the approach suggested in '226 or '227, the result is less than optimal. There may be many more instances where items which are not the source of the out-of-balance condition are presented to the balancing operator before the item which is actually the source of the error. This may be more common in transactions which include hundreds of items. Therefore, the methods suggested above may not be useful in increasing the overall throughput of the check image processing system.
An alternative approach to presenting transaction items to a balancing operator is briefly discussed in U.S. Pat. No. 5,120,944, entitled, "Image-Based Document Processing System Providing Enhanced Workstation Balancing," and awarded to Kern et al. '944 discusses making a determination as to whether there are suspect items that could be causing the out-of-balance condition, such as character transpositions, shifted digits, missing digits, etc. After the suspect items are identified, the suspect items are presented to the balancing operator ahead of the non-suspect items. The balancing operator can then focus on the suspect items first in order to determine which items may be causing the out-of-balance condition.
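By way of illustration only, one kind of suspect item named above, the character transposition, could be screened for as sketched below. The source does not describe how '944 identifies suspect items, so this test is an assumption: an item is treated as suspect if swapping two adjacent digits of its recognized amount would account for the entire imbalance. The function name and the use of integer cents are likewise hypothetical.

```python
def transposition_suspects(deposit_total_cents, amounts_cents):
    """Return indices of items where one adjacent-digit swap would balance the transaction."""
    difference = deposit_total_cents - sum(amounts_cents)
    suspects = []
    if difference == 0:
        return suspects  # already balanced, nothing to flag
    for index, amount in enumerate(amounts_cents):
        digits = str(amount)
        for i in range(len(digits) - 1):
            # Swap the digits at positions i and i + 1.
            swapped = digits[:i] + digits[i + 1] + digits[i] + digits[i + 2:]
            if int(swapped) - amount == difference:
                suspects.append(index)
                break
    return suspects


# Hypothetical case: a check for 12.34 was read as 13.24; the ticket total is 62.34.
suspects = transposition_suspects(6234, [1324, 5000])
```

With many items, several amounts may satisfy such a test at once, so the list of suspects can itself grow long.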
While the approach suggested by '944, similar to '226 and '227, is certainly better than not providing a balancing operator with any indication as to which item in the transaction may be the cause of the out-of-balance condition, '944's approach is likely to possess drawbacks which are similar to the limitations identified for '226 and '227. In particular, if a transaction includes a large number of items, there is a possibility that there will be multiple suspect items according to the criteria suggested by '944. If the number of suspect items becomes too large, the advantage of presenting the suspect items first soon evaporates due to the fact that each suspect item must be examined in turn to identify the item in error.
The present invention provides a method which is better suited to quickly identify the source of an error in an out-of-balance transaction so that the balancing task can be completed more efficiently.