The processing and handling of mailpieces consumes an enormous amount of human and financial resources, particularly if the processing of the mailpieces is done manually. The processing and handling of mailpieces not only takes place at the Postal Service, but also occurs at each and every business or other site where communication via the mail delivery system is utilized. That is, various pieces of mail generated by a plurality of departments and individuals within a company need to be addressed, collected, sorted and franked as part of the outgoing mail process. Additionally, incoming mail needs to be collected and sorted efficiently to ensure that it gets to the addressee (i.e., employee or department) in a minimal amount of time. Since much of the documentation and information being conveyed through the mail system is critical to the success of a business, it is imperative that the processing and handling of both the incoming and outgoing mailpieces be done efficiently and reliably so as not to negatively impact the functioning of the business.
It is known to use Optical Character Recognition (OCR) scanning for name and address matching on a mail processing apparatus. Such scanning is used to verify that a valid address is visible on the envelope and that the scanned name and address match an expected value, and to verify mailpiece completion by comparing the OCR read information to an addressee database in order to determine the appropriate destination points for delivery of the mailpieces. However, OCR scanning and processing of mailpieces is generally error prone due to missing characters, substitutions, untrained characters and poor imaging, among other causes.
Missing character errors are mainly caused by a failure to separate characters. Extra character errors can be caused by improper separation, speckles in white space, untrained characters, or a change in font. Substitution errors can be the result of untrained characters (usually substituted with ‘?’), poorly trained characters, or a change in font. Any of these processing errors can also be caused by poor image conditions such as poor focus, lighting, or contrast, or bright reflective spots in the image. In addition, OCR processing systems rarely process spaces, whereas names and addresses always contain spaces. Combinations of missing character, extra character, and character substitution errors result in many falsely mismatched mailpieces. As a consequence, all of the above factors have made mailpiece verification of names and/or addresses using OCR methods impractical.
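The three error types described above can be illustrated with a short sketch. It aligns an expected string against an OCR result using a standard Levenshtein edit-distance table and counts how many characters are missing, extra, or substituted. The function name `classify_ocr_errors` and the sample strings are hypothetical and chosen only for illustration; this is not a method disclosed in the prior art discussed here.

```python
def classify_ocr_errors(expected: str, scanned: str) -> dict:
    """Count missing, extra, and substituted characters in an OCR result.

    Illustrative only: uses a plain Levenshtein alignment, which is an
    assumption of this sketch and not taken from the source text.
    """
    m, n = len(expected), len(scanned)
    # dist[i][j] = minimum number of edits turning expected[:i] into scanned[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if expected[i - 1] == scanned[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # character missing from the scan
                dist[i][j - 1] + 1,         # extra character in the scan
                dist[i - 1][j - 1] + cost,  # match or substitution
            )

    # Walk back through the table to attribute each edit to an error type.
    counts = {"missing": 0, "extra": 0, "substituted": 0}
    i, j = m, n
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            cost = 0 if expected[i - 1] == scanned[j - 1] else 1
            if dist[i][j] == dist[i - 1][j - 1] + cost:
                counts["substituted"] += cost
                i, j = i - 1, j - 1
                continue
        if i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["missing"] += 1
            i -= 1
        else:
            counts["extra"] += 1
            j -= 1
    return counts


print(classify_ocr_errors("SMITH", "SM?TH"))   # one substituted character
print(classify_ocr_errors("SMITH", "SMTH"))    # one missing character
print(classify_ocr_errors("SMITH", "SMIITH"))  # one extra character
```

Any one of these single-character errors is enough to defeat a comparison that requires the two strings to be identical.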
A prior art system for guessing at the intended recipient of an unidentifiable mailpiece is disclosed in U.S. Pat. No. 6,796,433, assigned to the same assignee as the present invention. In this prior art system, the processing attempts to identify the intended recipient of rejected mailpieces by matching some identified portion of the OCR read addressee information from specific fields, such as the addressee name field or the addressee location field, to corresponding information in a company specific keyword database containing information relating to the fields contained in the addressee database. If the field information matches, the recipient identified through the matched field information is concluded to be the intended recipient.
Other known prior art name and address matching systems and methods typically declare a match only if the OCR character string exactly matches the expected character string. The majority of failures to declare a match can be traced to space differences between characters and to OCR processing errors. These systems and methods are unsatisfactory because the scanned character string and the expected character string must match exactly for a match to be declared. The comparison process is also time consuming and slows down throughput on the mailpiece processing apparatus. Further, minor errors that have no consequence on the validity of the match between the scanned character string and the expected character string result in rejection of the mailpiece.
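The limitation of exact matching can be seen in a minimal sketch. The helper names `exact_match` and `strip_spaces` and the sample name are hypothetical; the space-removal step is shown only to make the cause of the failure visible and is not attributed to any of the prior art systems.

```python
def exact_match(scanned: str, expected: str) -> bool:
    """Declare a match only when the two strings are identical."""
    return scanned == expected


def strip_spaces(text: str) -> str:
    """Remove all whitespace so that spacing differences are ignored."""
    return "".join(text.split())


expected = "JOHN SMITH"
scanned = "JOHNSMITH"  # the OCR result dropped the space between the names

# The exact comparison rejects the mailpiece because of the space alone.
print(exact_match(scanned, expected))  # False

# With whitespace removed from both strings, the same comparison succeeds.
print(exact_match(strip_spaces(scanned), strip_spaces(expected)))  # True
```

Ignoring spaces addresses only one class of failure; a single substituted character would still cause the exact comparison to reject the mailpiece.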
To improve OCR address matching, processing results need to be more accurate and consistent. Since the fonts that are used on most mailpieces are not ideal for OCR scanning, some errors must be expected. Higher resolution cameras, better image conditions (lighting, focus, contrast, etc.), and advanced OCR algorithms running on more powerful computers would all improve the OCR scan results. However, these are not cost effective options to significantly reduce OCR processing errors.
Accordingly, it would be desirable to provide a method of verifying that an intended addressee on a mailpiece matches a valid addressee in a mail processing apparatus using an OCR system, where the method overcomes the disadvantages of known prior art matching systems and methods, improves accuracy, and speeds throughput of mailpiece processing equipment.
It would also be desirable for a user to set a confidence level to conclude that an OCR scanned intended addressee having minor errors is that of a valid addressee.
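One way to picture a user-set confidence level is as a threshold on a similarity score between the scanned and expected strings. The sketch below is an assumption made purely for illustration: it uses the similarity ratio from Python's standard `difflib.SequenceMatcher` as a stand-in score, and the function name `matches_with_confidence` and the threshold values are hypothetical. It does not represent the scoring of any particular system.

```python
from difflib import SequenceMatcher


def matches_with_confidence(scanned: str, expected: str, confidence: float) -> bool:
    """Accept the scan when its similarity to the expected string
    meets or exceeds the user-chosen confidence level (0.0 to 1.0)."""
    # autojunk=False keeps the score independent of difflib's popularity heuristic.
    score = SequenceMatcher(None, scanned, expected, autojunk=False).ratio()
    return score >= confidence


expected = "JOHN SMITH"
scanned = "J0HN SMITH"  # the letter O was read as the digit 0

# A moderate confidence level tolerates the single substitution.
print(matches_with_confidence(scanned, expected, 0.85))  # True

# A stricter confidence level rejects the same scan.
print(matches_with_confidence(scanned, expected, 0.95))  # False
```

Raising the threshold reduces the risk of accepting a wrong addressee at the cost of rejecting more mailpieces with harmless OCR errors, and lowering it does the opposite.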