1. Field of the Invention
The present invention relates to an arbitration device to arbitrate mail processing results from mail processing equipment and to select a best zip code for a given mail piece. More particularly, the invention relates to an arbitration device which selects a particular arbitrated result that minimizes the cost of correctly delivering the mail piece and is automatically tuned by the end user.
2. Background of the Related Art
The United States Postal Service (USPS) employs a large amount and variety of automated mail processing equipment to handle parcels, magazines, and letters. A typical piece of mail processing equipment consists of a feeder to feed the mail past a camera, a camera to scan the image, a main computer, a printer to spray a bar code, and a sorter to separate the mail into stackers. The main computer transmits the image to some number of recognition modules (“RM”) and directory match modules (“DMM”), which assist the equipment, and the same computer receives the final zip code result.
The main computer sends the scanned image of the front of a mail piece to the recognition module. This image is a pattern of pixels, either binary or gray-scale. The recognition module analyzes the image and generates a set of characters that represents its best attempt at identifying the location of the address on the mail piece and converting the pixels at that location to ASCII characters. In this way, the recognition module is similar to the Optical Character Recognition (OCR) devices used with desktop scanners to convert scanned documents into ASCII or Word files.
The directory match module receives the set of ASCII characters (“character data”) from the recognition module and attempts to find a match for the character data in a database of addresses maintained by the USPS. If the directory match module is successful in this matching attempt, it will return a valid 5, 9, or 11-digit zip code to the main computer to be sprayed on the mail piece. It may also return a 0-digit zip code, which indicates that the DMM has rejected the mail piece because it cannot match the character data to the USPS database. DMMs also have inherent error rates; they sometimes match information from the envelope to the wrong information in the USPS database, due to poor OCR or bad information on the envelope.
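The possible DMM outcomes described above can be illustrated with a minimal sketch. The function name `classify_dmm_result` and the representation of a 0-digit zip code as an empty string are assumptions made for illustration only; they are not taken from any actual DMM interface.

```python
def classify_dmm_result(zip_code: str) -> str:
    """Classify a DMM result by its number of digits.

    Assumption (hypothetical): a 0-digit zip code, i.e. a rejection,
    is represented as an empty string; hyphens are ignored.
    """
    digits = zip_code.replace("-", "")
    if digits == "":
        return "rejected"  # DMM could not match the character data
    if not digits.isdigit() or len(digits) not in (5, 9, 11):
        raise ValueError(f"not a valid 0, 5, 9, or 11-digit zip code: {zip_code!r}")
    return f"{len(digits)}-digit"


print(classify_dmm_result("20500"))       # 5-digit result
print(classify_dmm_result("20500-0003"))  # 9-digit result
print(classify_dmm_result(""))            # rejected mail piece
```

Note that a well-formed result is not necessarily a correct one: the sketch checks only the form of the returned zip code, not whether the DMM matched the right address.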
As shown in FIG. 1, the USPS employs an arbitration technique to increase the amount of mail that can be successfully processed by the mail processing equipment. In general, arbitration works by sending the same image and/or set of recognized characters to multiple, parallel recognition modules and directory match modules. Each recognition module and directory match module combination conducts an independent analysis of the mail piece and provides a zip code result. Because the different recognition modules and directory match modules use different algorithms, the results may conflict. Thus, there may be two or more different zip code results for the same mail piece. One, several, or all of these zip codes may be in error.
The goal of arbitration is to select the best result from the results offered by the various recognition and directory match modules, such that the overall encode rate and error rate for mail, on average, are better than if using only the results from one DMM. In turn, a higher encode rate and a lower error rate mean a lower cost of delivery. For example, assume vendor A offers a recognition and lookup system that codes 50% of the mail with a 4% error rate, and is unable to code the other 50%. Assume vendor B offers a recognition and lookup system that codes the other 50% with a 2% error rate, but not the 50% coded by vendor A. If an arbitrator selects all available answers from vendor A, and all available answers from vendor B, then the overall encode rate is 100% with a 3% error rate (the average of 4% and 2%, weighted by each vendor's equal share of the mail). Thus, the error rate of a particular DMM directly affects the cost of delivery.
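The arithmetic of the vendor A and vendor B example can be checked with a short sketch. The function `combined_rates` is a hypothetical helper introduced only for illustration, and it assumes that each vendor codes a disjoint share of the mail, as in the example above.

```python
def combined_rates(vendors):
    """Return (encode_rate, error_rate) for vendors coding disjoint mail shares.

    vendors: list of (share_of_mail_coded, error_rate_on_that_share) pairs.
    Assumption (from the example): no mail piece is coded by two vendors.
    """
    encode_rate = sum(share for share, _ in vendors)
    erroneous = sum(share * err for share, err in vendors)
    # Error rate is measured over the coded mail only.
    error_rate = erroneous / encode_rate if encode_rate else 0.0
    return encode_rate, error_rate


# Vendor A: codes 50% of the mail at a 4% error rate.
# Vendor B: codes the other 50% at a 2% error rate.
encode_rate, error_rate = combined_rates([(0.50, 0.04), (0.50, 0.02)])
print(f"encode rate: {encode_rate:.0%}, error rate: {error_rate:.1%}")
```

Using vendor A alone would give a 50% encode rate at a 4% error rate, so combining the two vendors improves both figures in this example.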
An arbitrator program is invoked to select the best result or to reject the mail piece. In the embodiment of FIG. 1, the arbitrator selects a zip code result from amongst the various zip code results, zip code result #1 through zip code result #n.
Current arbitration techniques have certain limitations. The automation equipment is typically manufactured by different contractors, and information about the automation equipment and its output (such as its error rate) is not readily shared between the contractors. Accordingly, one limitation of current arbitrators is that the manufacturers of the arbitrators do not have access to all of the information related to the automation equipment that would assist the arbitrator in making its determination. And, though the end user of the arbitrator (i.e., the USPS) is a “trusted user”, meaning it has access to all of the proprietary error rate information, it is not well suited to incorporating that information into the arbitrator, due to a lack of specific technical knowledge of the arbitrator or to data rights restrictions placed on the arbitrator by the arbitrator manufacturer (typically a commercial firm). Consequently, the arbitrator may be incompatible with, or may not properly interpret information from, the DMMs.
The manufacturer of the arbitrator, if a commercial firm, is limited in its ability to maximize the efficiency of the arbitrator. The cost-of-delivery information about existing directory match modules may be proprietary. Thus, existing arbitrators cannot make use of cost information about existing directory match modules. For example, if vendor A has a 10% error rate in its directory match module, vendor A ordinarily will not reveal that information to the vendor that manufactures the arbitrator, because the arbitrator vendor is often a direct commercial competitor of vendor A. However, the arbitrator vendor would need that information to tune its own arbitrator. This inability to share error rate information is a shortcoming of the existing arbitrator approaches, and is the specific problem that the present invention is designed to correct.