The invention concerns a method and a device according to the preambles to the independent claims and can be used with particular advantage for determining delivery data on mail items, written with non-alphabetical characters.
Systems for the automatic reading of delivery data, particularly addresses (OCR), are well known in the field of letter processing and are described, for example, in the German Patent 195 31 392. Processing rates of 10 letters per second, meaning 36 000 letters per hour and more, can be achieved with modern OCR letter sorting machines. However, the recognition reliability varies strongly, depending on the type of writing and the overall quality of the address information affixed to the surface of the letters. If recognition is successful, the respective letter can be provided with a machine-readable bar code. This bar code permits further mechanical processing down to any desired sorting order. In particular, the use of bar codes permits sorting of the letters down to the level of the mail route, for which the letters are sorted in the sequence in which the mailman delivers them.
Economic trends and an increasing postal volume in the Asian region have led to increased efforts in the automatic recognition of eastern characters in order to limit costs and to improve the mail service. In the process, the recognition systems must meet new requirements, relative to the situation in western countries where postal automation is already an established technology. These requirements stem from the fact that in most countries of the Asian region, Chinese characters are used for local postal addresses. In contrast to the letters of western alphabetical languages, Chinese characters are configured as ideograms. Each of these ideograms can represent a word. Instead of an alphabet containing thirty to sixty letters, 3,000 to 6,000 different Chinese characters are used daily, each with its own characteristic form. The practically open-ended nature of the Chinese character system and the ideographic structure of the individual characters make OCR systems less effective than they are for Western alphabetical systems of writing. In addition, problems result from the fact that addresses on postal items are oriented in either the vertical or the horizontal direction and that a mixture of Chinese and western characters is frequently used.
Since the recognition rates for automatic reading systems in general vary greatly for western as well as Chinese characters, these systems must be supported by using various forms of manual intervention. A manual sorting operation is the simplest type of intervention for rejected letters that cannot be read automatically. However, the resulting costs are uneconomically high as a result of increasing labor costs. Added to this is the fact that such manually sorted mail items cannot be sorted further mechanically without problems, not even at a later point in time, so that two separate mail item flows are generated, which must then be combined again manually at a specific point in time.
To avoid these disadvantages associated with the manual sorting of OCR-rejected mail items, various methods for a manual coding of postal goods have been developed. All of these methods make use of operator intervention to affix bar codes to the mail items in a manner that is consistent with the requirement of carrying out a mechanical sorting with the same machines that process the OCR-read and bar-coded postal goods.
Another method for coding rejected postal goods uses so-called manual coding stations. At these manually operated coding stations, the mail items are physically presented one after another to an operator. The operator then encodes each of these mail items with as much data as needed to clearly identify the delivery destination. In the process, the input address is converted to a sorting bar code with the aid of a directory and this bar code is affixed to the mail item. The coded mail items are then processed further with the aid of bar code sorters (BCS), which are mechanically identical to OCR-compatible BCS. The US Postal Service and the Royal Mail first introduced manual coding stations of this type in the 1970s. The main disadvantages of such devices are that mail items must be removed from the OCR flow of mail items and that the operator experiences ergonomic difficulties when reading mail items that are transported past.
The next improvement in the processing of OCR-rejected mail items was the development of on-line video coding systems (OVS). In an OVS, a video image of the mail item and not the physical item itself is presented to the operator at the manual coding station for the coding. The video image is shown to the operator while the physical mail item is held in a delay loop. Normally, the mail item is kept moving in these delay loops for a period of time that is sufficient for the OVS operator to input the necessary sorting data for the respective image. The standard delay loops permit a delay of between 10 and 30 seconds. The longer the delay loop, the higher the costs and requirements for maintenance and physical size of the facility.
The main problem when using the OVS is that the available time is only sufficient for a careful input of the zip code (ZIP) or the postal code (PC), unless impractically long delay loops are used.
As long as a ZIP or PC exists, OVS can also be used effectively for mail items having Chinese characters in the address. However, the share of such mail items is very low in many eastern countries and will remain low in the foreseeable future. For that reason, special coding methods were developed to keep the on-line delay time as low as possible.
In the prior art, various methods have been developed to increase the coding productivity and/or make it possible to capture all address elements, meaning ZIP/PC, street/post office box, addressee/post office box, addressee/firm. Essentially these include the following:
Preview Coding
A simultaneous display of the images of two mail items, one above the other, is used for the preview coding. The lower image here is the active one, meaning the one for which data are encoded. Following a suitable training, operators can encode the information on the lower image while simultaneously recognizing visually the address information on the upper image. Subsequently, the upper image becomes active and the process is continued. The preview coding makes it possible to double the operator productivity through a complete overlapping of the cognitive and motor functions during the coding of successive images.
Extraction Coding
Because only the ZIP/PC address elements can be input reliably by the operator within the practically achievable on-line delay times, the extraction coding has the operator input only certain key parts of the address components that refer to the street. The extraction coding is normally based on specially developed rules, in which a code of fixed length is used as access key to an address directory. For example, the Royal Mail uses an extraction formula based on the first three and the last two letters. The operator must memorize special rules for this in order to avoid superfluous address information and to take into account certain differentiating characteristics, e.g. directions such as east, west or categories such as street, lane, and road.
The extraction coding has several significant disadvantages, even though it is effective to a degree. In particular, it has complex extraction rules that frequently require taking into account the end of a street name. However, these endings are normally the least legible components on hand-written mail items. In addition, there is a high rate of ambiguous extractions, for which the extraction code corresponds to several entries in a directory, so that it is impossible to make an unambiguous sorting decision. Furthermore, it must be taken into account that the input productivity of the operators is reduced as soon as the operator must make decisions instead of performing a simple, repetitive keyboard entry.
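The fixed-length key and the ambiguity problem described above can be illustrated with a minimal sketch. It assumes only the "first three and last two letters" formula mentioned in the text; the handling of short names, the omission of the direction and category rules, the function names and the street names are hypothetical and are not taken from any actual postal rule set.

```python
from collections import defaultdict


def extraction_key(street_name: str) -> str:
    """Build a key from the first three and last two letters of a name."""
    letters = [c for c in street_name.upper() if c.isalpha()]
    if len(letters) < 5:
        # Assumption: names shorter than the key length are used whole.
        return "".join(letters)
    return "".join(letters[:3] + letters[-2:])


def build_index(street_names):
    """Map each extraction key to every directory entry that produces it."""
    index = defaultdict(list)
    for name in street_names:
        index[extraction_key(name)].append(name)
    return index


def lookup(index, key):
    """Return the single matching entry, or None if missing or ambiguous."""
    matches = index.get(key, [])
    return matches[0] if len(matches) == 1 else None


# Hypothetical directory: two names collapse onto the same key.
index = build_index(["Marston", "Marlington", "Church"])
print(extraction_key("Marston"), extraction_key("Marlington"))
print(lookup(index, extraction_key("Church")))
print(lookup(index, extraction_key("Marston")))
```

In this sketch the two street names beginning with "Mar" and ending in "on" share one key, so the lookup cannot return a unique entry, which corresponds to the ambiguous extractions mentioned above.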
Completion Coding
In contrast to the extraction coding, a variable input for each address to be encoded is made with the completion coding. During the address input, the address is essentially compared to an address directory until an unambiguous match is found. By displaying the rest of the address, an acceleration effect is achieved as soon as an unambiguous partial match has been identified. However, problems occur with this technique in that the operator must be supplied with an input-stop signal and the remainder of the identified address must be displayed, which leads to reduced input productivity and makes a preview coding impossible.
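The matching principle of the completion coding can be sketched as follows. The sketch assumes a simple prefix comparison against a small in-memory directory; the directory entries and function names are hypothetical, and a real system would additionally have to signal the operator to stop typing, as noted above.

```python
def complete(directory, typed):
    """Return the only entry starting with `typed`, or None otherwise."""
    prefix = typed.upper()
    matches = [entry for entry in directory if entry.upper().startswith(prefix)]
    return matches[0] if len(matches) == 1 else None


def keystrokes_needed(directory, address):
    """Count the characters to type before `address` is the unique match."""
    for n in range(1, len(address) + 1):
        if complete(directory, address[:n]) == address:
            return n
    return None  # never becomes unambiguous, e.g. not in the directory


# Hypothetical address directory.
directory = ["BAKER STREET", "BAKEWELL ROAD", "BARTON LANE"]
print(keystrokes_needed(directory, "BAKER STREET"))
print(keystrokes_needed(directory, "BARTON LANE"))
```

The number of keystrokes varies per address, which is the variable input length that distinguishes this technique from the fixed-length extraction code.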
Theoretically, all described video coding techniques can also be used for mail items with Chinese characters, even though they remain only marginally usable due to a lack of quick input techniques for Chinese characters.
Operator-assisted OCR Technique
The US Post Office has experimented with operator-assisted OCR techniques to increase the address information to be processed on-line. In order to increase the effectiveness, this technique emphasizes that part of the address image for which the OCR recognition has failed. Since the operators are slow when deciphering missing letters and complex recognition errors also occur at times, e.g. problems during the segmenting, the operator productivity with this method is frequently lower than for a simple re-entry of the respective address.
Off-line Coding
An off-line coding system, as described in U.S. Pat. No. 4,992,649, was recently introduced because sufficiently high productivity cannot be achieved with pure on-line coding using any of the above-mentioned coding techniques. With this system, mail items with unrecognized addresses are provided with additional information, a tracking identification (TID). The unrecognized mail items are stored externally while the images of these mail items are presented to operators for coding, without any time limitation. The mail items are subsequently presented to TID readers, and the TID is linked with the input address information. Based on this, standard bar code sorting information can also be affixed to the mail item, so that the respective mail item can be processed in the same way as standard OCR-read mail items. Even though off-line video coding is an effective method for coding all address components, it still requires additional capacity for the continued processing of mail items with unread addresses and correspondingly complex logistics.
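The linking of a TID with the address information entered later can be pictured with a minimal bookkeeping sketch. The class, its method names, the integer TID format and the example sorting code are all assumptions made for illustration; they are not taken from the cited patent.

```python
class OfflineCodingStore:
    """Hypothetical bookkeeping for off-line coded mail items."""

    def __init__(self):
        self._next_tid = 1
        self._pending = {}  # TID -> image reference awaiting operator coding
        self._coded = {}    # TID -> sorting code entered by an operator

    def register_rejected_item(self, image_ref):
        """Assign a TID to an item whose address the OCR could not read."""
        tid = self._next_tid
        self._next_tid += 1
        self._pending[tid] = image_ref
        return tid

    def record_coding(self, tid, sort_code):
        """Store the operator's result; no time limit applies to this step."""
        self._pending.pop(tid)  # raises KeyError for an unknown TID
        self._coded[tid] = sort_code

    def resolve(self, tid):
        """Called when a TID reader sees the item again; None if not coded."""
        return self._coded.get(tid)


store = OfflineCodingStore()
tid = store.register_rejected_item("image-0001")
store.record_coding(tid, "12345-01")
print(store.resolve(tid))
```

The sketch also makes the logistic burden visible: every rejected item must be held until its TID has a coded result and must then pass a reader a second time.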
The operator-assisted OCR techniques are also basically suitable for processing mail items with Chinese characters, but do not yet permit a quick input of such characters.
This unsatisfactory situation is further aggravated by the fact that the operator must meet relatively high demands with respect to the necessary training and required knowledge.
To be sure, a quick input of the delivery data can take place with video coding and the use of voice input units. However, the time-related problems during the coding are only shifted to the phase for selecting delivery data candidates.
The invention, specified in the independent claims 1 and 9, is based on the problem of a quick encoding of delivery data in the form of addresses affixed to mail items, particularly those written by hand with non-alphabetical characters, by using voice-input techniques that require few selections by the personnel. Filtering the candidates obtained by means of voice input with the associated, incomplete or ambiguous results of the OCR evaluation permits a quick, automatic selection of the right candidate, without entrusting the operator with this process. This can be used with particular advantage when encoding hand-written addresses with Chinese characters.
Advantageous embodiments of the invention follow from the dependent claims. According to claim 2, a keyboard input of the numerical components of the displayed delivery data is favorable because it reduces the evaluation effort. According to claim 3, it is advantageous for the selection of candidates from the voice input if the number of characters determined in the OCR evaluation is compared to the number of characters from the voice recognition and those candidates are sorted out for which the number of characters deviates from the number determined in the OCR evaluation by more than a statistically determined limit value. According to claim 4, the segmenting result can be used for this.
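The character-count comparison of claim 3 can be sketched in a few lines. The function name, the default limit value of 1 and the candidate street names are assumptions chosen for illustration; the text only states that the limit value is determined statistically.

```python
def filter_by_char_count(candidates, ocr_char_count, max_deviation=1):
    """Keep voice candidates whose length is close to the OCR character count.

    `max_deviation` stands in for the statistically determined limit value;
    its default here is an arbitrary assumption.
    """
    return [
        cand for cand in candidates
        if abs(len(cand) - ocr_char_count) <= max_deviation
    ]


# Hypothetical candidates returned by the voice recognition.
candidates = ["中山", "中山路", "中山北路"]
print(filter_by_char_count(candidates, ocr_char_count=3, max_deviation=0))
print(filter_by_char_count(candidates, ocr_char_count=4, max_deviation=1))
```

Because only a count is compared, this filter can work from the segmenting result alone, even when the OCR could not identify the individual characters.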
In another advantageous selection process according to claim 5, the characters determined in the OCR evaluation, each provided with a reliability value, are compared position by position with the characters of the voice recognition candidates, and the most reliable candidate whose reliability lies above a limit value is selected. According to claim 6, the following process steps are preferably carried out successively to determine a street name:
Checking the existence in an address data file/street directory;
Selecting candidates by comparing their number of characters with the number of characters determined in the OCR evaluation; and
Position-related comparison of the characters for the candidates and the characters from the OCR evaluation.
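The three successive steps listed above can be sketched as one selection function. The scoring rule used in the third step (sum of the OCR reliabilities at matching positions, divided by the longer of the two lengths), the threshold values, the data layout and all names are assumptions; the text specifies only that a position-related comparison with reliabilities and a limit value is used.

```python
def select_street(candidates, directory, ocr_chars,
                  max_len_deviation=1, min_score=0.5):
    """Pick one voice candidate using directory, length and position checks.

    `ocr_chars` is a list of (character, reliability) pairs, one per
    segmented position; an unreadable position can carry reliability 0.0.
    """
    # Step 1: keep only candidates that exist in the street directory.
    known = [c for c in candidates if c in directory]

    # Step 2: keep candidates whose length is close to the OCR count.
    n = len(ocr_chars)
    sized = [c for c in known if abs(len(c) - n) <= max_len_deviation]

    # Step 3: position-by-position comparison weighted by OCR reliability.
    def score(cand):
        hits = sum(rel for (ch, rel), cc in zip(ocr_chars, cand) if ch == cc)
        return hits / max(len(cand), n, 1)

    best = max(sized, key=score, default=None)
    if best is None or score(best) < min_score:
        return None  # no candidate is reliable enough
    return best


# Hypothetical data: the first two candidates sound alike when spoken.
directory = {"中山路", "钟山路", "中山北路"}
voice_candidates = ["中山路", "钟山路", "众山路"]
ocr_chars = [("钟", 0.9), ("山", 0.8), ("?", 0.0)]
print(select_street(voice_candidates, directory, ocr_chars))
```

In this example the voice input alone cannot separate the two like-sounding names, while the incomplete OCR result alone cannot supply the third character; combining both yields a single candidate without a selection by the operator.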
If the numerical components of the delivery information are input via keyboard, it is advantageous according to claims 7 and 8 to search for these numbers by using an OCR unit for numbers. These numbers are then used to detect the address lines, their orientation and the position of the name field in the address line.