The present invention relates generally to automatic identification of an address field on a document.
In automated mail inserting, it is common to process documents that are to be mailed in accordance with markings on the documents. Conventionally, such markings include optical marks which have predetermined meanings, such as indications of stations from which enclosures have to be added selectively, or an indication that the sheet is the last or the first one of a set of sheets to be gathered. Automated mail inserting systems typically include an optical mark recognition (OMR) station for reading such optical markings. Other types of markings, such as barcodes, can also be used for this purpose. Such barcodes can also be used for tracking and tracing purposes.
It is, however, preferred to avoid the need for markings that are included specifically for automatic mail piece preparation, since such markings disturb the visual appearance, distract from the contents of the document and require space to be kept free from other markings, which reduces the freedom of graphic design. Moreover, generating and applying such marks can be a complicated and costly operation.
With the advent of modern scanning techniques, it has become viable to scan the documents to be processed (which may be single sheets or sets of sheets) and to use markings in the contents that are not specifically included for mail preparation purposes to identify the documents and/or the preparation steps that have to take place. A part of a commercial mail document that is particularly suitable as a basis for determining processing steps and for tracking and tracing is the address to which the document is to be sent, since the addresses of a series of documents of a mailing are typically at generally identical locations and are each unique to the respective document.
Acquiring address information from each document requires that the addressee information be read from the documents. One known approach is to use special marks on the documents to identify the area of interest in which the addressee information is present, but this requires special applications to generate the marks, and/or the marks occupy a portion of the document and detract from an ideal personalized presentation. Other techniques for locating the address include heuristic algorithms based on standard templates, and neural network technologies. However, automatically identifying an address in the entire contents of a document would be a complex operation, which is an important disadvantage in the field of automatic mail inserting systems, which operate at high speed and in which available computing power is limited by cost constraints.