The invention relates to a method for capturing a complete data set of forms provided with graphic characters, wherein the form layout contains several separate data fields whose spatial position within the form layout is identical for all forms, having the following steps:
Producing an image of the form and saving the image data of the individual data fields.
Based on the image data of the data fields and with the aid of a character recognition program, identifying the graphic characters contained in the data fields inasmuch as they are identifiable with a predetermined degree of certainty.
Determining the unidentified data fields, i.e., those data fields of the form whose graphic characters could not be identified at all or could not be identified with the predetermined degree of certainty.
Transferring information in regard to the data fields to an external evaluation station, preferably by means of a global data net.
In the external evaluation station, identifying completely the graphic characters of the unidentified data field based on the information in regard to the data fields.
Transferring the graphic character identifications carried out in the evaluation station for further use.
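The steps listed above can be illustrated by a minimal sketch in Python. It is an illustration under stated assumptions and not the claimed implementation: the names FieldResult, split_by_certainty and merge_external are hypothetical, the character recognition program is assumed to report a confidence value between 0 and 1 for each data field, and the transfer to and from the external evaluation station is reduced to a plain dictionary of returned identifications.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldResult:
    """Recognition result for one data field (hypothetical structure)."""
    field_id: str
    text: Optional[str]   # None if no character could be identified at all
    confidence: float     # assumed range 0.0 .. 1.0, reported by the recognizer


def split_by_certainty(results, threshold):
    """Separate identified data fields from unidentified ones.

    A field counts as identified only if text was produced and its
    confidence reaches the predetermined degree of certainty.
    """
    identified, unidentified = [], []
    for r in results:
        if r.text is not None and r.confidence >= threshold:
            identified.append(r)
        else:
            unidentified.append(r)
    return identified, unidentified


def merge_external(identified, external_texts):
    """Combine local identifications with those returned by the
    external evaluation station (field_id -> text)."""
    complete = {r.field_id: r.text for r in identified}
    complete.update(external_texts)
    return complete


results = [
    FieldResult("name", "MILLER", 0.97),
    FieldResult("drug", "ASPRIN", 0.55),   # read, but below the threshold
    FieldResult("dose", None, 0.0),        # not identified at all
]
identified, unidentified = split_by_certainty(results, threshold=0.9)
```

In this sketch, only the fields in unidentified would be passed on for external evaluation, and merge_external would assemble the complete data set once the evaluation station has returned its identifications.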
When processing documents and, in particular, forms, a handwritten word or graphic character often has to be translated into machine-readable form. For this purpose, the document in question is converted by means of a scanner into an electronic image in the form of image data. Suitable image recognition software then attempts to translate this image data into computer characters in order to determine, based on the computer characters, the contents of the written words or graphic characters. The reliability of the capture of graphic characters depends greatly on the quality of the writing as well as on the image quality of the document to be captured. The primary factor is the quality of the writing, for example, of lettering done by hand, but also of machine-generated writing produced by a typewriter. Further factors are the image sharpness, i.e., the separation between the individual graphic characters and the usually light-colored image background, the quality of the conversion of the greyscale levels into black/white, and possible soiling of the document. All of these factors can affect whether the character recognition program recognizes a graphic character or not. More serious than non-recognition is faulty recognition, in which a graphic character is supposedly recognized but is in fact interpreted as a wrong, nonsensical character.
In the high-volume capture of handwritten forms, as is typical, for example, for the processing of medical prescription forms, graphic characters that are not recognized at all or are wrongly recognized cause considerable expenditure for post-processing, i.e., manual capture of those forms that the character recognition program cannot recognize or can recognize only incompletely. This applies primarily to forms filled out in non-segmented writing, i.e., cursive handwriting.
A method with the method steps set forth above is known from U.S. Pat. No. 5,305,396. It concerns a correction method for recognizing written forms wherein letters or graphic characters which are not recognized, or are recognized only with uncertainty, are determined iteratively in several steps. This can also be carried out at a spatially remote evaluation station, for example, by using the global data net. First, the individual image data of the form are saved in accordance with the data fields of the form. Based on the image data, the graphic characters are identified by means of a character recognition program inasmuch as such identification is possible with satisfactory certainty. The coordinates of characters which are not recognized or not recognized with sufficient certainty are then recorded in a machine-generated data structure (MGDS). The data of the MGDS are then transmitted to an external evaluation station. There, the graphic characters are completely identified, and the MGDS is supplemented by the corresponding repair information. In the method according to U.S. Pat. No. 5,305,396, a single complex data structure is used which accumulates the “repair history” for all fields concerned and which is made available at the end of the evaluation. Such a method is unsatisfactory with respect to data privacy because the confidentiality of the information contained in the forms is not ensured, in particular, because access to the entire complex data structure is possible.
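The data-privacy concern can be made concrete with a simplified sketch in Python of a single accumulating data structure. The sketch is illustrative only: the actual layout of the MGDS in U.S. Pat. No. 5,305,396 is not reproduced here, and the names UncertainChar, RepairRecord and FormRecord as well as all field names are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UncertainChar:
    """Coordinates of a character not recognized with sufficient certainty."""
    field_id: str
    x: int
    y: int
    guess: str   # best guess of the recognizer, "" if none


@dataclass
class RepairRecord:
    """One repair entry added by an evaluation station."""
    field_id: str
    station: str
    corrected_text: str


@dataclass
class FormRecord:
    """Single structure that accumulates the repair history of all fields."""
    form_id: str
    uncertain: List[UncertainChar] = field(default_factory=list)
    repair_history: List[RepairRecord] = field(default_factory=list)

    def add_repair(self, field_id: str, station: str, corrected_text: str) -> None:
        # Every repair is appended; earlier entries are kept as history.
        self.repair_history.append(RepairRecord(field_id, station, corrected_text))

    def final_text(self, field_id: str) -> Optional[str]:
        # The most recent repair of a field is taken as its final content.
        for rec in reversed(self.repair_history):
            if rec.field_id == field_id:
                return rec.corrected_text
        return None


record = FormRecord("form-1")
record.uncertain.append(UncertainChar("drug", 120, 48, "?"))
record.add_repair("drug", "station-A", "ASPRIN")
record.add_repair("drug", "station-B", "ASPIRIN")
record.add_repair("name", "station-A", "MILLER")
```

Because every station receives the whole FormRecord in this sketch, each one can read the entries for all fields of the form, which is the kind of unrestricted access the paragraph above criticizes.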