1. Field of the Invention
The present invention relates to a reading apparatus for reading data entered in a form of a predetermined layout, and a data processing system for reading, transmitting and printing the data entered in the form of the predetermined layout.
2. Description of the Related Art
Generally, when data is managed by the use of forms or the like, the data is entered in a predetermined form provided with entry boxes. When data entered in a form is transmitted between apparatuses in a data management system organized in a network, a frequently used method is to delete the data of the entry boxes and transmit only the entered data, in order to reduce the data transmission amount.
Data entry into a form provided with a printed entry box is frequently performed by handwriting and stamping. When data thus entered in a form is transmitted between apparatuses, the entered data is read by a reading apparatus such as a scanner, character recognition is performed, and then the read-out data is transmitted to another apparatus. There have been proposed various methods for deleting entry boxes printed on a form in order to reduce the data transmission amount as described above. For example, Japanese Unexamined Patent Publication JP-A 6-290296 (1994) discloses a technology associated with a character recognition apparatus in which entry box layout data is previously stored in a form layout registration portion and an exclusive OR operation of the entry box layout data and image data read by a reading apparatus is performed to delete entry boxes. By using this apparatus, reduction of data transmission amount and enhancement of character recognition precision can be realized.
However, in the structure of the apparatus disclosed in JP-A 6-290296, since the exclusive OR operation is performed to delete the entry boxes, in the case where there is some blur or spread in the entry boxes of the form, the blurred or spread part cannot be deleted and remains because of the difference from the previously stored layout.
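This limitation can be illustrated with a minimal sketch, assuming binary images held as lists of rows in which 1 denotes a black pixel. The function name `xor_delete` and the small bitmaps are hypothetical illustrations and are not taken from JP-A 6-290296.

```python
def xor_delete(layout, scanned):
    """Pixel-wise exclusive OR of the stored layout and the scanned image."""
    return [[a ^ b for a, b in zip(layout_row, scanned_row)]
            for layout_row, scanned_row in zip(layout, scanned)]

# Stored layout (assumed): a one-pixel-thick entry box line on row 1.
layout = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

# Scanned form (assumed): the same line, with ink spread into part of row 2.
scanned = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 1, 0],
]

residue = xor_delete(layout, scanned)
# The registered line on row 1 is cleared, but the spread pixels on row 2,
# which have no counterpart in the stored layout, remain in the result.
```

In this sketch only pixels that coincide exactly with the stored layout are cancelled, so any deviation of the printed box from the registered layout is left behind.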
Moreover, when data is entered by handwriting into a small entry box intended for data entry by a printing apparatus, the data is not always written completely within the entry box. That is, some of the written characters may touch the entry box or partly lie outside it. When the entry box is deleted from the read-out image data including such characters by the method of the publication, the parts of the characters that fall on the entry box are also deleted, so that some parts of the characters are missing. Consequently, the character recognition precision decreases.
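The loss of character pixels can likewise be sketched under the same assumptions (binary images as lists of rows, 1 denoting a black pixel). The helper `xor_delete` and the bitmaps are hypothetical and serve only to illustrate the effect described above.

```python
def xor_delete(layout, scanned):
    """Pixel-wise exclusive OR of the stored layout and the scanned image."""
    return [[a ^ b for a, b in zip(layout_row, scanned_row)]
            for layout_row, scanned_row in zip(layout, scanned)]

# Stored layout (assumed): an entry box line on row 1.
layout = [
    [0, 0, 0],
    [1, 1, 1],
    [0, 0, 0],
]

# Scanned form (assumed): a vertical handwritten stroke in column 1
# that crosses the entry box line.
scanned = [
    [0, 1, 0],
    [1, 1, 1],
    [0, 1, 0],
]

result = xor_delete(layout, scanned)
# The box line is removed, but so is the stroke pixel at row 1, column 1,
# leaving the stroke broken into two separate pieces.
```

Because a black stroke pixel lying on the box line is indistinguishable from the line itself in the scanned image, the exclusive OR cancels it together with the line.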
As methods of handling data which is partly outside the entry box, for example, Japanese Unexamined Patent Publications JP-A 7-175891 (1995) and JP-A 10-222606 (1998) disclose methods for increasing the recognition precision of characters which are partly outside the entry box or in contact with it, by precisely cutting out the parts of the characters outside the entry box and correcting the parts of the characters that are missing due to the deletion of the entry box.
In recent years, as electronic documentation has advanced with the adoption of office automation (OA), it has become common practice for data input by use of a data input apparatus such as a personal computer to be managed in a unified way by a storage apparatus, and for data entry (printing) into a form to be performed by use of a printing apparatus such as a printer. Because more importance is therefore placed on the browsability of the data than on the convenience of data entry, the entry boxes of the form are frequently small. However, even in recent years, not all data is entered by a printing apparatus, and there are cases where data is entered by handwriting. In such cases, the form in which the data has been entered is read by a reading apparatus and character recognition is performed to input the data. In layouts premised on the use of a printing apparatus such as a printer, as described above, the entry box is frequently too small for data to be entered therein by handwriting. For this reason, written characters are apt to lie partly outside the entry box or to be in contact with it.
This problem can be solved by using the technologies disclosed in JP-A 7-175891 and JP-A 10-222606. However, doing so complicates the program of the apparatus, which increases the cost of the apparatus.