Example embodiments discussed herein are related to a recording medium including a logical-structure-model storing section that stores the logical structure indicating logical elements in various documents and a relation between the logical elements and having recorded therein a document recognizing program for recognizing the logical structure of an inputted and recognized document according to the logical structure stored in the logical-structure-model storing section, a document recognizing apparatus including the logical-structure-model storing section, and a document recognizing method for the document recognizing apparatus.
Conventionally, there is a document data input system that prepares, for each form of a document, a layout definition describing position information of data desired to be extracted and, after identifying the form of the document, recognizes tagged data with an OCR using the layout definition. Specifically, plural sets of a tag name and two coordinates representing a rectangular area are written on a document. For example, a tag corresponding to data “Fujitsu Taro” is “name of a principal” of an educational institution such as a school.
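As a minimal sketch of such a layout definition, the following Python code pairs each tag name with two coordinates (top-left and bottom-right corners of a rectangular area) and extracts the tagged data from those areas. The names `LayoutField`, `LAYOUT`, `crop`, `stand_in_ocr`, and `extract_tagged_data` are hypothetical, the "date" field is invented for illustration, and the page is modeled as a grid of characters instead of a scanned image so that the example runs without an OCR engine.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutField:
    """One entry of a layout definition: a tag name and a rectangular area."""
    tag: str
    top_left: tuple[int, int]       # (x, y) of the upper-left corner
    bottom_right: tuple[int, int]   # (x, y) of the lower-right corner (exclusive)


# Hypothetical layout definition for one form of a document.
LAYOUT = [
    LayoutField("name of a principal", (6, 0), (20, 1)),
    LayoutField("date", (6, 1), (16, 2)),
]

# Stand-in for a scanned page: each string is one row of "pixels".
PAGE = [
    "Name: Fujitsu Taro  ",
    "Date: 2009-01-01    ",
]


def crop(page: list[str], field: LayoutField) -> list[str]:
    """Cut the rectangular area described by the two coordinates out of the page."""
    (x1, y1), (x2, y2) = field.top_left, field.bottom_right
    return [row[x1:x2] for row in page[y1:y2]]


def stand_in_ocr(region: list[str]) -> str:
    """Assumed placeholder for a real OCR call: joins the rows and trims blanks."""
    return " ".join(row.strip() for row in region).strip()


def extract_tagged_data(page: list[str], layout: list[LayoutField]) -> dict[str, str]:
    """Recognize every area in the layout definition and key the result by tag."""
    return {field.tag: stand_in_ocr(crop(page, field)) for field in layout}


result = extract_tagged_data(PAGE, LAYOUT)
```

In this sketch, `result` maps each tag name to the text recognized inside its rectangle, which corresponds to the tagged data described above.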
This document data input system displays, side by side, an inputted document image and a recognition result obtained by recognizing the document image using the layout definition. A user compares the document image and the recognition result and determines whether the recognition result is correct. When the recognition result is wrong, the user deletes the recognition result once and inputs a correct value with a keyboard or the like.
However, with such a method, because the user compares the document image and the recognition result and determines whether the recognition result is correct, the reading burden on the user is large. Moreover, human errors may not be prevented. Therefore, various techniques for reducing the burden of data correction work when there is an error in a read document in such a data input system have been disclosed.
For example, a data input system that automatically generates a layout image of a document corresponding to a place of an error that occurs in document recognition processing is conceivable. Specifically, the data input system analyzes, according to layout information for designating a layout of a document to be read, a layout of a document image of a read document and performs character recognition of respective reading objects determined by this layout analysis. The data input system detects a layout analysis error from result data of this character recognition and the layout information and displays, on a screen, a document image corresponding to the error occurrence place.
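The error-detection step can be illustrated with a short Python sketch. The text does not state how a layout analysis error is detected, so the rule used here is purely an assumption: each layout entry carries a hypothetical regular expression that valid recognized text must match, and any mismatch is reported together with the rectangle that would be displayed to the user. The names `LayoutEntry` and `find_error_places` are likewise hypothetical.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class LayoutEntry:
    """Layout information for one reading object (all fields are illustrative)."""
    tag: str
    box: tuple[int, int, int, int]  # (x1, y1, x2, y2) of the reading object
    pattern: str                    # assumed validity rule for recognized text


def find_error_places(
    layout: list[LayoutEntry], recognized: dict[str, str]
) -> list[tuple[str, tuple[int, int, int, int]]]:
    """Compare recognition results with the layout information.

    Returns (tag, box) for every reading object whose recognized text is
    missing or does not match the assumed pattern; the box identifies the
    part of the document image to show for that error occurrence place.
    """
    errors = []
    for entry in layout:
        text = recognized.get(entry.tag, "")
        if not re.fullmatch(entry.pattern, text):
            errors.append((entry.tag, entry.box))
    return errors


layout = [
    LayoutEntry("name", (60, 10, 200, 30), r"[A-Za-z ]+"),
    LayoutEntry("date", (60, 40, 160, 60), r"\d{4}-\d{2}-\d{2}"),
]
# Hypothetical recognition result in which zeros were misread as the letter O.
recognized = {"name": "Fujitsu Taro", "date": "2OO9-01-01"}

error_places = find_error_places(layout, recognized)
```

Here only the "date" entry is flagged, and its rectangle is what such a system would crop from the document image and display.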
However, the technique described above has a problem in that the burden of the data correction work may not always be reduced and, on the contrary, the burden may increase.
Specifically, because only the document image corresponding to the error occurrence place is displayed, when this document image is enlarged and displayed, it may be unclear where in the document, i.e., at which character string, the error is located. In particular, when headings consisting of the same character string are present in the document, those character strings have to be distinguished. As a result, the user has to reduce the enlarged image once in order to check the area surrounding the image. This, on the contrary, increases the burden on the user.
With the technique described above, the error in reading the document is simply displayed to the user. The technique cannot show the user whether the displayed error has been properly corrected. In other words, even if the user manually corrects the error occurrence place on the basis of the document image corresponding to the error occurrence place, the content of the correction is not always correct. The user still has to visually check the correction content. Therefore, it is hard to say that the burden of the data correction work is reduced.