1. Field of the Related Art
The present invention relates to a document management apparatus and a document management method for managing a document image.
2. Description of the Related Art
Traditionally, there has been a demand to take a paper document into a computer as a document image through an input device such as a scanner, convert the content of the captured image to electronic data, and reuse it.
In this case, a system is typically used in which layout information, or text information obtained as a result of OCR, is extracted from the electronic document image information provided by converting the paper document into electronic data. The content is then semantically analyzed to extract semantic area information, and the respective pieces of information are associated with each other, registered as meta data in a database, and stored in a searchable state.
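The registration flow described above can be illustrated with a minimal sketch. It assumes the layout information, OCR text, and semantic area information have already been extracted by earlier stages. The table names, column names, and the functions `create_store`, `register_document`, and `search_by_area` are hypothetical and chosen only for illustration; they are not taken from any particular prior-art system.

```python
import sqlite3


def create_store():
    """Create an in-memory database holding documents and their meta data."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (doc_id TEXT PRIMARY KEY, layout TEXT, ocr_text TEXT)"
    )
    conn.execute(
        "CREATE TABLE semantic_areas (doc_id TEXT, label TEXT, value TEXT)"
    )
    return conn


def register_document(conn, doc_id, layout, ocr_text, semantic_areas):
    """Associate the extracted pieces of information and register them as meta data.

    semantic_areas is a dict mapping a hypothetical area label (e.g. "title")
    to the text found in that area.
    """
    conn.execute(
        "INSERT INTO documents (doc_id, layout, ocr_text) VALUES (?, ?, ?)",
        (doc_id, layout, ocr_text),
    )
    conn.executemany(
        "INSERT INTO semantic_areas (doc_id, label, value) VALUES (?, ?, ?)",
        [(doc_id, label, value) for label, value in semantic_areas.items()],
    )
    conn.commit()


def search_by_area(conn, label, value):
    """Return the ids of documents whose semantic area `label` equals `value`."""
    rows = conn.execute(
        "SELECT doc_id FROM semantic_areas "
        "WHERE label = ? AND value = ? ORDER BY doc_id",
        (label, value),
    ).fetchall()
    return [row[0] for row in rows]
```

In this sketch the meta data are fixed at registration time, so a search only ever sees the labels that were extracted when the document was first analyzed.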
Also, a system has been known in which the type of the original is determined on the basis of the type of extracted information or its position on the document image, the document image information is converted to a structured document, and the storage location or the transmission destination of the document is decided. See JP-A-2005-43990.
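As a rough illustration of this kind of type determination, the following sketch classifies an original from the kind and position of its extracted items and then looks up a storage location. The rules, the type names, the `ExtractedItem` fields, and the paths in `DESTINATIONS` are all assumptions made for this example and do not reproduce the method of JP-A-2005-43990.

```python
from dataclasses import dataclass


@dataclass
class ExtractedItem:
    kind: str  # hypothetical label for the extracted information, e.g. "title"
    x: float   # horizontal position on the page, normalized to 0..1
    y: float   # vertical position on the page, normalized to 0..1 (0 = top)


# Hypothetical mapping from original type to storage location.
DESTINATIONS = {
    "invoice": "/archive/invoices",
    "report": "/archive/reports",
    "unknown": "/archive/unsorted",
}


def classify_original(items):
    """Determine the type of the original from item kinds and positions."""
    # Assumed rule: an amount in the lower half of the page suggests an invoice.
    if any(item.kind == "amount" and item.y > 0.5 for item in items):
        return "invoice"
    # Assumed rule: a title near the top of the page suggests a report.
    if any(item.kind == "title" and item.y < 0.2 for item in items):
        return "report"
    return "unknown"


def decide_destination(items):
    """Decide the storage location from the determined type of the original."""
    return DESTINATIONS[classify_original(items)]
```

For example, `decide_destination([ExtractedItem("amount", 0.8, 0.9)])` selects the invoice location under these assumed rules.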
However, in such traditional systems, when the user wants to change, for example, the meta data information to be extracted by automatic processing, the document images of all the input originals need to be analyzed again each time in order to change the registered meta data. This increases the processing time and the burden of the user's operation, so the user cannot easily change the meta data, and convenience is reduced.