1. Field of the Invention
The present invention relates to an image processing apparatus, an image processing method, and a storage medium for generating electronic document data that allows an object in a document to be searched.
2. Description of the Related Art
Conventionally, approaches have been considered for searching for an object included in a document, such as a photograph, a drawing (line drawing), or a table. (The term “object” is used herein to refer to an object other than characters, such as a photograph, a drawing (line drawing), or a table.)
For example, a method exists by which a character string that describes an object extracted from a document and that is placed in the vicinity of the object (a caption) is associated with the object as metadata so that the object can be searched.
When a caption in a general document includes an expression for identifying the object, such as a drawing number (e.g., “photograph 1”, “first drawing” or “table 1”) (hereinafter referred to as “anchor expression”), a more detailed description of the object is typically also given in the body text using the anchor expression. Such an anchor expression has also been used as a means for identifying an object in a document. According to the invention disclosed in Japanese Patent Laid-Open No. H11-025113 (1999), an explanatory part of a body text that includes an anchor expression (hereinafter referred to as “explanatory text in body text”) is extracted and is associated with the object as metadata. For example, when a caption adjacent to an object of a drawing includes the anchor expression “Fig. 1” and the body text includes the explanation “the Fig. 1 is AAA”, the anchor expression “Fig. 1” is associated with the object of the drawing as identification information. At the same time, the explanatory text in the body text, “the Fig. 1 is AAA”, is also associated as metadata, so that the object of the drawing can be searched by its metadata.
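The association described above can be illustrated with a minimal sketch. It is not the method of the cited publication; the pattern `ANCHOR_RE`, the functions `find_anchors` and `build_metadata`, and the assumption that the body text has already been split into sentences are all hypothetical and introduced only for illustration.

```python
import re

# Hypothetical anchor pattern: a label word followed by a number.
# A real system would need a much richer set of labels and numbering styles.
ANCHOR_RE = re.compile(r"\b(Fig\.|Table|Photograph)\s*(\d+)", re.IGNORECASE)


def find_anchors(text):
    """Return normalized anchor keys found in text, e.g. ('fig.', 1)."""
    return [(m.group(1).lower(), int(m.group(2))) for m in ANCHOR_RE.finditer(text)]


def build_metadata(caption, body_sentences):
    """Associate a caption's anchor and matching body sentences with an object.

    body_sentences is assumed to be a list of already separated sentences.
    """
    anchors = find_anchors(caption)
    anchor = anchors[0] if anchors else None
    # Keep only the body sentences that mention the same anchor expression.
    explanatory = [
        s for s in body_sentences
        if anchor is not None and anchor in find_anchors(s)
    ]
    return {"anchor": anchor, "caption": caption, "explanatory_text": explanatory}


meta = build_metadata(
    "Fig. 1 Overview",
    ["The Fig. 1 is AAA.", "Fig. 2 shows BBB.", "Fig. 10 is unrelated."],
)
print(meta)
```

In this sketch only the sentence mentioning “Fig. 1” is kept as explanatory text, because the number is compared as an integer and “Fig. 10” is therefore treated as a different anchor.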
In recent years, some word processors have provided editing functions such as a function to automatically generate an anchor expression and a function to associate an object in a document with an explanatory text in the body text. The information given through these functions (metadata) can be stored in an electronic document, thereby enabling efficient editing of the document.
Recent scanners have functions such as an auto document feeder and thus can easily read many sheets of paper. Such scanners can therefore also read a plurality of types of documents in a single operation. When such a scanner reads a mixture of different documents, however, a plurality of objects may be read whose captions use an identical anchor expression. For example, one of the simultaneously read documents may have a table object with the caption “the table 1 is YYY” while another has a table object with the caption “the table 1 shows ZZZ.” If the above-described association processing is simply performed in such a circumstance, the same anchor expression “table 1” is associated with both table objects, so that the explanatory text in the body text that appropriately corresponds to each anchor expression “table 1” cannot be identified.
In view of the above, a method has been desired by which, even when a plurality of types of documents are read and a plurality of captions use an identical anchor expression, a caption or an explanatory text in a body text can be appropriately associated as metadata with the object.
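The collision described above can be reproduced with a small sketch. The record layout (a `page` number, an `anchor` string, and a `caption`) and the function `group_by_anchor` are hypothetical and serve only to show why indexing objects by anchor expression alone becomes ambiguous when different documents are read together.

```python
from collections import defaultdict


def group_by_anchor(objects):
    """Index captions by anchor expression alone, ignoring document boundaries."""
    index = defaultdict(list)
    for obj in objects:
        index[obj["anchor"]].append(obj["caption"])
    return dict(index)


# Two table objects from different documents read in one scan (hypothetical data).
objects = [
    {"page": 1, "anchor": "table 1", "caption": "the table 1 is YYY"},
    {"page": 5, "anchor": "table 1", "caption": "the table 1 shows ZZZ"},
]

# Any anchor that maps to more than one caption cannot be resolved uniquely.
ambiguous = {a: caps for a, caps in group_by_anchor(objects).items() if len(caps) > 1}
print(ambiguous)
```

Here the single key “table 1” maps to both captions, so a body-text sentence mentioning “table 1” cannot be assigned to one object without additional information.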