1. Field of the Invention
The present invention relates to an image processing apparatus and an image processing method for creating electronic document data whose objects can be searched from a document image.
2. Description of the Related Art
Conventionally, techniques are known for making objects other than characters (for example, photographs, drawings, line drawings, tables, and the like) in a document image searchable so that they can be used easily. In the description below, the “object” indicates an object other than characters unless otherwise stated.
In Japanese Patent Laid-Open No. H11-306197 (1999), an object such as a drawing, a graph, or the like is extracted from a document image, and it is determined whether or not there is a caption character string (a character string explaining the object) near the object. When there is a caption character string, the caption character string is associated with the object, so that the object can be searched.
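The association described above can be illustrated with a minimal sketch. It is not the method of the cited publication; it assumes that regions are axis-aligned bounding boxes, that a caption lies directly above or below its object, and that "near" means a vertical gap of at most `max_gap` pixels. The names `Region`, `vertical_gap`, `find_caption`, and `max_gap` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Region:
    """Axis-aligned bounding box; `text` is empty for non-character objects."""
    x: int
    y: int
    w: int
    h: int
    text: str = ""


def vertical_gap(obj: Region, cand: Region) -> Optional[int]:
    """Return the gap between `obj` and a candidate lying above or below it.

    Returns None when the two regions do not overlap horizontally or when
    they overlap vertically (the candidate is then not a caption line).
    """
    if cand.x + cand.w <= obj.x or obj.x + obj.w <= cand.x:
        return None
    if cand.y >= obj.y + obj.h:          # candidate is below the object
        return cand.y - (obj.y + obj.h)
    if cand.y + cand.h <= obj.y:         # candidate is above the object
        return obj.y - (cand.y + cand.h)
    return None


def find_caption(obj: Region, text_regions: List[Region],
                 max_gap: int = 40) -> Optional[Region]:
    """Pick the closest text region within `max_gap` pixels (assumed threshold)."""
    best, best_gap = None, None
    for region in text_regions:
        gap = vertical_gap(obj, region)
        if gap is None or gap > max_gap:
            continue
        if best_gap is None or gap < best_gap:
            best, best_gap = region, gap
    return best


photo = Region(100, 100, 200, 150)
caption = Region(120, 260, 160, 20, "FIG. 1 System overview")
body = Region(100, 400, 400, 100, "Body text paragraph")
found = find_caption(photo, [body, caption])
metadata = {"caption": found.text} if found else {}
```

In this sketch the caption text becomes the metadata of the object, which is what makes the object reachable by a text search.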
When the caption adjacent to the object is a figure number (for example, “FIG. 1”), a general document image also contains a character string of the same figure number in the body text to explain the object. In other words, the same expression as the figure number written in the caption is also written in the body text. Japanese Patent Laid-Open No. H10-228473 (1998) discloses a technique for automatically creating a link between a figure number in a caption and a figure number in the body text to form a hypertext. In this technique, for example, when a figure number “FIG. 1” is included in a caption adjacent to an object and a sentence “FIG. 1 is AAA” is present in the body text, a hyperlink is created between the caption “FIG. 1” and the “FIG. 1” in the body text. Japanese Patent Laid-Open No. H10-228473 (1998) also describes that a link is automatically created between an object and a body text related to the object, and a hypertext document is created.
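The linking of a figure number in a caption to the same figure number in the body text can be sketched as follows. This is an illustration only, not the procedure of the cited publication; it assumes figure numbers are written in the form “FIG. n”, and the names `FIG_PATTERN`, `figure_numbers`, and `link_captions_to_body` are hypothetical.

```python
import re
from typing import Dict, List

# Assumed notation: "FIG." followed by a number and an optional letter suffix.
FIG_PATTERN = re.compile(r"\bFIG\.\s*(\d+[A-Z]?)", re.IGNORECASE)


def figure_numbers(text: str) -> List[str]:
    """Return every figure number found in `text`, e.g. ["1", "17A"]."""
    return [m.group(1).upper() for m in FIG_PATTERN.finditer(text)]


def link_captions_to_body(captions: Dict[str, str],
                          body_sentences: List[str]) -> Dict[str, List[int]]:
    """Map each captioned object to the indices of body sentences that
    mention the same figure number as the caption."""
    links: Dict[str, List[int]] = {}
    for object_id, caption_text in captions.items():
        numbers = figure_numbers(caption_text)
        if not numbers:
            continue  # caption carries no figure number; nothing to link
        anchor = numbers[0]
        links[object_id] = [
            index for index, sentence in enumerate(body_sentences)
            if anchor in figure_numbers(sentence)
        ]
    return links


captions = {"object-1": "FIG. 1"}
body_sentences = [
    "FIG. 1 is AAA.",
    "FIG. 2 is BBB.",
    "As shown in FIG. 10, CCC.",
]
links = link_captions_to_body(captions, body_sentences)
```

Comparing whole figure numbers rather than substrings keeps “FIG. 1” from being linked to a sentence that mentions only “FIG. 10”.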
When objects with which metadata is associated are compressed (for example, by JPEG compression) and stored in one electronic document, the objects are held in a single electronic document with a small amount of data. When such an electronic document is used by an application, an object can be searched for from the metadata by using a caption character string as a search keyword.
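A keyword search over such metadata can be sketched as below. The record layout (a dictionary with hypothetical `id` and `caption` keys) and the case-insensitive substring match are assumptions made for illustration; the source does not specify a storage format or a matching rule.

```python
from typing import Dict, List


def search_objects(objects: List[Dict[str, str]], keyword: str) -> List[str]:
    """Return the ids of objects whose caption metadata contains `keyword`."""
    needle = keyword.casefold()
    return [
        record["id"] for record in objects
        if needle in record.get("caption", "").casefold()
    ]


objects = [
    {"id": "photo-1", "caption": "FIG. 1 System overview"},
    {"id": "table-1", "caption": "Table 2 Measurement results"},
    {"id": "photo-2"},  # object without caption metadata cannot be hit
]
hits = search_objects(objects, "system")
```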
In an electronic document in which caption character strings are given to the respective objects as metadata so that objects other than characters can be searched for, it is desirable that, when a keyword search is performed, the object hit in the search be highlighted.
However, the objects to be searched are photographs, drawings, tables, and the like, which have various colors and shapes. Therefore, the highlighting may be inconspicuous because of the original color and shape of the object, and it may be difficult for a user to identify the object hit in the search. For example, when a highlighting method is used in which the contour of a searched object is marked with a red circumscribed rectangle, and a rectangular photograph object containing much red is hit in the search, the highlighting of the search result is in contact with the photograph area and has the same color as the photograph, so that the highlighting is inconspicuous. It is therefore very difficult for the user to identify the object hit in the search. In addition, when the size of the object is small, or when a plurality of objects are adjacent to each other, the object hit in the search cannot be identified intuitively, and hence there is a problem that an improvement of search efficiency cannot be expected (refer to FIGS. 17A and 17C).
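The red-on-red problem described above can be made concrete with a small sketch. It assumes, purely for illustration, that conspicuousness is judged by the Euclidean distance in RGB space between the highlight color and the mean color of the object pixels touching the highlight frame; the function names and the threshold `min_distance` are hypothetical and do not come from the source.

```python
from typing import List, Tuple

Color = Tuple[float, float, float]


def mean_color(pixels: List[Color]) -> Color:
    """Average the RGB components of the given pixels."""
    count = len(pixels)
    return tuple(sum(p[i] for p in pixels) / count for i in range(3))


def color_distance(a: Color, b: Color) -> float:
    """Euclidean distance between two RGB colors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def highlight_is_conspicuous(edge_pixels: List[Color],
                             highlight: Color = (255, 0, 0),
                             min_distance: float = 100.0) -> bool:
    """True when the highlight color differs enough from the object's edge.

    `min_distance` is an assumed threshold, not a value from the source.
    """
    return color_distance(mean_color(edge_pixels), highlight) >= min_distance


reddish_photo_edge = [(230, 20, 30)] * 4   # photograph containing much red
bluish_photo_edge = [(20, 40, 200)] * 4    # photograph containing much blue
red_on_red = highlight_is_conspicuous(reddish_photo_edge)
red_on_blue = highlight_is_conspicuous(bluish_photo_edge)
```

With these sample values a red frame fails the check against the reddish photograph and passes against the bluish one, which mirrors the difficulty described in the text.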
Therefore, in a process of creating electronic document data in which objects other than characters can be searched with a keyword, an electronic document data creation method is required which, in searching, performs highlighting of the object so that a user can easily recognize it while maintaining the shape and data state of the object to be searched.