1. Field of the Invention
The present invention relates to a method, an apparatus, and a program for data retrieval operations.
2. Description of the Related Art
Current multifunction peripherals (MFPs) are becoming more and more sophisticated in functionality. More specifically, MFPs that support not only copying and printing, but also scanning and faxing are now mainstream. Such an MFP plays an important role as a gateway for the input and output of paper information and electronic data in offices.
In particular, the scanning capabilities of MFPs allow efficient conversion of large amounts of paper documents into electronic data. Therefore, document management in which documents are scanned quickly and the resulting data is stored in a database on a network or in a database contained in an MFP is becoming increasingly common. (Hereinafter, data produced by converting paper documents into electronic form by scanning is referred to as a “document image”.)
Due to the spread of personal computers (PCs), documents are usually created using applications on PCs. (Hereinafter, documents created using such applications are referred to as “electronic documents”.)
Compared to documents printed on paper, document images and electronic documents are advantageous in that data can be reedited and reused, data can be easily shared by many people, data does not suffer degradation over time, and the like. On the other hand, paper documents have the advantages of ease of reading and handling, portability, intuitive understandability, and the like. Therefore, there are many instances where outputting electronic documents or document images in paper form is the more efficient choice.
It also often happens that a document image or an electronic document identical or similar to a paper document is required in order to reedit or reuse the data.
Examples of such instances include a situation in which new material must be created based on paper material distributed at a meeting, and a situation in which the original data must be printed out because a paper document is partly damaged. Under the present conditions, however, a person who needs the material must, in most cases, contact the person who prepared it in order to receive the electronic document. This impairs ease of reuse of documents.
Accordingly, a technique for quickly retrieving a document image or an electronic document from a paper document is necessary in order to perform document management efficiently.
In general, documents in offices and homes include text, photographs, figures, or tables, or a combination of these. When such document information is stored in a database, examples of possible retrieving methods include the following:
1. Retrieving relevant data by a word or phrase functioning as a search key;
2. Retrieving similar image data by an image feature amount of the entire document image, the image feature amount functioning as a search key; and
3. Retrieving data containing information added to a document image, the added information functioning as a search key.
Here, a search key is digital data, such as a word, a phrase, or graphics, that functions as an index for information retrieval.
For partial data retrieval, a divided region of a photograph, text, or the like within a document image is set as the search key.
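The three retrieval methods listed above can be illustrated with a minimal sketch. It is not taken from any disclosed implementation: the in-memory database, the Document fields, the function names, and the use of Euclidean distance as the similarity measure for the image feature amount are all assumptions made purely for illustration.

```python
import math
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Document:
    """Hypothetical stored record; all field names are illustrative."""
    name: str
    text: str                       # text content, e.g. obtained by OCR
    feature: List[float]            # image feature amount of the entire page
    metadata: Dict[str, str] = field(default_factory=dict)  # added information


def search_by_keyword(db: List[Document], phrase: str) -> List[str]:
    """Method 1: a word or phrase functions as the search key."""
    needle = phrase.lower()
    return [d.name for d in db if needle in d.text.lower()]


def search_by_feature(db: List[Document], query: List[float],
                      top_k: int = 1) -> List[str]:
    """Method 2: an image feature amount functions as the search key.

    Similarity is assumed here to be Euclidean distance (smaller is closer).
    """
    ranked = sorted(db, key=lambda d: math.dist(d.feature, query))
    return [d.name for d in ranked[:top_k]]


def search_by_metadata(db: List[Document], key: str, value: str) -> List[str]:
    """Method 3: information added to the document functions as the search key."""
    return [d.name for d in db if d.metadata.get(key) == value]


# Tiny illustrative database; the contents are made up.
DB = [
    Document("minutes.pdf", "Quarterly sales meeting minutes",
             [0.90, 0.10, 0.30], {"author": "sato"}),
    Document("catalog.tif", "Product catalog with photographs",
             [0.20, 0.80, 0.50], {"author": "tanaka"}),
    Document("report.doc", "Sales report and tables",
             [0.85, 0.15, 0.35], {"author": "sato"}),
]

if __name__ == "__main__":
    print(search_by_keyword(DB, "sales"))
    print(search_by_feature(DB, [0.88, 0.12, 0.32], top_k=2))
    print(search_by_metadata(DB, "author", "sato"))
```

Each function takes a different kind of search key and requires a different matching rule, which is why a user who must choose among them for a document combining text, photographs, and tables faces a more involved operation than a single keyword search.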
Japanese Patent Laid-Open No. 2002-297610 discloses a retrieval technique that determines a logic operator corresponding to one or more user-specified regions in an image.
Japanese Patent Laid-Open No. 10-289251 discloses a retrieval method including the steps of preparing keyword icons graphically representing individual data sets stored in a database, displaying a search-condition setting window, and placing at least one keyword icon in the search-condition setting window to set the search condition.
When searching data that includes a plurality of attributes, the optimal search method varies with the attribute. The retrieval operation therefore becomes complicated and difficult, and a series of data retrieval operations becomes too burdensome for users who are inexperienced with computers. Thus, there is a need for a technique for facilitating such operations.