Many document generation and editing systems for generating and editing documents, and many document management systems for storing and managing the generated or edited documents, have conventionally been proposed. In such document management systems, a stored document can be retrieved, and the retrieved document can be printed by a printer.
Recently, with the development of computers and networks, a large number of electronic documents have come to be stored in computer systems. Accordingly, the amount of data managed by document management systems is increasing. In view of this situation, there is increasing demand for a retrieving technique that can retrieve a target document from a document database holding a large amount of document data.
Many retrieving techniques have been proposed, including a simple retrieving method which performs a search by designating a file name or a document number, a keyword search which performs a search by using a keyword assigned to each document in advance, a full-text search which performs a search by using an arbitrary term included in the content of the document, a concept search which performs a search based on a conceptual feature of the content of the document, and so forth. Many of these retrieving techniques perform a search by using text input from a keyboard as a query.
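The difference between the keyword search and the full-text search mentioned above can be illustrated with a minimal sketch. The document store, the identifiers such as `doc-001`, and the function names below are hypothetical and chosen only for illustration; no particular document management system is implied.

```python
from collections import defaultdict

# Hypothetical in-memory document store: id -> (pre-assigned keywords, body text).
DOCUMENTS = {
    "doc-001": ({"invoice", "2003"}, "Invoice for printer toner and paper"),
    "doc-002": ({"manual"}, "Scanner setup manual with printer notes"),
    "doc-003": ({"report"}, "Quarterly report on document retrieval"),
}


def keyword_search(documents, keyword):
    """Return ids of documents whose pre-assigned keyword set contains `keyword`."""
    wanted = keyword.lower()
    return sorted(
        doc_id for doc_id, (keywords, _body) in documents.items() if wanted in keywords
    )


def build_full_text_index(documents):
    """Build an inverted index mapping every term in the body text to document ids."""
    index = defaultdict(set)
    for doc_id, (_keywords, body) in documents.items():
        for term in body.lower().split():
            index[term].add(doc_id)
    return index


def full_text_search(index, term):
    """Return ids of documents whose body text contains `term` as a whole word."""
    return sorted(index.get(term.lower(), set()))


if __name__ == "__main__":
    index = build_full_text_index(DOCUMENTS)
    # "printer" was never assigned as a keyword, so only the full-text search finds it.
    print(keyword_search(DOCUMENTS, "printer"))
    print(full_text_search(index, "printer"))
```

In this sketch the keyword search can find only terms that were assigned to a document in advance, whereas the full-text search can find any term appearing in the body text, at the cost of building and holding an index over the entire content.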
Besides these retrieving techniques using a text query, a technique that uses a paper-printed document as a search query has been proposed. In this technique, a paper-printed document is scanned to read the document data, and the original electronic document of the printed document is searched for by using the read data as a query. In this specification, such a retrieving technique will be referred to as an original document search.
For instance, according to the original document search disclosed in Japanese Patent Application Laid-Open (KOKAI) No. 8-263512, a document printed on printing paper is read by a scanner, digitized, and subjected to character recognition. A user then designates, as the scope of the search, a character string from the character recognition result, and a document that matches in both the content and the positional relation of the character string is retrieved.
Although the technique disclosed in Japanese Patent Application Laid-Open (KOKAI) No. 8-263512 is proposed for retrieving an electronic document based on a paper-printed document, problems remain in that the user must designate a character string to be used as a query after the document is scanned and subjected to character recognition, and must therefore bear the cumbersome operation of designating the scope of the search. It is of course possible to designate the entire document as the scope of the search, but in this case the character strings of the entire document must be subjected to matching. Because character recognition errors must be taken into consideration, this matching has to be ambiguous (error-tolerant) matching, which imposes an enormous processing load. Therefore, realistic response performance cannot be expected.
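The processing load of ambiguous matching can be illustrated with a minimal sketch. The code below is not the method of the cited publication; it assumes a standard dynamic-programming approximate substring match (edit distance against every position of the stored text), and the function names and the sample misrecognition of "l" as "1" are hypothetical.

```python
def min_substring_edit_distance(query, text):
    """Return the smallest edit distance between `query` and any substring of `text`.

    The table has len(query) x len(text) cells, so the cost grows with the
    product of the query length and the stored text length.
    """
    # previous[j]: best distance between the query prefix processed so far and
    # some substring of `text` ending at position j. An empty query matches
    # anywhere at cost 0, which lets the match start at any position.
    previous = [0] * (len(text) + 1)
    for i, q_char in enumerate(query, start=1):
        current = [i] + [0] * len(text)
        for j, t_char in enumerate(text, start=1):
            substitution = previous[j - 1] + (q_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current[j] = min(substitution, deletion, insertion)
        previous = current
    return min(previous)


def ambiguous_match(query, text, max_errors):
    """Return True if `query` occurs in `text` with at most `max_errors` edits."""
    return min_substring_edit_distance(query, text) <= max_errors


if __name__ == "__main__":
    stored_text = "a document retrieval system"
    recognized = "retrieva1"  # hypothetical recognition error: "l" read as "1"
    print(recognized in stored_text)                    # exact matching fails
    print(ambiguous_match(recognized, stored_text, 1))  # tolerant matching succeeds
```

Exact matching is a single substring test, whereas the tolerant match fills a table proportional to the query length times the text length for each stored document. When the query is the character strings of an entire scanned document and the database holds a large number of documents, this cost is incurred for every candidate, which is consistent with the poor response performance noted above.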