1. Field of Invention
The present invention relates to an information-retrieval-performance evaluating method, an information-retrieval-performance evaluating apparatus, and a storage medium containing an information-retrieval-performance-evaluation processing program for automatically and quantitatively evaluating retrieval performance of retrieving systems.
2. Description of Related Art
In the most common form of retrieval processing for retrieving desired information from a large amount of information, a very short keyword is input as a retrieval request, and a document or documents containing the keyword are output as a retrieval result. This method is very widely used. Hereinbelow, a method of this type is referred to as “keyword-oriented information retrieval”.
Recently, however, a retrieval method of a different type has appeared. This method performs not only the keyword-oriented information retrieval, but also information retrieval that allows input of a so-called “natural sentence” composed of a character string longer than the keyword referred to here. Hereinbelow, a method of this type is referred to as “natural-sentence-oriented information retrieval”.
The keyword-oriented information retrieval allows a user to input a keyword as a retrieval request, thereby retrieving information containing the keyword from a large amount of information stored in a database and outputting it. On the other hand, the natural-sentence-oriented information retrieval allows a user to input a natural sentence as a retrieval request, thereby searching for a document containing contents conceptually close to the natural sentence and, if any, outputting the document as a retrieval result.
Whichever retrieval type is used, that is, the keyword-oriented information retrieval or the natural-sentence-oriented information retrieval, appropriate information must be retrieved in response to an input retrieval request. The keyword-oriented information retrieval is expected to remain widely used, while the natural-sentence-oriented information retrieval is expected to attract increasing attention. Hereinbelow, when the term “information retrieval processing” is used without specific modifiers, it refers to information retrieval processing that is close to the natural-sentence-oriented information retrieval.
At present, there are various retrieving systems that implement information retrieval processing, as described above. However, it is difficult to judge whether the existing retrieving systems really produce appropriate outputs in compliance with retrieval requests made by users. In other words, quantitative evaluation of the performance of the retrieving systems is difficult.
A reason for the difficulty is that neither the concept represented by a natural sentence that a user inputs nor the concept represented by a document to be retrieved according to the input can be uniquely defined, so that the appropriateness of documents for the natural sentence must ultimately be determined by the user.
Under these circumstances, the only way to evaluate the retrieval performance of retrieving systems has been for a user to check a retrieval result produced according to a retrieval request (natural sentence) and judge the degree to which the result conforms to the retrieval request that the user has input.
A typical conventional method for evaluating the retrieval performance of retrieving systems is described below. In this method, a plurality of completely independent retrieval requests are prepared for a plurality of retrieval-target documents. A human determines the degree of similarity of the retrieval-target documents to the individual retrieval requests according to the correlation therebetween, determines a document corresponding to a right answer as a retrieval result for one of the retrieval requests, actually performs retrieval, and evaluates the retrieving system according to the retrieval results.
To determine an evaluation criterion in the above conventional method, however, the human must carry out various tasks, such as a task of determining the degree of similarity of the retrieval-target documents to the individual retrieval requests according to the correlation therebetween, and a task of determining a document corresponding to a right answer as a retrieval result for one of the retrieval requests, based on the degree of similarity of the documents to that retrieval request. The method also has a problem in that, when a completely independent document is correlated to a retrieval request, the result is apt to be subjective, so that the evaluation criterion may be inappropriate.
Accordingly, the present invention allows an appropriate evaluation criterion for performing retrieval-performance evaluation of information-retrieval processing systems to be determined simply, thereby enabling an appropriate retrieval-performance evaluation to be implemented.
To achieve the above, an information-retrieval-performance evaluating method of the present invention is arranged such that a text and a title (heading) of the text are recognized as a pair, a document having the text and the title is prepared, one of the text and the title is used as a retrieval request, the other one is used as retrieval-target information, and retrieval-performance evaluation is performed for a retrieving system according to a result of retrieval performed in response to input of the retrieval request.
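By way of a non-limiting illustration, the pairing described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the `Document` type, the `make_evaluation_pair` function, and the sample title and text are hypothetical names and data introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical container for one document with a title and a text."""
    doc_id: str
    title: str
    text: str


def make_evaluation_pair(doc, use_title_as_request=True):
    """Split one document into (retrieval request, retrieval-target information).

    One of the title and the text becomes the request; the other one
    becomes the information that an appropriate retrieval should return.
    """
    if use_title_as_request:
        return doc.title, doc.text
    return doc.text, doc.title


# Illustrative data only (assumed for this sketch).
doc = Document(
    doc_id="d1",
    title="City council approves new bridge",
    text="The city council voted on Tuesday to fund a new bridge across the river.",
)

request, target = make_evaluation_pair(doc)
print("request:", request)
print("target: ", target)
```

Because the title and the text already belong together, no human judgment is needed to decide which retrieval-target information is the right answer for a given request.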
Also, an information-retrieval-performance evaluating apparatus of the present invention is arranged such that a text and a title of the text are recognized as a pair, a document having the text and the title is prepared, one of the text and the title is used as a retrieval request, the other one is used as retrieval-target information, and retrieval-performance evaluation is performed for a retrieving system according to a result of retrieval performed in response to input of the retrieval request. The information-retrieval-performance evaluating apparatus may be configured to include a storage device for storing the retrieval-target information, a retrieving device for performing retrieval from the storage device according to the retrieval request in response to input of the retrieval request, and a retrieval-result evaluating device for performing the retrieval-performance evaluation for the retrieving system according to a result of retrieval performed by the retrieving device.
Also, a storage medium of the present invention contains an information-retrieval-performance-evaluation processing program with which a text and a title of the text are recognized as a pair, a document having the text and the title is prepared, one of the text and the title is used as a retrieval request, the other one is used as retrieval-target information, and retrieval-performance evaluation is performed for a retrieving system according to a result of retrieval performed in response to input of the retrieval request. The information-retrieval-performance-evaluation processing program includes an outputting step for outputting information paired with the retrieval request as a result of retrieval in response to input of the retrieval request, and a step for performing the retrieval-performance evaluation of the retrieving system according to the result of retrieval obtained in the outputting step.
According to the individual aspects of the present invention, a determination is made whether retrieval-target information paired with the retrieval request therefor exists in the result of retrieval performed according to one of the title and the text, which was input as the retrieval request. When retrieval-target information paired with the retrieval request exists, the retrieval-performance evaluation is performed for the retrieving system according to a condition where the retrieval-target information paired with the retrieval request exists in the result of retrieval.
Also, the retrieval-result evaluation processing may be arranged to retrieve multiple items of information according to the retrieval request, rank the individual results of retrieval according to the degree of conformity to the retrieval request, and perform the retrieval-performance evaluation for the retrieving system according to the ranks. When the retrieval-target information paired with the retrieval request exists in the results of retrieval, the retrieval-result evaluation is performed according to the condition in which that information exists in the results.
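A rank-based evaluation of this kind might be sketched as follows. The description does not fix a scoring formula, so the reciprocal-rank score used here is an assumption chosen for illustration, and `rank_of_paired_target` and `rank_score` are hypothetical names.

```python
def rank_of_paired_target(ranked_ids, paired_id):
    """Return the 1-based rank of the paired target, or None if it is absent.

    ranked_ids is assumed to be ordered by degree of conformity to the
    retrieval request, best match first.
    """
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id == paired_id:
            return position
    return None


def rank_score(rank):
    """Assumed scoring rule: reciprocal rank, or 0.0 when nothing was found."""
    return 0.0 if rank is None else 1.0 / rank


# Illustrative ranked result list for a request whose paired target is "d1".
ranked = ["d3", "d1", "d2"]
rank = rank_of_paired_target(ranked, "d1")
print(rank, rank_score(rank))
```

Any other monotonic function of the rank could be substituted for the reciprocal; the point is only that the position of the paired information in the ranked results yields a number that can be compared.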
Also, an arrangement may be such that multiple documents each having the text and the title thereof are prepared, the retrieval requests are input one at a time, and the results of retrieval for the individual retrieval requests are totaled, thereby performing the retrieval-performance evaluation.
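The totaling of results over multiple documents might look like the following sketch. The word-overlap retriever `toy_retrieve` is a hypothetical stand-in for the retrieving system under evaluation, and `total_results` and the three sample documents are likewise assumptions made only for this example.

```python
from collections import Counter


def tokenize(s):
    """Lowercase and split on whitespace; returns a set of words."""
    return set(s.lower().split())


def toy_retrieve(request, corpus, top_k=3):
    """Hypothetical retrieving system: rank texts by number of shared words.

    Ties are broken by document id so that the ordering is deterministic.
    """
    query = tokenize(request)
    scored = sorted(
        corpus.items(),
        key=lambda item: (-len(query & tokenize(item[1])), item[0]),
    )
    return [doc_id for doc_id, _ in scored[:top_k]]


def total_results(documents, retrieve, top_k=3):
    """Input each title as a request, one at a time, and tally the rank at
    which its paired text is returned. The key None counts misses."""
    corpus = {doc_id: text for doc_id, (_, text) in documents.items()}
    tally = Counter()
    for doc_id, (title, _) in documents.items():
        ranked = retrieve(title, corpus, top_k)
        rank = ranked.index(doc_id) + 1 if doc_id in ranked else None
        tally[rank] += 1
    return tally


# Illustrative documents: doc_id -> (title, text).
documents = {
    "a": ("bridge funding approved", "the council approved funding for a new bridge"),
    "b": ("storm closes schools", "schools closed on monday after a heavy storm"),
    "c": ("team wins final", "the home team wins the cup final in extra time"),
}

tally = total_results(documents, toy_retrieve)
print(dict(tally))
```

Passing a different `retrieve` function with the same signature would apply the same tally to another retrieving system, so the totals can be compared between systems.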
When a retrieval request is issued to a retrieving system, the present invention allows quantitative evaluation of whether appropriate information is retrieved for the retrieval request. To achieve this, a document having a text and a title of the text is prepared as the retrieval-target document, one of the title and the text is used as the retrieval request, and the other one of the title and the text is used as retrieval-target information. When the retrieval request is input, the retrieval performance of the retrieving system is evaluated by checking what information is retrieved thereby.
For example, a newspaper article has a heading and a text for the heading. The heading concisely summarizes the contents of the text, so if the heading is input and the text paired with the heading is retrieved, appropriate retrieval processing has been performed.
In this way, according to the present invention, one of the title and the text is used as the retrieval request, the other one of the text and the title is used as the retrieval-target information, and the retrieval performance of the retrieving system is evaluated. Therefore, the retrieval performance of individual retrieving systems can be evaluated precisely and quantitatively.
According to the evaluation method, for example, if a title is used as the retrieval request, multiple items of information are retrieved according to the title, the retrieval results are ranked based on the degree of conformity to the input title, and retrieval-performance evaluation is performed for the retrieving system according to the ranks. Accordingly, retrieval-performance evaluation of the retrieving system can be performed quantitatively. When this is implemented for individual retrieving systems, the individual retrieving systems can also be compared easily.
Furthermore, by preparing multiple documents each having a title and a text, multiple retrieval processes can be tried on a single retrieving system and the results of retrieval totaled, thereby allowing retrieval-performance evaluation to be performed even more suitably for the individual retrieving systems.