1. Field of the Invention
The present invention relates to technology for evaluating a summary of an article, a document or the like using a computer. More specifically, the present invention relates to technology for automatically performing evaluation processing of summaries of articles using supervised machine learning methods.
2. Description of the Related Art
In recent years, processing to automatically summarize an article or a document using a computer has become more widespread as information technology has developed. As a result, correct evaluation of summaries made using various automatic summary processing methods has become increasingly important.
Summary processing can be broadly classified into two types: summarizing by extracting important sentences from a target article, and summarizing by freely generating sentences based on the content of a target article. Summarizing by extracting important sentences is processing in which sentences present in a target article are extracted according to a prescribed summarization rate so as to compose a summary. Summarizing by freely generating sentences is processing in which a person freely composes sentences based on the content of a target article.
Summaries made by extracting important sentences can be evaluated automatically using information on which sentences should be extracted from an article. For example, a degree of importance indicating the extent to which a sentence should be extracted for a summary is pre-assigned to each sentence in the article, and the summary is then evaluated by adding up the degrees of importance of the extracted sentences. On the other hand, it is difficult to automatically evaluate a freely composed summary. This is because a number of appropriate summaries are likely to exist for a single article, and it is therefore very difficult to prepare correct information covering all appropriate summaries.
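The importance-based evaluation described above can be illustrated with a minimal sketch. The function name, the dictionary layout and the sample importance values are assumptions made for illustration only; the related art does not prescribe them.

```python
def score_extractive_summary(importance, extracted):
    """Add up the pre-assigned degrees of importance of the extracted sentences.

    importance: dict mapping a sentence index to its degree of importance
                (hypothetical values assigned in advance by a person)
    extracted:  iterable of sentence indices chosen for the summary
    """
    # A set is used so that a sentence listed twice is only counted once.
    return sum(importance[i] for i in set(extracted))


# Hypothetical article of four sentences with pre-assigned importance.
importance = {0: 3, 1: 0, 2: 2, 3: 1}
score = score_extractive_summary(importance, [0, 2])
print(score)
```

A sketch of this kind only works when every candidate sentence already has an importance value, which is why it does not carry over to freely composed summaries.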
In the related art, therefore, evaluation of summaries composed freely by hand has been carried out based on a person's knowledge or experience. The method shown in the following cited reference 1 exists as a method for automatically evaluating summaries in the related art. In the processing method of cited reference 1, summary evaluation is carried out using the recall ratio, the relevance ratio, and the F value, based on the conformity rate between sentences extracted by computer processing and important sentences selected by a person in advance.
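The three measures named above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of cited reference 1: the "relevance ratio" is assumed to correspond to what is commonly called precision, the F value is assumed to be the balanced harmonic mean of the two, and the function name and sample indices are hypothetical.

```python
def extraction_scores(system, reference):
    """Return (recall ratio, relevance ratio, F value) for extracted sentences.

    system:    indices of sentences extracted by the computer
    reference: indices of important sentences selected by a person in advance
    """
    system, reference = set(system), set(reference)
    matched = len(system & reference)  # sentences chosen by both
    recall = matched / len(reference) if reference else 0.0
    relevance = matched / len(system) if system else 0.0  # assumed: precision
    total = recall + relevance
    # Assumed: balanced F value (harmonic mean of the two ratios).
    f_value = 2 * recall * relevance / total if total else 0.0
    return recall, relevance, f_value


# Hypothetical example: the computer extracts 3 sentences, a person marked 4.
recall, relevance, f_value = extraction_scores([0, 2, 5], [0, 1, 2, 3])
print(recall, relevance, f_value)
```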
Freely made summaries can also be evaluated by determining the degree of similarity between the made summary and a correct summary prepared by a person in advance, using word frequency vectors.
“Cited reference 1: Shu Nobata et al., Important sentence extraction system integrating a plurality of evaluation criteria, Proceedings of the seventh annual conference of the Language Processing Society, pp. 301-304, 2001”.
In the processing for evaluating a freely made summary shown in cited reference 1, the degree of similarity between a target summary and a prepared correct summary is determined using word frequency vectors. The evaluation value of a summary therefore tends to be high if the distribution of keywords showing the content of the summary is similar to the distribution of keywords in the summary taken to be correct. Namely, if a summary includes certain words existing in the correct summary, the summary will receive a reasonably good evaluation even if its form as a passage is extremely difficult to read. It is a problem that this kind of summary is evaluated as a good summary.
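The weakness described above can be seen in a small sketch. Cosine similarity is assumed here as the similarity measure, and whitespace splitting is assumed as the word segmentation; cited reference 1 may use a different measure or tokenization, and the example sentences are invented for illustration.

```python
import math
from collections import Counter


def word_vector_similarity(summary, correct_summary):
    """Cosine similarity between the word frequency vectors of two texts."""
    va = Counter(summary.lower().split())
    vb = Counter(correct_summary.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


correct = "the court rejected the appeal on friday"
scrambled = "friday the on appeal rejected the court"  # same words, unreadable
unrelated = "shares rose sharply in early trading"

print(word_vector_similarity(scrambled, correct))
print(word_vector_similarity(unrelated, correct))
```

Because the scrambled text has exactly the same word counts as the correct summary, the measure cannot separate the two, even though the scrambled text is not readable as a passage.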
In the related art, a specialist evaluates a summary composed by hand. However, evaluation by a specialist inevitably depends upon the experience and skill of the evaluator. There are therefore cases where the evaluation of the same summary differs between evaluators, or where the evaluation differs depending on when it is carried out even though the evaluator is the same. Thus, when a summary composed by hand is evaluated based on the experience and skill of a specialist as in the related art, the evaluation is not reproducible and impartial evaluation of the summary is very difficult.
What is required, therefore, is automatic evaluation processing for summaries, including freely composed summaries, that is not influenced by the subjectivity of the evaluator and that gives an objective, reproducible evaluation.
Comparison of evaluations of summaries automatically generated by a computer and summaries freely composed by a specialist is now considered. Summaries generated by a computer are generally lower in summary accuracy, with regard to the appropriateness of the summarized content and the fluency of the sentences, than summaries composed by a person. In many cases, therefore, a summary produced by a computer is not natural enough to be indistinguishable from a summary produced by a person.
Suppose that “a good summary” means a summary which is natural to the extent that it is difficult to discriminate between this summary and a summary composed by hand. “A good summary” then means a summary produced by a computer whose sentence structure and summary content are similar to those of a summary produced by a person. It can therefore be understood that classifying a summary as “a summary by a computer” or “a summary by hand” can be used as an evaluation of the summary.
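The idea of using this classification as an evaluation can be sketched with a small supervised classifier. Everything in the sketch is an assumption made for illustration: the learning method (naive Bayes with add-one smoothing), the features (adjacent word pairs), the class and method names, and the toy training texts are not taken from this description.

```python
import math
from collections import Counter


def features(text):
    """Assumed feature set: adjacent word pairs, a rough proxy for fluency."""
    words = text.lower().split()
    return list(zip(words, words[1:]))


class SummaryOriginClassifier:
    """Naive Bayes over two labels, "human" and "computer" (both must occur)."""

    def fit(self, summaries, labels):
        self.doc_counts = Counter(labels)
        self.feat_counts = {"human": Counter(), "computer": Counter()}
        for text, label in zip(summaries, labels):
            self.feat_counts[label].update(features(text))
        self.vocab = set(self.feat_counts["human"]) | set(self.feat_counts["computer"])
        return self

    def _log_score(self, text, label):
        counts = self.feat_counts[label]
        total = sum(counts.values()) + len(self.vocab)  # add-one smoothing
        score = math.log(self.doc_counts[label] / sum(self.doc_counts.values()))
        for f in features(text):
            if f in self.vocab:  # features never seen in training are ignored
                score += math.log((counts[f] + 1) / total)
        return score

    def human_likeness(self, text):
        """Estimated probability that the summary was composed by hand."""
        h = self._log_score(text, "human")
        c = self._log_score(text, "computer")
        m = max(h, c)  # subtract the maximum for numerical stability
        return math.exp(h - m) / (math.exp(h - m) + math.exp(c - m))


# Toy training data, invented for illustration only.
train_texts = [
    "the court rejected the appeal",
    "the minister announced a new budget",
    "appeal court the rejected the",
    "budget new minister a announced the",
]
train_labels = ["human", "human", "computer", "computer"]
clf = SummaryOriginClassifier().fit(train_texts, train_labels)
print(clf.human_likeness("the court rejected the appeal"))
print(clf.human_likeness("appeal court the rejected the"))
```

Under these assumptions, a summary that the classifier judges more likely to be composed by hand receives a higher value, and the same input always yields the same value, which is the reproducibility that evaluation by a specialist lacks.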