1. Field of the Invention
The present invention relates to a method, apparatus, and recording medium for retrieving an optimum template pattern used in, for example, correcting and evaluating a translation, and in particular to a method, apparatus, and recording medium allowing an optimum template pattern which is most appropriate for any given input sentence to be obtained.
2. Description of the Prior Art
In technical translator training courses, a method is generally used in which students send their assigned translations to a tutoring center, where the translations are corrected, evaluated, and sent back to the students, who then review the corrections and learn their grades.
Conventionally, translations have been corrected by distributing the large number of translations sent in by students among a plurality of tutors, each of whom manually corrects the errors in the translations. However, there are problems with this method: manual correction requires much time, and, since it is not necessarily easy to employ tutors whose ability is at or above a certain level, the corrections may vary depending on the tutor.
The inventors previously proposed a language translation correction support apparatus that provides a result nearly equal to that from manual correction by experts, as disclosed in Japanese Patent Laid-Open No. 9-325673 specification.
Object of the Invention
In the translation correction support apparatus previously proposed by the inventors, a plurality of template patterns are provided, each of which corresponds to a model translation; a template pattern which matches the student's translation is determined; the translation is regarded as templates similar to the matching template pattern; and differences between each template and the student's translation are then determined to produce the resulting corrections. Very precise corrections can be obtained by this apparatus if the student's translations have only a few errors.
However, if the levels of the students' translations vary widely, as in general language schools for example, it is impossible to provide enough template patterns corresponding to model translations to cover all possible variations. As a result, it often occurs that a translation matches none of the template patterns of the model translation provided beforehand and, at worst, no template pattern of the model translation corresponding to the translation can be presented.
Such a problem occurs not only in the translation correction support apparatus, but also often in information retrieval in which certain information is retrieved based on a query in an information retrieval system using the Internet, for example.
The above-mentioned problem arises from the fact that template patterns for a model sentence (model translation) corresponding to input sentences (students' translations) are provided beforehand and a template pattern that matches a given input sentence is searched for among those template patterns. It is expected that, if a template pattern for an appropriate model translation which matches the input sentence is constructed each time an input sentence is provided, the most appropriate model translation template pattern for the input sentence may be obtained.
The present invention was devised based on the above-mentioned insight, and it is an object of the present invention to provide a method, apparatus, and recording medium for optimum template pattern retrieval, wherein the same template pattern as the one intended by the person who input the sentence can be obtained as the optimum template pattern for a model sentence.
To achieve the above-mentioned object, the present invention provides a method for retrieving an optimum template pattern by a Finite State Automaton scheme, wherein a set of templates for a model sentence (model translation) is provided beforehand, the set of templates being constructed by arranging a plurality of template blocks containing an arbitrary number of sentence components called states, including grammatically correct and/or erroneous components, and a candidate template pattern which is most appropriate for a given input sentence is selected from all the candidate template patterns which can be constructed from the set of templates; said method characterized by: assigning scores to all the words in the set of templates according to their importance; calculating an optimum level comparison value of each of the candidate template patterns by taking the total of the scores of all the words used in each candidate template pattern as the denominator and the total of the scores of words in each candidate template pattern which match words in the input sentence as the numerator; and determining as the optimum template pattern the candidate template pattern having the largest optimum level comparison value among the optimum level comparison values which provide the largest numerator.
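The calculation described above can be written compactly as follows. The notation is introduced here only for illustration and does not appear in the original text: s(w) is the score assigned to word w, I is the set of words in the input sentence, C is the set of all candidate template patterns, and the bracket [w ∈ I] equals 1 when the word matches a word in the input sentence and 0 otherwise. Treating a match as simple membership in I is an assumption.

```latex
N(P) = \sum_{w \in P} s(w)\,[\,w \in I\,], \qquad
D(P) = \sum_{w \in P} s(w), \qquad
V(P) = \frac{N(P)}{D(P)}

N_{\max} = \max_{P \in C} N(P), \qquad
P^{*} = \arg\max_{\substack{P \in C \\ N(P) = N_{\max}}} V(P)
```

Here V(P) is the optimum level comparison value of candidate P, and P* is the optimum template pattern: among the candidates whose numerator is largest, the one with the largest ratio, that is, the one carrying the least unmatched score.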
That is, according to the present invention, a set of templates is provided beforehand which is constructed by arranging a plurality of template blocks containing an arbitrary number of sentence components called states, including grammatically correct and/or erroneous components. From the set of templates, a large number of candidate template patterns corresponding to a given input sentence can be constructed by linking any one of the sentence components in a higher order template block with any one of the sentence components in the next higher order template block in sequence. Scores are assigned to all the words in the set of templates according to their importance. An optimum level comparison value of each of the candidate template patterns is then calculated by taking the total of the scores of all the words used in each candidate template pattern as the denominator and the total of the scores of words in each candidate template pattern which match words in the input sentence as the numerator, and the candidate template pattern having the largest optimum level comparison value among the optimum level comparison values which provide the largest numerator is determined. As a result, the same template pattern as the one intended by the person who input the sentence can be obtained as the optimum template pattern for the model sentence (model translation).
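The steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the template blocks, the word scores, the default score of 1, and the function names are all hypothetical, every state in one block is assumed to be linkable with every state in the next block, and a word is assumed to match when it appears anywhere in the input sentence.

```python
from fractions import Fraction
from itertools import product

# Hypothetical set of templates: an ordered list of template blocks.
# Each block lists alternative states (sentence components); a state is a
# tuple of words and may be grammatically correct or erroneous.
TEMPLATE_SET = [
    [("the", "device"), ("device",)],
    [("measures",), ("measure",), ("is", "measuring")],
    [("the", "temperature"), ("temperature",), ("the", "pressure")],
]

# Hypothetical importance scores; words not listed here score 1.
WORD_SCORES = {
    "device": 3, "measures": 3, "measure": 3, "measuring": 3,
    "temperature": 4, "pressure": 4,
}


def candidate_patterns(template_set):
    """Build every candidate by linking one state per block, in block order."""
    for states in product(*template_set):
        yield [word for state in states for word in state]


def score(word):
    return WORD_SCORES.get(word, 1)


def evaluate(pattern, input_words):
    """Return (numerator, optimum level comparison value) for one candidate."""
    numerator = sum(score(w) for w in pattern if w in input_words)
    denominator = sum(score(w) for w in pattern)
    return numerator, Fraction(numerator, denominator)


def optimum_pattern(template_set, sentence):
    input_words = set(sentence.lower().split())
    scored = [(p, *evaluate(p, input_words))
              for p in candidate_patterns(template_set)]
    # Keep only candidates with the largest numerator, then take the one
    # with the largest comparison value among them.
    best_numerator = max(n for _, n, _ in scored)
    finalists = [(p, v) for p, n, v in scored if n == best_numerator]
    return max(finalists, key=lambda item: item[1])


pattern, value = optimum_pattern(TEMPLATE_SET, "The device measure the temperature")
print(" ".join(pattern), value)
```

With these illustrative data, 18 candidates are built, and the selected pattern contains the erroneous state "measure", because that is the pattern the input sentence follows. Differences between the selected pattern and the correct states of the same blocks could then serve as the basis for corrections.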
The present invention is further characterized in that the following are provided: template set storage means for storing a set of templates for a model sentence, the set of templates being constructed by arranging a plurality of template blocks containing an arbitrary number of sentence components called states, including grammatically correct and/or erroneous components; input sentence storage means for storing an input sentence; candidate template pattern retrieval means for comparing each of the template blocks in the set of templates with the input sentence and retrieving all the candidate template patterns which can be constructed from the set of templates; candidate template pattern storage means for storing the retrieved candidate template patterns; word score storage means for storing scores assigned to all the words in the set of templates according to their importance by associating the scores with the words; matching word retrieval means for retrieving a word which matches a word in the input sentence; total score calculation means for calculating the total of the scores of all the words used in each of the candidate template patterns; matching word score calculation means for calculating the total of the scores of words in each candidate template pattern which match words in the input sentence; optimum level comparison value calculation means for calculating an optimum level comparison value by taking the total of the scores of all the words in each of the candidate template patterns as the denominator and the total of the scores of words in each candidate template pattern which match words in the input sentence as the numerator; and optimum template pattern determination means for comparing the optimum level comparison values of the candidate template patterns to determine, as the optimum template pattern, the candidate template pattern having the largest optimum level comparison value among the optimum level comparison values which provide the largest numerator.
Because each template block in the set of templates is compared with the input sentence and all the candidate template patterns which can be constructed from the set of templates are retrieved by the candidate template pattern retrieval means, a candidate template pattern corresponding to any given input sentence can be obtained. In addition, because the scores of words are taken into consideration when obtaining the optimum template pattern from among a plurality of candidate template patterns, a candidate template pattern containing a larger number of important words is selected as the optimum template pattern if the input sentence contains important words having a high score. Thus, the same template pattern as the one intended by the person who input the sentence can be determined as the optimum template pattern.
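The effect of the word scores can be illustrated with a small comparison. This is only a sketch: the two candidate patterns, the input words, the score values, and the function name `pick` are hypothetical, and unlisted words are assumed to score 1.

```python
def pick(candidates, scores, input_words):
    """Select by largest matched-score total, then by largest ratio."""
    def key(pattern):
        total = sum(scores.get(w, 1) for w in pattern)
        matched = sum(scores.get(w, 1) for w in pattern if w in input_words)
        return (matched, matched / total)
    return max(candidates, key=key)


candidates = [
    ["it", "is", "used", "to", "control", "pressure"],
    ["it", "can", "measure", "temperature"],
]
input_words = {"it", "is", "used", "to", "measure"}

# Every word scores 1: the first candidate wins on the count of matches.
uniform_pick = pick(candidates, {}, input_words)

# Content words score 5 (assumed values): the important word "measure"
# now outweighs the four low-scoring matches of the first candidate.
weighted = {"control": 5, "pressure": 5, "measure": 5, "temperature": 5}
weighted_pick = pick(candidates, weighted, input_words)

print(uniform_pick)
print(weighted_pick)
```

With uniform scores the first candidate is selected (4 matched against 2), whereas with the weighted scores the second is selected (6 matched against 4), since it contains the high-scoring word that appears in the input.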
The present invention is further characterized in that it causes a computer to perform processes of: storing a set of templates for a model sentence in a memory area, the set of templates being constructed by arranging a plurality of template blocks containing an arbitrary number of sentence components called states, including grammatically correct and/or erroneous components; storing an input sentence in the memory area; storing in the memory area scores assigned to all the words in the set of templates according to their importance by associating the scores with the words; and using an optimum level comparison value of each of the candidate template patterns, which is calculated by taking the total of the scores of all the words used in each candidate template pattern as the denominator and the total of the scores of words in each candidate template pattern which match words in the input sentence as the numerator, to determine the candidate template pattern having the largest optimum level comparison value among the optimum level comparison values which provide the largest numerator. The processes mentioned above allow the same template pattern as the one intended by the person who input the sentence to be obtained as the optimum template pattern for a model sentence, without providing the template pattern for the model sentence beforehand.