Hereinafter, the conventional art will be described, taking as an example an interpreting apparatus, which is one type of language transferring apparatus and which translates input speech into another language (this process is hereinafter referred to as interpretation).
In an interpreting apparatus, interpretation is realized by sequentially performing speech recognition, which transfers an uttered sentence input as a sound signal into an output sentence expressed as a word text string, and language translation, which receives the sentence expressed as the word text string and translates it into a sentence of another language. The language translating section is configured by: a language analyzing section which analyzes the syntactic or semantic structure of the input sentence; a language transferring section which transfers the input sentence into another language on the basis of a result of the analysis; and an output sentence producing section which produces a natural output sentence from a result of the translation.
However, a problem arises in a case where the speech recognizing section erroneously recognizes a part of the uttered sentence, or in a case where the uttered sentence itself is syntactically and semantically unnatural, for example because chiming, restating, or the like is inserted into the sentence, or because the utterance ends before the sentence has been completed. In such cases, even when a result of speech recognition is input into the language analyzing section, the analysis fails and therefore no translation result is output.
In order to solve the problem, a configuration has been proposed in which a sentence is divided into phrases, intraphrase rules and interphrase rules are produced separately, and an incomplete utterance is analyzed by using only the intraphrase rules, so that a result of the analysis can still be output (for example, Takezawa and Morimoto: The Transaction of the Institute of Electronics and Communication Engineers D-II, Vol. J79-D-II(12)). FIG. 14 shows an example of intraphrase and interphrase rules of the conventional art. In this example, with respect to a corpus example 301 of “KONBAN, SINGLE NO HEYA NO YOYAKU ONEGAI NE”, the intraphrase rules are described in a tree structure, as in intraphrase rules 302, on the basis of grammar rules which are common also to written language, and the interphrase rules are described in terms of the adjacency probabilities among phrases in a training corpus, as in interphrase rules 303.
When an input sentence is to be analyzed, the intraphrase rules are applied sequentially to the phrases, starting from the beginning of the sentence. The input sentence is analyzed while the phrases are connected to one another so that, for each phrase, phrase candidates of higher adjacency probability are adjacent to each other. In this sentence analyzing method, even when a part of a sentence is erroneously recognized and the usual analysis of the whole sentence fails, the phrases in the portion which does not include the erroneous recognition can be correctly analyzed. The scheme is therefore designed so that a translation result can be partially output by translating only the phrases which have been analyzed.
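The connection of phrase candidates by adjacency probability described above can be illustrated by the following minimal sketch. It is not the implementation of the cited conventional art. The function names, the toy training corpus, the misrecognized candidate “HEIYA NO”, and the floor value used for unseen phrase pairs are all assumptions introduced only for illustration, and the adjacency probability is assumed here to be the relative frequency with which one phrase directly follows another.

```python
from collections import Counter
from itertools import product


def adjacency_probabilities(corpus):
    """Estimate P(right | left) for adjacent phrases in a training corpus.

    corpus is a list of sentences, each given as a list of phrases.
    """
    pair_counts = Counter()
    left_counts = Counter()
    for phrases in corpus:
        for left, right in zip(phrases, phrases[1:]):
            pair_counts[(left, right)] += 1
            left_counts[left] += 1
    return {pair: n / left_counts[pair[0]] for pair, n in pair_counts.items()}


def best_phrase_sequence(candidates, probs, floor=1e-6):
    """Pick one candidate per position so that the product of adjacency
    probabilities is highest. Unseen pairs receive the (assumed) floor value.
    """
    best, best_score = None, -1.0
    for seq in product(*candidates):
        score = 1.0
        for left, right in zip(seq, seq[1:]):
            score *= probs.get((left, right), floor)
        if score > best_score:
            best, best_score = list(seq), score
    return best, best_score


# Toy training corpus (hypothetical, already divided into phrases).
training_corpus = [
    ["KONBAN", "SINGLE NO", "HEYA NO", "YOYAKU", "ONEGAI NE"],
    ["KONBAN", "HEYA NO", "YOYAKU", "ONEGAI NE"],
]
probs = adjacency_probabilities(training_corpus)

# Phrase candidates per position; "HEIYA NO" stands for a misrecognition.
candidates = [["KONBAN"], ["HEYA NO", "HEIYA NO"], ["YOYAKU"]]
sequence, score = best_phrase_sequence(candidates, probs)
print(sequence, score)
```

The exhaustive search over all candidate combinations is chosen only for brevity; a practical analyzer would more likely use dynamic programming over the candidate lattice.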
In order to solve the problem, another method has been proposed which, unlike the conventional art in which language analysis is performed in accordance with a grammar, operates as follows. Parallel-translation phrases of corresponding source language and target language sentences are extracted from uttered sentence examples, including uttered sentences which cannot be analyzed by the conventional grammar. A parallel-translation phrase dictionary is produced in which each phrase pair is described in a form that is generalized as far as possible, and language analysis and language transference are performed by using the dictionary (for example, Furuse, Sumida, and Iida: The Transaction of Information Processing Society of Japan Vol 35, no 3, 1994-3). FIG. 15 shows a language transference rule producing apparatus of the conventional art. Before interpretation is performed, a parallel-translation phrase dictionary is produced in advance from an uttered sentence parallel-translation corpus. In this method also, in consideration of cases where some of the words are erroneous or omitted, an uttered sentence is divided into phrases, and intraphrase rules and dependency rules between the phrases are produced. First, a morphological analyzing section 360 analyzes the morphemes of the source language sentence and the target language sentence, and transfers the sentences into morpheme strings. Next, a phrase determining section 361 divides the morpheme strings of the source language and the target language into phrase units, and then produces intraphrase rules and dependency relationship rules between the phrases. Here, each phrase unit is determined manually, taking into consideration not only that it is a semantically consistent unit but also that the correspondence relationship in the parallel translation is apparent within the partial sentence.
For example, a parallel-translation sentence example consisting of “HEYA NO YOYAKU O ONEGAISHITAINDESUGA” and “I'd like to reserve a room” is divided into two parallel-translation phrases, namely (a) “HEYA NO YOYAKU” and “reserve a room”, and (b) “O ONEGAISHITAINDESUGA” and “I'd like to”, and the dependency relationship between them is formulated as a rule of “(a) O (b) SURU” and “(b) to (a)”. The parallel-translation phrases are stored in a parallel-translation phrase dictionary 362, and the dependency relationship between the phrases, which is expressed in the form of a parallel translation, is stored in an interphrase rule table 363. This process is performed on all uttered sentences included in the parallel-translation corpus. The division into phrases and the dependency relationship of the phrases are determined depending on the semantic information of a sentence and on factors such as the degree to which the sentence is ungrammatical. It is therefore difficult to determine them automatically for each sentence, and conventionally they are determined manually.
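The relationship between the parallel-translation phrase dictionary 362 and the interphrase rule table 363 can be pictured with the following minimal sketch. The data layout, the names phrase_dictionary, interphrase_rules, and transfer, and the placeholder syntax are assumptions made only for illustration and are not taken from the cited conventional art. In addition, the target-side pattern is simplified here to “(b) (a)”, because the target phrase (b) as quoted above already ends in “to”.

```python
import re

# Parallel-translation phrase dictionary (cf. 362):
# source-language phrase -> target-language phrase.
phrase_dictionary = {
    "HEYA NO YOYAKU": "reserve a room",
    "O ONEGAISHITAINDESUGA": "I'd like to",
}

# Interphrase rule table (cf. 363): each dependency relationship is held as a
# pair of source-side and target-side patterns over phrase slots (a) and (b).
# The target pattern is an assumed simplification of "(b) to (a)".
interphrase_rules = [
    {"source": "(a) O (b) SURU", "target": "(b) (a)"},
]


def transfer(bindings, rule, dictionary):
    """Fill the target-side pattern of a rule with the translations of the
    source phrases bound to its slots."""
    def substitute(match):
        return dictionary[bindings[match.group(1)]]
    return re.sub(r"\((\w)\)", substitute, rule["target"])


# Source phrases bound to slots (a) and (b) for the example sentence.
bindings = {"a": "HEYA NO YOYAKU", "b": "O ONEGAISHITAINDESUGA"}
result = transfer(bindings, interphrase_rules[0], phrase_dictionary)
print(result)
```

Because both tables are populated by hand in the conventional art, every entry of the kind shown here would have to be produced manually for each sentence of the corpus.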
In the sentence analyzing means of the first conventional example, however, the phrases to be handled are language-dependent phrases which depend only on the source language, and they often fail to coincide with the phrase units of the target language. Therefore, the means has a problem in that, even when phrases which are correct in the source language are input into the language transferring section, the phrases often cannot be accepted in the end. The scheme of the first conventional example can also be realized by using language-independent phrases. In this case, however, the analysis of language-independent phrases must be produced manually, which causes further problems in that the development requires a lot of time and that rule performance is degraded by fluctuation in the criteria of the manual production.
In the method of producing a parallel-translation phrase dictionary in the second conventional example, there is no means for automatically analyzing the semantic information and grammatical information of an uttered sentence, and hence such information must be produced manually. Therefore, the method has problems in that the development requires a lot of time and that rule performance is degraded by fluctuation in the criteria of the manual production. Moreover, when the target task of an interpreting apparatus is changed, or when the source language or the target language is changed, rules which have once been established cannot be applied, and all of the rules must be produced again. The development is therefore inefficient and cumbersome.
In the phrase dictionary 362 and the interphrase rule table 363, a phrase unit is determined with emphasis placed on the correspondence relationships in the parallel-translation corpus, and it is not evaluated whether the phrase unit is adequate for recognition by the speech recognizing section 364. It is difficult to determine a phrase unit while manually judging whether the phrase is adequate for speech recognition. The method therefore has a problem in that, when recognition is performed by using the determined phrases, the recognition rate is not guaranteed.