1. Field of the Invention
The present invention relates to an apparatus and a method for natural language processing, and to an apparatus and a method for speech recognition, which are suitable, for example, for recognizing or translating language input in written or spoken form.
2. Description of the Related Art
Many studies have been made on processing the written and spoken languages used by human beings (hereinafter referred to generally as natural language processing) by means of, for example, a computer. In the natural language processing methods used in many of these studies, however, the various linguistic phenomena occurring in natural language are described in advance by specialists in an abstract form, i.e., as grammatical rules, and are processed on the basis of those rules. These processing methods therefore suffer from the difficulty of describing grammatical rules.
That is, it is almost impossible, even for a specialist, to describe abstract grammatical rules that completely cover the various linguistic phenomena occurring in communication between human beings. Therefore, an expression not covered by the grammatical rules may be processed erroneously. Moreover, one who finds such a linguistic phenomenon generally does not know how best to modify the corresponding grammatical rules. If the grammatical rules are modified, an adverse side effect may result, in that linguistic phenomena that were processed correctly before the modification can no longer be processed.
Recently, as a fundamental means for solving this problem, natural language processing using examples of the actual use of a language has been studied extensively. This kind of processing is based on the method of preparing an example data base in which a large number of examples of the actual use of a language are registered instead of abstract grammatical rules, searching the example data base for one of the examples similar to input text data to be processed and performing natural language processing on the basis of the example searched out.
For example, Japanese Patent Laid-Open Publication No. 276367/1991 discloses an example-initiative machine translation system to which such natural language processing is applied. In this machine translation system, a large number of sets of examples of original sentences and examples of corresponding translations are registered in an example data base. When an original sentence written in a predetermined language is input as an input sentence, the example data base is searched for the example most similar to the original sentence. The original sentence is translated according to the corresponding translation of the example thereby obtained.
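The search-and-translate flow described above can be sketched as follows. This is a minimal illustration only, not the method of the cited publication: the example data base contents, the function names, and the word-overlap similarity measure are all assumptions introduced here, since the text does not specify how similarity is computed.

```python
from typing import List, Tuple

# Hypothetical example data base: (original sentence, translation) pairs.
# The entries are illustrative and not taken from the cited publication.
EXAMPLES: List[Tuple[str, str]] = [
    ("I would like to reserve a room", "heya o yoyaku shitai no desu ga"),
    ("where is the station", "eki wa doko desu ka"),
    ("how much is this", "kore wa ikura desu ka"),
]


def word_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the two word sets (an assumed stand-in measure)."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def most_similar_example(sentence: str,
                         examples: List[Tuple[str, str]]) -> Tuple[str, str]:
    """Search the example data base for the example closest to the input."""
    return max(examples, key=lambda ex: word_overlap(sentence, ex[0]))


def translate_by_example(sentence: str,
                         examples: List[Tuple[str, str]] = EXAMPLES) -> str:
    """Return the translation registered with the most similar example."""
    _, translation = most_similar_example(sentence, examples)
    return translation
```

For instance, `translate_by_example("where is the hotel")` selects the example "where is the station" and returns its registered translation unchanged. A practical system would also adapt the translation to the words that differ, which this sketch omits.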
In the case of speech language processing to which natural language processing using examples of the actual use of a language is applied, an ordinary speech recognition apparatus is used to determine a result of recognition of input speech, an example data base is searched for one of the examples most similar to the speech recognition result obtained as an input sentence, and translation or the like is performed by using the example searched out.
When a natural language is used as a means of communication between human beings, a sentence in a flow of conversation ordinarily has a meaning that reflects the flow preceding the sentence, which may be called the context or background of the conversation. Therefore, the context (conversation background) is thought to be an important factor in natural language processing.
The conventional methods of using examples of the actual use of a language, however, entail the problem described below. When an example to be used for processing an input sentence is searched for, the degree of similarity between the input sentence and each example is calculated only with respect to the similarity of the meanings of their words, as defined in a thesaurus in which words are hierarchized as elements of a tree structure on the basis of the similarity of their meanings (concepts). The context is not taken into consideration. As a result, it is uncertain whether the example found is truly suitable for processing the input sentence.
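The thesaurus-based word similarity described above can be illustrated with a small sketch. The toy thesaurus, the names `PARENT`, `path_to_root`, `depth`, and `word_similarity`, and the depth-based formula (similar in spirit to Wu-Palmer similarity) are assumptions chosen for illustration; the conventional methods referred to here may score similarity differently.

```python
from typing import Dict, List, Optional

# Hypothetical toy thesaurus stored as child -> parent links (root has None).
PARENT: Dict[str, Optional[str]] = {
    "entity": None,
    "place": "entity",
    "food": "entity",
    "station": "place",
    "hotel": "place",
    "apple": "food",
    "bread": "food",
}


def path_to_root(word: str) -> List[str]:
    """Return the nodes from the word up to the root, word first."""
    path: List[str] = []
    node: Optional[str] = word
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(word: str) -> int:
    """Number of edges between the word and the root."""
    return len(path_to_root(word)) - 1


def word_similarity(a: str, b: str) -> float:
    """Assumed measure: 2 * depth(common ancestor) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # The first node on b's path that also lies on a's path is the
    # lowest common ancestor; the root guarantees that one exists.
    lca = next(node for node in path_to_root(b) if node in ancestors_a)
    total = depth(a) + depth(b)
    if total == 0:  # both words are the root itself
        return 1.0
    return 2 * depth(lca) / total
```

With this toy tree, `word_similarity("station", "hotel")` is 0.5 and `word_similarity("station", "apple")` is 0.0. The score depends only on the positions of the two words in the tree, so it is the same regardless of what was said earlier in the conversation, which is the limitation noted above.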
In the conventional speech language processing, speech recognition processing is performed before any other natural language processing, the speech recognition result thereby obtained is fixed, and the example of the actual use of the language most similar to that speech recognition result is searched for. Natural language processing such as machine translation is then performed by using the example thereby found. In this processing, therefore, it is difficult to obtain a correct translation result if the speech recognition result is erroneous.
Further, in the conventional speech language processing, the probability or likelihood of the candidate speech recognition results produced by speech recognition processing is not taken into consideration. Therefore, it is also uncertain whether the example found for an input sentence is truly suitable for processing the input sentence.