(1) Field of the Invention
The present invention generally relates to a language conversion system, and more particularly to a language conversion system that can be used for document preparation, natural language processing, and translation between different languages. The language conversion system according to the present invention is applicable, for example, to machine translation systems, document preparation systems, CAI (Computer Assisted Instruction) systems, foreign language preparation support systems, and other OA (Office Automation) devices such as word processors and personal computers.
(2) Description of the Related Art
FIG. 1A shows a conventional machine translation method, generally referred to as a structure transfer method. In this method, an original is first analyzed, and an intermediate structure in the source-language (e.g. Japanese) is generated from the analysis results; this structure is chosen so that it can be easily transferred to an intermediate structure in the target-language (e.g. English). The intermediate structure in the source-language is then transferred to that in the target-language, and the surface-level translation in the target-language is generated from the target-language intermediate structure. However, even if a logically precise translation is created by this method, the translation tends to be difficult for native speakers of the target-language to understand when the two languages, such as Japanese and English, have different cultural backgrounds.
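The three stages of the structure transfer method (analysis, transfer, generation) can be illustrated by the following minimal sketch. It is not taken from any of the systems discussed here: the romanized input, the tiny lexicon, the `Clause` structure, and the single supported sentence pattern are all assumptions chosen only to make the stages visible.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a real system would rely on large dictionaries.
LEXICON = {"watashi": "I", "ringo": "an apple", "taberu": "eat"}


@dataclass
class Clause:
    """A minimal intermediate structure: a predicate with two arguments."""
    subject: str
    obj: str
    predicate: str


def analyze(tokens):
    """Analysis: build the source-language intermediate structure.

    Assumes the fixed pattern '<subject> wa <object> o <verb>'.
    """
    if len(tokens) != 5 or tokens[1] != "wa" or tokens[3] != "o":
        raise ValueError("unsupported sentence pattern")
    return Clause(subject=tokens[0], obj=tokens[2], predicate=tokens[4])


def transfer(src):
    """Transfer: map the source structure onto a target-language structure."""
    return Clause(LEXICON[src.subject], LEXICON[src.obj], LEXICON[src.predicate])


def generate(tgt):
    """Generation: linearize the target structure in English SVO order."""
    return f"{tgt.subject} {tgt.predicate} {tgt.obj}."


def translate(sentence):
    """Run analysis, transfer, and generation in sequence."""
    return generate(transfer(analyze(sentence.split())))


print(translate("watashi wa ringo o taberu"))
```

Because the target structure is derived element by element from the source structure, the output in such a scheme mirrors the way the source-language organizes the sentence, which is the limitation noted above.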
FIG. 1B shows another conventional machine translation method. In this method, a structure independent of any language is generated from the results of analyzing the original in the source-language. The surface-level translation in the target-language is then generated from this structure. However, this method does not consider the difference between the cultural backgrounds of the source-language and the target-language. Thus, in this case also, the translation tends to be difficult for native speakers of the target-language to understand.
The following prior art is known.
1) Japanese Laid Open Patent Application No. 63-44276 "AUTOMATIC GENERATOR FOR GENERATED SYNTAX":
In this system, an idea structure (conceptual structure) is obtained by analyzing the original in the source-language, as shown in FIG. 1B, and the surface level of the target-language is then generated from that structure. The conceptual structure is intended to be an intermediate structure independent of any language. In practice, however, such a language-independent intermediate structure cannot be obtained. As a result, the conceptual structure obtained by this system may depend on the source-language, and the translation generated from it may be difficult for native speakers to understand.
2) Kumano et al., "User-Cooperative Japanese Sentence Generation System", National Conference of the 42nd Information Processing Society (First term of 1991):
This system has been proposed on the assumption that the user of the translation operates it. That is, if a translation whose contents differ from those of the original is obtained, the user supplies changes, corrections, and supplemental information to the system. As a result, a high-quality translation can be obtained. Hence, the user requires considerable knowledge of the target-language.
3) Saito and Tomira, November 1984, "Automatically Writing a Letter in a Foreign Language", Symposium of the Natural Language Processing Technology:
This system has been proposed for a user having no knowledge of the target-language; a letter is written in the target-language using only knowledge of the source-language. According to this system, a high-quality letter can be written. However, because the system uses example sentences containing blanks and fills the required words into those blanks, sentences other than the examples cannot be translated.
4) Nomiyama, "Lexical Selection Mechanism in Machine Translation Using Target language Knowledge", National Conference of the 42nd Information Processing Society (First term of 1991):
Because the co-occurrence relationships between verbs and nouns depend on the language, translations are hard for users to read. In this translation technique, suitable terms for the translation are selected with this matter taken into consideration. However, the problem remains that the expression of the translated sentence as a whole is difficult to understand.
5) Satoshi Sato and Makoto Nagao, "Toward Memory-based Translation":
A large number of examples of translated sentences which can be relatively easily understood are gathered along with the originals corresponding to the examples, and the examples are stored in the system. The example closest to an original input is selected from the stored examples. The parts of the selected sentence that differ from the original input are detected and corrected. According to this system, a translation having a relatively high quality can be obtained. However, it is impossible to gather examples for all sentence patterns. Thus, if there is no example corresponding to an original input, no translation can be obtained at all.
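The example-based approach of item 5) can be illustrated by the following minimal sketch, which selects the closest stored example and detects the parts that differ from the input. It is not the method of the cited work: the example base, the similarity measure (Python's `difflib.SequenceMatcher` over word tokens), the `threshold` value, and the function name `closest_example` are all assumptions made for illustration, and the correction of the differing parts is left out.

```python
from difflib import SequenceMatcher

# Hypothetical example base of (source sentence, stored translation) pairs.
EXAMPLES = [
    ("kare wa hon o yomu", "He reads a book."),
    ("kanojo wa tegami o kaku", "She writes a letter."),
]


def closest_example(source, examples=EXAMPLES, threshold=0.6):
    """Return (example source, example translation, differing parts), or None.

    Each differing part is a pair (tokens in the example, tokens in the input).
    None is returned when no stored example is similar enough to the input.
    """
    src_tokens = source.split()
    best, best_score = None, 0.0
    for ex_src, ex_tgt in examples:
        score = SequenceMatcher(None, src_tokens, ex_src.split()).ratio()
        if score > best_score:
            best, best_score = (ex_src, ex_tgt), score
    if best is None or best_score < threshold:
        return None
    ex_tokens = best[0].split()
    matcher = SequenceMatcher(None, ex_tokens, src_tokens)
    diffs = [
        (ex_tokens[i1:i2], src_tokens[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return best[0], best[1], diffs


print(closest_example("kare wa tegami o yomu"))
print(closest_example("zenzen chigau bun desu"))
```

The second call returns `None` because no stored example shares any word with the input, which corresponds to the case described above in which no translation can be obtained.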
In addition, there has been proposed a system in which the contents of a source-language sentence are explained in the target-language using an idea center and target-language sentence patterns driven by the idea center, so that a translation is obtained. The idea center (hereinafter referred to as an IC for the sake of simplicity) is the central term of an idea to be explained by a sentence. Each of the target-language sentence patterns is formed of elements essential for (close to) the IC, these elements being obtained by analysis of actual target-language sentences.
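One way to picture an IC-driven sentence pattern is as a template keyed by the IC whose slots are the elements essential for that IC. The following sketch rests entirely on that assumption; the pattern table, the slot names, and the functions `required_elements` and `realize` are hypothetical and are not described in the proposed system.

```python
from string import Formatter

# Hypothetical pattern table: idea center (IC) -> target-language sentence pattern.
# The slot names stand for the elements essential for each IC.
PATTERNS = {
    "meeting": "{host} will hold a meeting on {date}.",
    "arrival": "{person} will arrive at {place}.",
}


def required_elements(pattern):
    """List the slot names that a pattern requires, in order of appearance."""
    return [name for _, name, _, _ in Formatter().parse(pattern) if name]


def realize(ic, elements):
    """Fill the pattern selected by the IC.

    Returns (sentence, []) when every essential element is available,
    or (None, missing element names) otherwise.
    """
    pattern = PATTERNS[ic]
    missing = [n for n in required_elements(pattern) if n not in elements]
    if missing:
        return None, missing
    return pattern.format(**elements), []


print(realize("meeting", {"host": "The committee", "date": "Friday"}))
print(realize("meeting", {"host": "The committee"}))
```

In this sketch the list of missing elements makes explicit which information the target-language pattern needs but the input did not provide.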
The popular conventional system converts the analysis result (tree structure) of the source-language into the tree structure of the target-language part by part, and then generates the target-language. According to this system, even if the lack of knowledge is compensated for by interaction with the user, the output tends to take a source-language-based structure. As a result, both the sentence form and the expression are unnatural for the target-language.
On the other hand, a system which obtains all the information through inquiries to the user must prepare an extremely large amount of data in order to cope with a wide range of documents. In addition, the operation is machine-initiated from the beginning, which makes the user feel controlled by the machine.
The information required in the target-language is very often missing from the source-language sentence. For this reason, a system in which there is no interaction with the user tends to generate incomplete output frequently.
Moreover, the information necessary in the target-language may not exist in the original sentence. In the conventional system, the translation is in many cases completed by supplementing such necessary information through a default process. However, if the default process is insufficient, erroneous information may be added and necessary information may be dropped, so this method tends to generate an unnatural translation.
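The risk of a default process can be seen in the following minimal sketch. The default table and the function `supplement` are hypothetical and do not represent any particular conventional system; the sketch only assumes that elements absent from the original sentence are filled with fixed values.

```python
# Hypothetical defaults for elements that the target-language requires
# but the original sentence may omit.
DEFAULTS = {"subject": "I", "number": "singular"}


def supplement(elements, defaults=DEFAULTS):
    """Return (completed elements, names of the elements that were defaulted)."""
    completed = dict(elements)
    defaulted = []
    for name, value in defaults.items():
        if name not in completed:
            completed[name] = value
            defaulted.append(name)
    return completed, defaulted


completed, defaulted = supplement({"verb": "go"})
print(completed, defaulted)
```

Here the omitted subject is always rendered as "I", which is erroneous whenever the original sentence implied a different subject.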
On the other hand, a translation system which uses examples of translated sentences requires an extremely large number of such examples to be prepared in order to cope with a wide range of documents.