1. Field of Invention
The present invention relates to a method of machine translation and a system of machine translation, and more particularly to a method of machine translation and a system of machine translation based on formalization processing.
2. Description of Related Arts
The inventor began his research on artificial intelligence (computer simulation of human intelligence) at the end of the 1970s. The core of artificial intelligence is knowledge processing (acquisition and use of knowledge); the basis of knowledge processing is knowledge representation (formal representation of commonsense knowledge and professional knowledge). A knowledge representation method for formally representing commonsense knowledge and professional knowledge (especially commonsense knowledge) universally and fully is a major problem that the artificial intelligence world has been eager to solve for a long time.
Natural language understanding technology aimed at natural language man-machine interface (natural language communication between a user and a computer) is an important technology of artificial intelligence. The basis of natural language communication between a user and a computer is formalizing a non-formal natural language. A method for formalizing a non-formal natural language is a major problem that the artificial intelligence world has been eager to solve for a long time.
Machine translation technology is an important technology of artificial intelligence. A method of machine translation which makes a substantial breakthrough in translation quality is a major problem that the artificial intelligence world has been eager to solve for a long time. Existing machine translation falls mainly into two categories: machine translation based on direct transformation from a source language into a target language, and machine translation based on an intermediate language. A machine translation system based on direct transformation performs, in turn, transformation at the word level, the lexical level, the syntactic level, and the semantic level; its transformation rules apply only to a specific pair of languages. A machine translation system based on an intermediate language first maps a source language onto an assumed intermediate expression and then maps the intermediate expression onto a target language. So far, there has been no universal intermediate language. No existing method of machine translation makes a substantial breakthrough in translation quality. The inventor holds that formalizing a non-formal source language is the basis of high-quality machine translation, and that existing methods have failed to achieve such a breakthrough precisely because none of them formalizes a non-formal source language.
In 1988, the inventor published a paper entitled “Meaning Formalization: A Theory about Natural Language Understanding, Automatic Translation, Knowledge Representation” at a symposium on natural language understanding of the Chinese Association on Artificial Intelligence (CAAI).
From 1989 to 1991, the inventor, as a visiting scholar at the Intelligence Technologies and Systems Laboratory of Tsinghua University, cooperated with a computer worker and, using the machine translation method described in the above paper, developed an experimental Japanese-Chinese machine translation system, which correctly translated a number of long sentences of complicated structure.
In 1998, the inventor submitted an application to the Patent Office of China for a patent on the invention “the Meaning Formalization Method of Automatic Translation” (application number: 98110793.1). The invention was a development of the machine translation method described in the above paper. It had the following main technical features: (1) Translation modes are stored in computer storage; a combination mode of a source language, together with a number of corresponding transformation modes for transforming the source language into a number of target languages, constitutes a translation mode; a combination mode contains grammatical attribute marks and semantic attribute marks of the component segments and a grammatical attribute mark and a semantic attribute mark of the composed segment; a basic combination rule, i.e. composing level by level according to combination modes, and a basic transformation rule, i.e. transforming level by level according to transformation modes, are stored in computer storage. (2) In the process of composing level by level according to combination modes, a computer processor finds in the computer storage all the combination modes that can be used and chooses one of them, a mode with a larger total number of uses taking priority over a mode with a smaller total number of uses; if the processor finds that the computer storage contains no combination mode that can be used, it performs backtracking. (3) If a combination mode contains marks used as signs of semantic relations between component segments and marks used as signs of semantic relations between component segments and a formed segment, combination modes and combination rules can be used in natural language understanding. The invention (1998) was a method of machine translation for automatically translating a non-formal source language into a non-formal target language.
An obvious limitation of the invention was that the translation could not be entirely correct. The only application of the invention was information exchange, to a certain degree, between people speaking their respective native languages.
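The level-by-level composition with usage-ordered choice and backtracking described for the 1998 application can be illustrated by the following minimal Python sketch. It is not taken from that application: the class name CombinationMode, the function compose, the attribute marks "N", "V", "VP", "X", "S" and the usage counts are all hypothetical, and each segment carries a single mark here where the application describes separate grammatical and semantic attribute marks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CombinationMode:
    parts: tuple   # marks of the component segments (two or more)
    result: str    # mark of the composed segment
    uses: int      # total number of prior uses (hypothetical counter)


def compose(marks, modes, goal):
    """Reduce a list of marks to [goal]; return the modes applied, or None."""
    if marks == [goal]:
        return []
    # Find every combination mode that can be used, at every position.
    candidates = []
    for mode in modes:
        n = len(mode.parts)
        for i in range(len(marks) - n + 1):
            if tuple(marks[i:i + n]) == mode.parts:
                candidates.append((mode, i))
    # A mode with a larger total number of uses is tried first.
    candidates.sort(key=lambda c: -c[0].uses)
    for mode, i in candidates:
        reduced = marks[:i] + [mode.result] + marks[i + len(mode.parts):]
        rest = compose(reduced, modes, goal)
        if rest is not None:
            return [mode] + rest
    # No usable mode leads to the goal from here: backtrack.
    return None


# Hypothetical combination modes; every mode has at least two parts,
# so each step shortens the list and the search terminates.
MODES = [
    CombinationMode(("N", "V"), "X", uses=9),   # frequent, but a dead end here
    CombinationMode(("V", "N"), "VP", uses=5),
    CombinationMode(("N", "VP"), "S", uses=3),
]

applied = compose(["N", "V", "N"], MODES, "S")
print([m.result for m in applied])
```

In this toy run the most frequently used mode is tried first, leads to a dead end, and is abandoned by backtracking before the successful composition is found.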
In 1999, the inventor submitted an application to the Patent Office of China for a patent on the invention “the Meaning Formalization Method of Computer-aided Translanguage Information Exchange” (application number: 99113471.0). It had the following main technical features: (1) A lexicon in which synonyms of a number of languages correspond to each other, combination marks, and relation marks are stored in computer storage; a user expresses information on a display device of a computer with words of a certain language, combination marks and relation marks, and the computer then transforms the displayed words of that language into corresponding words of another language. (2) Words of the same form and different meanings are distinguished from each other by attached words of similar meanings. (3) Key component marks (the object of the key component is identical to the object of the formed language segment). (4) A list of relation marks in which relation marks of a number of languages correspond to each other is stored in computer storage; the computer transforms displayed relation marks of one language into corresponding relation marks of another language. (5) After a user inputs a word, the computer displays a number of words of similar meanings for the user to choose from.
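Features (1), (2) and (4) of the 1999 application can be illustrated by the following minimal Python sketch. It is an illustration under stated assumptions, not the stored format of that application: the tables LEXICON and RELATION_MARKS, the function transform, the sample words and the mark spellings "<of>" and "<de>" are all hypothetical.

```python
# Hypothetical lexicon: each row links corresponding words across languages.
# A word is stored as (word, attached word of similar meaning); the attached
# word distinguishes words of the same form and different meanings.
LEXICON = [
    {"en": ("bank", "shore"), "fr": ("rive", "bord")},
    {"en": ("bank", "lender"), "fr": ("banque", "prêteur")},
    {"en": ("river", None), "fr": ("rivière", None)},
]

# Hypothetical list of relation marks corresponding across languages.
RELATION_MARKS = [
    {"en": "<of>", "fr": "<de>"},
]


def transform(items, src, dst):
    """Replace each word or relation mark in src by its counterpart in dst."""
    out = []
    for item in items:
        if isinstance(item, str):
            # A plain string is treated as a relation mark.
            row = next(r for r in RELATION_MARKS if r[src] == item)
            out.append(row[dst])
        else:
            # A (word, attached word) pair is looked up in the lexicon.
            row = next(r for r in LEXICON if r[src] == item)
            out.append(row[dst][0])
    return out


expression = [("bank", "shore"), "<of>", ("river", None)]
print(transform(expression, "en", "fr"))
```

The lookup is word for word and mark for mark; nothing in this sketch reorders or analyzes the expression, which matches the description of the user supplying the marks explicitly.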
A knowledge representation method is a method for describing knowledge as a data structure that a computer is able to deal with. The following are common knowledge representation methods: predicate logic representation, production representation, semantic network representation, frame representation, object-oriented representation, state space representation, etc. The invention (1999) was in essence a knowledge representation method. It is natural for the invention or any other knowledge representation method to contain a lexicon in which synonyms of a number of languages correspond to each other. It should be pointed out that the invention did not accord with natural languages because the lexicon of the invention was limited to notional words and relation marks displaced function words (vocabulary of any natural language comprises notional words and function words).