1. Field of the Invention
The present invention relates to a method and an apparatus for developing a transfer dictionary used in a transfer-based machine translation system.
2. Related Art
In order to facilitate the understanding of new information written in a foreign language, machine translation systems that mechanically translate sentences from one language into another (for example, English into Korean) have become widely used.
Most commercial machine translation systems are based on a “transfer-based” mechanism, which comprises the steps of parsing, transferring and generating. First, in the parsing step, a given source language sentence is parsed using a parsing dictionary and parsing rules to obtain simplified syntactic information of the sentence, in which syntactic ambiguity has been resolved. Such source language syntactic information is generally represented and stored in a tree-type data structure, which is called a “source language syntactic tree.” Second, in the transferring step, the source language syntactic information is transferred, using a “transfer dictionary,” to target language syntactic structure information, which is likewise a simplified syntactic structure of the corresponding target language sentence. The transfer dictionary stores the corresponding relationships between source language and target language syntactic structures on a phrase basis. Finally, in the generating step, the target language sentence is generated based on the target language syntactic structure information. For the details of the transfer-based machine translation system, please refer to “Makoto Nagao, Chapter 7 Machine Translation, Comprehension of Natural Language.”
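By way of illustration only, the three steps described above may be sketched as follows. This is a minimal toy example and not the implementation of any particular system: the names Node, parse, transfer, generate, TRANSFER_LEXICON and TRANSFER_STRUCTURE are hypothetical, the parser is assumed to handle only three-word subject-verb-object sentences, and the dictionary entries are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """One node of a (toy) syntactic tree."""
    label: str                      # phrase category, e.g. "S", "SUBJ", "V"
    word: str = ""                  # surface word, used only by leaf nodes
    children: List["Node"] = field(default_factory=list)


def parse(sentence: str) -> Node:
    """Parsing step (assumption: input is exactly subject, verb, object)."""
    subj, verb, obj = sentence.lower().split()
    return Node("S", children=[Node("SUBJ", subj), Node("V", verb), Node("OBJ", obj)])


# Hypothetical transfer dictionary, stored on a phrase basis.
# Lexical part: (phrase label, source word) -> target phrase.
TRANSFER_LEXICON: Dict[Tuple[str, str], str] = {
    ("SUBJ", "i"): "나는",
    ("V", "love"): "사랑한다",
    ("OBJ", "you"): "너를",
}
# Structural part: source constituent order -> target constituent order
# (English SVO becomes Korean SOV).
TRANSFER_STRUCTURE: Dict[Tuple[str, ...], Tuple[str, ...]] = {
    ("SUBJ", "V", "OBJ"): ("SUBJ", "OBJ", "V"),
}


def transfer(tree: Node) -> Node:
    """Transferring step: map the source tree to a target tree."""
    source_order = tuple(child.label for child in tree.children)
    target_order = TRANSFER_STRUCTURE[source_order]
    by_label = {child.label: child for child in tree.children}
    children = [
        Node(label, TRANSFER_LEXICON[(label, by_label[label].word)])
        for label in target_order
    ]
    return Node(tree.label, children=children)


def generate(tree: Node) -> str:
    """Generating step: linearize the target tree into a sentence."""
    return " ".join(child.word for child in tree.children)


def translate(sentence: str) -> str:
    return generate(transfer(parse(sentence)))


if __name__ == "__main__":
    print(translate("I love you"))
```

In this sketch every lexical and structural entry is written by hand, and a sentence whose words or constituent order are missing from the two tables raises a KeyError, which illustrates how strongly the output depends on the coverage of the transfer dictionary.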
Because syntactic and semantic differences between the source language and the target language are reconciled by referring to the transfer dictionary, the transfer dictionary is a very important factor in translation quality. Conventionally, in order to develop a transfer dictionary, a linguistic expert who knows the vocabulary and syntax of both the source language and the target language very well has to explicitly define each corresponding relationship between source and target language sentences. Accordingly, the accuracy of the transfer dictionary depends too heavily on the expert's knowledge, and developing a transfer dictionary takes considerable time and cost. Furthermore, maintaining consistency among the entries in the dictionary requires much effort.
In view of the above problems, the present invention aims at providing a method and an apparatus for efficiently developing a high-quality transfer dictionary while minimizing human effort.