Language translation involves the conversion of sentences from one natural language, usually referred to as the "source" language, into another language, typically called the "target" language. When performed by a machine, e.g., a computer, such translation is referred to as automatic language translation or machine translation.
Many different methods for automatic language translation have been proposed and implemented over the last few decades. See Hutchins, W. J. and Somers, H. L., An Introduction to Machine Translation, (Academic Press, N.Y. 1992). Most translation systems utilize mapping via an intermediate representation. For example, in the so-called "interlingua" translation systems, the intermediate representation is a language-independent representation resulting from an initial analysis of the source language sentence. The intermediate representation is then converted into the target language by a generation phase. See, for example, Nirenburg et al., Machine Translation: A Knowledge-Based Approach, (Morgan Kaufmann, San Mateo, Calif. 1992). A second example of mapping via an intermediate representation is the "transfer" translation system. Such systems include three phases: analysis of the source language sentence into a source representation, conversion of the source representation into a target representation, and generation of a target sentence from the target representation. See van Noord et al., "An Overview of Mimo2," v. 6 Machine Translation, pp. 201-04, 1991.
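The three phases of a transfer system can be illustrated with a minimal sketch. It is not taken from any of the cited systems: the three-word English-to-Spanish lexicon, the part-of-speech tags, the single adjective-noun reordering rule, and the function names analyze, transfer, generate and translate are all assumptions made for illustration.

```python
# Toy English-to-Spanish data; every entry is an illustrative assumption.
LEXICON = {"the": "el", "red": "rojo", "car": "coche"}
TAGS = {"the": "DET", "red": "ADJ", "car": "NOUN"}


def analyze(sentence):
    """Analysis: source sentence -> source representation (word, tag) pairs."""
    return [(word, TAGS[word]) for word in sentence.lower().split()]


def transfer(source_repr):
    """Transfer: source representation -> target representation.

    Words are replaced through the lexicon, and each ADJ NOUN pair is
    reordered to NOUN ADJ (a single hypothetical transfer rule).
    """
    target = [(LEXICON[word], tag) for word, tag in source_repr]
    i = 0
    while i < len(target) - 1:
        if target[i][1] == "ADJ" and target[i + 1][1] == "NOUN":
            target[i], target[i + 1] = target[i + 1], target[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return target


def generate(target_repr):
    """Generation: target representation -> target sentence."""
    return " ".join(word for word, _ in target_repr)


def translate(sentence):
    """Run the three phases in order."""
    return generate(transfer(analyze(sentence)))


print(translate("the red car"))  # prints "el coche rojo"
```

An interlingua system would differ in that the analysis phase would produce a single language-independent representation, so no language-pair-specific transfer step would sit between analysis and generation.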
A second type of translation system can be classified as a "direct" translation system. Such direct methods do not use intermediate representations. Some of the earliest translation systems utilized direct methods; however, they were ad hoc in nature, depending on large collections of translation rules and exceptions.
Recently, more systematic direct translation methods have been proposed. One such method is based on a statistical model for mapping words of the source sentence into words and word positions in the target language. A drawback of that method is that it ignores the arrangement of phrases in the source and target sentences when mapping a word into its corresponding position in the target language sentence. The method therefore ignores lexical relationships that make one position in the target sentence more likely than another. See Brown et al., "A Statistical Approach to Machine Translation," v. 16 Computational Linguistics, pp. 79-85, 1990. In another direct method, a syntax tree is built up simultaneously for the source and target sentences, using special phrase structure rules that can invert the order of syntactic constituents. A drawback of this method is that it does not take into account word-to-word associations in the source and target languages. See Wu, D., "Trainable Coarse Bilingual Grammars for Parallel Text Bracketing," 1995 Proc. Workshop Very Large Corpora, Cambridge, Mass.
A third direct translation system has been proposed that uses standard left-to-right finite state transducers for translation. Using such standard finite state transducers limits how far a word in the target sentence can be displaced from the position of its counterpart in the source sentence. This is because, for non-trivial vocabularies, the number of transducer states required to allow arbitrary displacement becomes too large for practical use. See Vilar et al., "Spoken-Language Machine Translation in Limited Domains: Can it be Achieved by Finite-State Models?," 1995 Proc. Sixth Int'l. Conf. Theoretical and Methodological Issues in Machine Translation, Leuven, Belgium.
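The growth in the number of states can be made concrete with a rough count. The sketch below is an assumption made for illustration and is not the analysis given by Vilar et al.: it supposes that a left-to-right transducer handles reordering by remembering, in its state, up to max_delay source words that have been read but not yet translated, so that each distinct buffer content needs its own state. The function name count_buffer_states and its parameters are hypothetical.

```python
def count_buffer_states(vocabulary_size, max_delay):
    """Count distinct buffers of 0..max_delay pending words.

    Under the stated assumption, this is the number of states needed just
    to remember the pending words: the sum of vocabulary_size ** n for
    n from 0 to max_delay.
    """
    return sum(vocabulary_size ** n for n in range(max_delay + 1))


# A 1,000-word vocabulary with words delayed by up to 3 positions.
print(count_buffer_states(1000, 3))  # prints 1001001001
```

Under this assumption the count grows exponentially with the permitted delay, which is consistent with the observation that unbounded displacement is impractical for non-trivial vocabularies.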
Thus, there is a need for an improved system and method for automatic language translation.