1. Field of the Invention
This invention relates to language translation. More particularly, this invention relates to automatically translating from a source language to a target language using machine translation.
2. Introduction
Machine translation (MT) has been extensively formulated as a combination of selecting appropriate target-language lexical items and reordering them to form a syntactically correct target sentence. Most statistical MT systems rely on local lexical associations between source and target words and therefore depend strongly on word alignment algorithms as an essential part of their training procedure. This type of word association has also been extended to ordered groups of words (so-called phrases) in phrase-based MT systems. This extension not only leads to better lexical selection, but also makes the ordering part of the translation task easier, because some word ordering information is captured in the source-target phrase association. However, the translation quality of these classes of approaches depends on the accuracy of the alignment algorithm.
On the other hand, lexical items can be chosen based on the whole source sentence rather than a segment of it. In the Global Lexical Selection (GLS) approach, a bag of target words is selected for each source sentence and then, using a language model, a proper arrangement of the words is produced as the target sentence. Thus, using the global information can lead to better lexical accuracy while eliminating the need for word alignment completely.
Using the entire sentence gives the GLS model the ability to incorporate some lexico-syntactic information. For instance, in some languages the verb changes according to the subject's gender. In a phrase-based MT system, if the distance between subject and verb exceeds the span of the phrase window, the system would fail to select the correct verb form. In GLS, on the other hand, the subject information is considered while selecting the verb. Generally, more conceptual information is extracted from the source and used to choose the target lexicon. As a result, GLS is better able to cope with cases in which a source concept needs to be expressed with more than one phrase in the target language.
The statistical association between a target word e_i and the source sentence c can be described by the conditional probability p(e_i|c), and therefore the target words can be ranked and eventually selected based on this posterior probability. Given a source sentence, the presence or absence of each target word in the vocabulary is decided by a binary classifier. A maximum entropy model is used, and the classifiers are trained with a selection of source n-grams as the model features.
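The selection step described above can be illustrated with a minimal sketch. It assumes one binary maximum entropy (logistic) classifier per target word, with source unigrams and bigrams as features. The function names, the toy weights, the 0.5 threshold, and the Spanish-English example are all hypothetical and are not taken from the GLS system itself.

```python
import math


def source_ngrams(tokens, max_n=2):
    """Return the set of all source n-grams (as tuples) up to length max_n."""
    return {
        tuple(tokens[i:i + n])
        for n in range(1, max_n + 1)
        for i in range(len(tokens) - n + 1)
    }


def word_posterior(weights, bias, features):
    """Binary maximum entropy estimate of p(e_i | c) for one target word.

    `weights` maps a source n-gram to its learned weight; n-grams that are
    absent from the map contribute nothing to the score.
    """
    score = bias + sum(weights.get(f, 0.0) for f in features)
    return 1.0 / (1.0 + math.exp(-score))


def select_bag(model, tokens, threshold=0.5, max_n=2):
    """Select every target word whose posterior exceeds the threshold."""
    features = source_ngrams(tokens, max_n)
    scored = {
        word: word_posterior(weights, bias, features)
        for word, (weights, bias) in model.items()
    }
    return {word: p for word, p in scored.items() if p > threshold}


# Hypothetical, hand-set parameters: target word -> (n-gram weights, bias).
# In a real system these would be learned from parallel training data.
TOY_MODEL = {
    "the": ({("la",): 2.5, ("la", "casa"): 0.5}, -1.0),
    "house": ({("casa",): 3.0}, -1.5),
    "white": ({("blanca",): 3.0}, -1.5),
    "dog": ({("perro",): 3.0}, -1.5),
}

bag = select_bag(TOY_MODEL, ["la", "casa", "blanca"])
```

Here the bag contains "the", "house", and "white" but not "dog", since no source n-gram supports it. Note that the result is unordered; arranging the selected words is a separate step.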
For word ordering, the GLS system relies on constrained permutations to create variants that are rescored using a target language model. A significant drawback of this approach is that larger permutation windows are necessary for better translation quality, but the computational cost of enumerating the permutations limits the size of the window that is practical.
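The trade-off between window size and cost can be made concrete with a small sketch. It assumes one common form of constraint, in which the next output word must be drawn from the first `window` words not yet placed, and it uses a toy bigram language model for rescoring. The constraint definition, the function names, and the log-probabilities are illustrative assumptions, not the specific method of the GLS system.

```python
def windowed_permutations(words, window):
    """Yield every ordering in which each next word is taken from the
    first `window` words that have not been placed yet."""
    def extend(prefix, remaining):
        if not remaining:
            yield tuple(prefix)
            return
        for k in range(min(window, len(remaining))):
            rest = remaining[:k] + remaining[k + 1:]
            yield from extend(prefix + [remaining[k]], rest)

    yield from extend([], list(words))


def bigram_logprob(sentence, bigram_lp, unseen_lp=-10.0):
    """Score a word sequence with a bigram model; unseen bigrams receive
    a fixed penalty (a crude stand-in for real smoothing)."""
    padded = ("<s>",) + tuple(sentence) + ("</s>",)
    return sum(bigram_lp.get(pair, unseen_lp)
               for pair in zip(padded, padded[1:]))


def best_order(words, bigram_lp, window):
    """Return the highest-scoring ordering among the permitted permutations."""
    return max(windowed_permutations(words, window),
               key=lambda s: bigram_logprob(s, bigram_lp))


# Hypothetical bigram log-probabilities for a toy target language model.
TOY_LM = {
    ("<s>", "the"): -0.5,
    ("the", "white"): -1.0,
    ("white", "house"): -0.7,
    ("house", "</s>"): -0.4,
}

result = best_order(["house", "the", "white"], TOY_LM, window=2)
```

With a window of 2 the search reaches the ordering "the white house"; with a window of 1 only the input order is available. Under this particular constraint, n words and a window of w (with w no larger than n) produce w! times w^(n-w) candidates, which shows why widening the window quickly becomes impractical.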