Machine translations involve the translation of information from a source language to a destination language via a computing device. Machine translations may be used to translate, for example, advertisements, government documents, academic works, text messages and emails, social networking posts, recordings of spoken language, and numerous other works.
There may be more than one possible way to translate a word, phrase, or sentence into the destination language. Although each of these possible translations may be correct in certain circumstances, some translations may not make sense in the context of the full translation. For example, assume that the phrase “very good” is translated into German. The word “very” is typically translated as “sehr”. However, the word “good” may be translated in different ways depending on the way that it is used. For example, the “good” in “good morning” is typically translated as “guten”, whereas the “good” in “that food is good” may be translated as “gut”. In this case, both “gut” and “guten” are reasonable translations of the word “good”, but “sehr gut” is a more preferable translation than “sehr guten”.
In order to determine which of multiple possible hypotheses is the most preferable, the translation system may apply a language model. The language model may, for example, consider how the translated word is used in the context of the larger translation. If one of the hypotheses is more likely than the others, the language model may recommend that hypothesis.
For example, consider a request to translate the phrase “la casa blanca” from Spanish into English. The language model may receive two hypotheses for the term “casa”. The first hypothesis may be “house”, while the second hypothesis may be “home”. Both are reasonable hypotheses for the translation of the word “casa” into English. However, in the context of the broader translation, which characterizes the “casa” as being “blanca”, it is more common to translate the word as “house” (i.e., “the white house”) rather than “home” (i.e., “the white home”). The language model is trained to analyze the context in which the translation appears, and identify which of the hypotheses is more likely.