Various types of machine translation systems are commonly employed to translate text or speech from a source language to a target language. Examples of machine translation systems include rule-based machine translation systems, example-based machine translation systems, and statistical machine translation (SMT) systems. In contrast to approaches utilized for rule-based machine translation systems or example-based machine translation systems, SMT systems can generate translations based upon statistical translation models with parameters derived from analysis of bilingual text corpora.
A phrase translation model, also known as a phrase table, can be one of the models of a phrase-based SMT system. A phrase table is commonly constructed by implementing a two-phase approach. As part of such a two-phase approach, bilingual phrase pairs can be extracted heuristically from automatically word-aligned training data. Thereafter, parameter estimation can be performed. Conventional parameter estimation techniques oftentimes include assigning each phrase pair a score estimated from counts of words or phrases in the same word-aligned training data.
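By way of illustration only, the two-phase approach described above can be sketched as follows. The sketch assumes a simplified consistency heuristic for phrase extraction (it takes the minimal target span for each source span and does not extend over unaligned words) and assumes relative-frequency scoring, p(target | source) = count(source, target) / count(source), as one example of a count-based estimate. The function names, the `max_len` parameter, and the toy corpus are hypothetical and are not taken from any particular SMT system.

```python
from collections import Counter


def extract_phrase_pairs(src, tgt, alignment, max_len=3):
    """Phase 1 (simplified): extract phrase pairs consistent with the alignment.

    `alignment` is a set of (source_index, target_index) word links.
    """
    pairs = []
    for i1 in range(len(src)):
        for i2 in range(i1, min(i1 + max_len, len(src))):
            # Target positions linked to any source word in [i1, i2].
            linked = [j for (i, j) in alignment if i1 <= i <= i2]
            if not linked:
                continue
            j1, j2 = min(linked), max(linked)
            if j2 - j1 + 1 > max_len:
                continue
            # Consistency check: no word in the target span may be
            # linked to a source word outside [i1, i2].
            if any(j1 <= j <= j2 and not (i1 <= i <= i2)
                   for (i, j) in alignment):
                continue
            pairs.append((" ".join(src[i1:i2 + 1]),
                          " ".join(tgt[j1:j2 + 1])))
    return pairs


def estimate_phrase_table(corpus, max_len=3):
    """Phase 2: score each phrase pair by relative frequency."""
    pair_counts = Counter()
    src_counts = Counter()
    for src, tgt, alignment in corpus:
        for s, t in extract_phrase_pairs(src, tgt, alignment, max_len):
            pair_counts[(s, t)] += 1
            src_counts[s] += 1
    # p(t | s) = count(s, t) / count(s)
    return {(s, t): c / src_counts[s] for (s, t), c in pair_counts.items()}


# Hypothetical toy corpus: (source words, target words, alignment links).
corpus = [
    (["das", "haus"], ["the", "house"], {(0, 0), (1, 1)}),
    (["das", "haus"], ["the", "home"], {(0, 0), (1, 1)}),
]
phrase_table = estimate_phrase_table(corpus)
```

In this toy corpus, the pair ("das", "the") is observed in both sentence pairs and receives a score of 1.0, whereas "haus" is paired once with "house" and once with "home", so each of those pairs receives a score of 0.5. Note that both phases operate on the same word-aligned data, consistent with the conventional approach described above.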