Different languages may use different standard word orders in conventional sentence structures. For example, English typically uses a subject-verb-object sentence order, while German may use a different word order resulting from a preference for the finite verb to occupy the second position in a main clause. As another example, Japanese typically uses a subject-object-verb sentence structure.
When translating from one language to another with automated techniques such as machine translation, it may be necessary to identify and account for differences in sentence structure or syntax, i.e., the order in which words typically are placed in a sentence. If these differences are not accounted for, the translation may be inaccurate or may have a different implied or explicit meaning from the original source sentence. For example, mechanically translating from a subject-verb-object language to a subject-object-verb language may result in a mistranslation if the verb is not moved to the correct position in the target language. Thus, the target sentence may be read incorrectly or may be partially or completely nonsensical or confusing in meaning. An incorrectly placed word also may reduce the effectiveness of other models, such as a related language model, which in turn may reduce fluency and translation accuracy.
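The word-order problem described above can be illustrated with a minimal sketch. It assumes a tiny hand-written lexicon (`TOY_LEXICON`) with simplified romanized Japanese forms whose particles are attached to the nouns, and a hypothetical helper, `translate_word_for_word`; neither is drawn from any particular translation system.

```python
# Toy, hand-written lexicon (an assumption for illustration only).
# Target forms are simplified romanized Japanese with particles attached.
TOY_LEXICON = {"John": "Jon-ga", "ate": "tabeta", "apples": "ringo-o"}


def translate_word_for_word(tokens):
    """Translate each token in place, keeping the source word order."""
    return [TOY_LEXICON[token] for token in tokens]


source = ["John", "ate", "apples"]  # subject-verb-object
mechanical = translate_word_for_word(source)
print(" ".join(mechanical))  # Jon-ga tabeta ringo-o
```

The verb "tabeta" stays in the middle of the output, whereas a subject-object-verb language such as Japanese would place it last ("Jon-ga ringo-o tabeta").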
To address this issue, machine translation systems may use pre-ordering techniques when translating between languages that use different sentence structures. Pre-ordering techniques attempt to rearrange a source sentence to match the target language structure, prior to translating the individual tokens in the source sentence. Some conventional pre-ordering techniques use a supervised parser to achieve an accurate ordering. Generally, supervised parsers are systems that automatically annotate sentences with their syntactic structure, after being trained on examples whose syntactic structure has been annotated by humans. Other conventional pre-ordering techniques may attempt to re-order without the use of any parser.
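As a rough sketch of the pre-ordering idea, the following example rearranges a source sentence into the target order before translating its tokens. The role labels ("S", "V", "O") are supplied by hand here and stand in for the output of a parser; the names `TARGET_ORDER`, `TOY_LEXICON`, `preorder`, and `translate_preordered` are hypothetical, and the single-clause rule is a deliberate simplification that is not any particular conventional technique.

```python
# Rank of each grammatical role in a subject-object-verb target language.
TARGET_ORDER = {"S": 0, "O": 1, "V": 2}

# Toy, hand-written lexicon (an assumption for illustration only).
TOY_LEXICON = {"John": "Jon-ga", "ate": "tabeta", "apples": "ringo-o"}


def preorder(tagged_tokens):
    """Rearrange (word, role) pairs into the target-language role order.

    sorted() is stable, so words that share a role keep their relative order.
    """
    return sorted(tagged_tokens, key=lambda pair: TARGET_ORDER[pair[1]])


def translate_preordered(tagged_tokens):
    """Pre-order the source sentence, then translate token by token."""
    return [TOY_LEXICON[word] for word, _role in preorder(tagged_tokens)]


# Role labels are hand-supplied here; a real system would obtain them
# from a parser or some other source of syntactic information.
source = [("John", "S"), ("ate", "V"), ("apples", "O")]
print(" ".join(translate_preordered(source)))  # Jon-ga ringo-o tabeta
```

Because the reordering happens on the source side, the translation step itself can remain a simple left-to-right mapping of tokens.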