The present disclosure relates to machine translation.
Machine translation systems are used to automatically translate text, speech, or other content from a source language to a target language. Conventional machine translation systems include dictionary-based machine translation, model-based machine translation, and instance-based machine translation.
Instance-based machine translation systems typically include a repository of source data and a corresponding repository of target data, for example, a collection of documents in the source language and corresponding translated documents in the target language. An input segment, for example a sentence, is translated from the source language to the target language by identifying a matching segment in the source data. Typically, a matching segment is found by testing the input segment against each segment in the source data, one at a time, to identify an exact match. The target language segment corresponding to the matching source segment is then provided as the translation of the input segment.
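By way of illustration only, the sequential exact-match lookup described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an implementation of any particular system: the function name translate_by_exact_match, the representation of the repositories as two aligned lists, and the sample sentences are all hypothetical.

```python
from typing import Optional, Sequence


def translate_by_exact_match(
    input_segment: str,
    source_segments: Sequence[str],
    target_segments: Sequence[str],
) -> Optional[str]:
    """Return the stored translation of input_segment, or None if absent.

    Assumption: source_segments[i] and target_segments[i] are a
    source/target pair, i.e. the two repositories are aligned by index.
    """
    if len(source_segments) != len(target_segments):
        raise ValueError("source and target repositories must be aligned")

    # Test the input segment against each source segment, one at a time.
    for index, source_segment in enumerate(source_segments):
        if source_segment == input_segment:  # exact match only
            return target_segments[index]

    # No source segment matched exactly, so no translation is available.
    return None


# Hypothetical sample data, used only to exercise the sketch.
source_repository = ["Good morning.", "Thank you."]
target_repository = ["Bonjour.", "Merci."]

match = translate_by_exact_match("Thank you.", source_repository, target_repository)
miss = translate_by_exact_match("Thank you", source_repository, target_repository)
```

In this sketch, each lookup compares the input against every stored source segment in turn, so the work per input segment grows with the size of the repository, and an input that differs from a stored segment by even one character (as in the second call, which omits the final period) yields no translation.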