The Internet has made it possible for people to connect and share information globally in ways previously undreamt of. Social media platforms, for example, enable people on opposite sides of the world to collaborate on ideas, discuss current events, or share what they had for lunch. In the past, this spectacular resource has been somewhat limited to communications between users having a common natural language (“language”). In addition, users have only been able to consume content that is in their language, or for which a content provider is able to determine an appropriate translation.
While communication across the many different languages used around the world is a particular challenge, several machine translation engines have attempted to address this concern. Machine translation engines enable a user to select or provide a content item (e.g., a message from an acquaintance) and quickly receive a translation of the content item. Machine translation engines can be created using training data that includes identical or similar content in two or more languages. Such multilingual training data is generally obtained from news reports, parliamentary proceedings, “wiki” sources, etc. However, machine translation engines created using this traditional multilingual training data have proven to be less than perfect. This is due in part to imperfections in the training data and in part to the inability of the engine creation process to correctly create mappings between phrases in different languages. As a result, the translations produced by machine translation engines are often distrusted.
The techniques introduced here may be better understood by referring to the following Detailed Description in conjunction with the accompanying drawings, in which like reference numerals indicate identical or functionally similar elements.