The discussion below is merely provided for general background information and is not intended to be used as an aid in determining the scope of the claimed subject matter.
Taking speech synthesis as an example, text-to-speech technology allows computerized systems to communicate with users using synthesized speech. Some speech synthesizers use letter-to-sound (LTS) conversion to generate the pronunciation of out-of-vocabulary (OOV) words. Person names are commonly OOV and may also originate from other languages. This is true, for example, of English, where many person names originate from other languages and their pronunciations are heavily influenced by the rules of the original languages. Therefore, the accuracy of name pronunciations generated by a typical English LTS converter is normally low. To improve performance, identifying the language origin of a word can be critical.
Language identification has been done for spoken languages. Using one technique, a speech utterance is first converted into a phoneme string by a speech recognition engine; the probability that the phoneme string belongs to each candidate language is then estimated using phoneme N-grams of that language; and finally the language with the highest likelihood is selected. Language identification has also been performed on web documents, in which additional information such as HTML (HyperText Markup Language) tags and letters specific to particular languages can provide useful cues.
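The selection step of the spoken-language technique can be written compactly. The notation below is an assumption introduced only for illustration (a phoneme string p_1 ... p_T, N-gram order N, and candidate language L); it is a sketch of the general maximum-likelihood rule described above, not a formula taken from any particular system.

```latex
\log P(p_1,\dots,p_T \mid L) \;\approx\; \sum_{t=1}^{T} \log P\bigl(p_t \mid p_{t-N+1},\dots,p_{t-1};\, L\bigr),
\qquad
\hat{L} \;=\; \arg\max_{L} \; \log P(p_1,\dots,p_T \mid L)
```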
However, the task of identifying the language origin of person names in a language, particularly English, can be more difficult during text conversion because all non-English characters are normally converted into similar English characters. For example, the German name ‘Andrä’ is written as ‘Andra’ in English and the French name ‘Aimé’ is written as ‘Aime’. Hence, the letter string is often the only information available.
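The loss of information described above can be illustrated with a minimal Python sketch. It assumes that the conversion amounts to removing diacritical marks through Unicode decomposition; the function name `strip_diacritics` is hypothetical, and actual transliteration conventions may differ (for example, letters such as ‘ß’ or ‘ø’ have no such decomposition and are left unchanged by this sketch).

```python
import unicodedata


def strip_diacritics(name: str) -> str:
    """Return `name` with combining diacritical marks removed.

    Assumption: the conversion into "similar English characters" is
    modeled as NFD decomposition followed by dropping combining marks.
    """
    # NFD splits a letter such as 'ä' into the base letter 'a' plus a
    # separate combining diaeresis character.
    decomposed = unicodedata.normalize("NFD", name)
    # Keep only characters that are not combining marks.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    for original in ("Andrä", "Aimé"):
        print(original, "->", strip_diacritics(original))
```

After this step the accent marks that hinted at the German or French origin are no longer present in the letter string.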
Letter-based N-grams have also been used with some success to identify the language origin of names among several candidate languages given a letter string. Typically, a letter-based N-gram model has to be trained for each candidate language beforehand. When a new name is analyzed, it is scored by every letter-based N-gram model, and the language whose model yields the highest likelihood is output as the language hypothesis. Although this technique can be used to hypothesize the language of origin of a word, room exists for improvement when determining language origin from a letter string.
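The letter-based N-gram approach described above can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of any particular system: the class `LetterNGramModel`, the function `identify_language`, the add-one smoothing, the default order of two (bigrams), and the constant `ALPHABET_SIZE` are all hypothetical choices. The two short surname lists are toy training data chosen only to make the example runnable and are far too small for practical use.

```python
import math
from collections import Counter

# Assumption: 26 lowercase letters plus the end-of-name marker "$".
ALPHABET_SIZE = 27


class LetterNGramModel:
    """Letter N-gram model with add-one smoothing (illustrative only)."""

    def __init__(self, names, n=2):
        self.n = n
        self.ngram_counts = Counter()    # (context, letter) -> count
        self.context_counts = Counter()  # context -> count
        for name in names:
            for context, letter in self._events(name):
                self.ngram_counts[(context, letter)] += 1
                self.context_counts[context] += 1

    def _events(self, name):
        # Pad with "^" start markers and a single "$" end marker.
        padded = "^" * (self.n - 1) + name.lower() + "$"
        for i in range(self.n - 1, len(padded)):
            yield padded[i - self.n + 1:i], padded[i]

    def log_likelihood(self, name):
        total = 0.0
        for context, letter in self._events(name):
            numerator = self.ngram_counts[(context, letter)] + 1
            denominator = self.context_counts[context] + ALPHABET_SIZE
            total += math.log(numerator / denominator)
        return total


def identify_language(name, models):
    """Score `name` with every model and return (best language, all scores)."""
    scores = {lang: model.log_likelihood(name) for lang, model in models.items()}
    return max(scores, key=scores.get), scores


# Toy training data (assumption: illustrative surname lists only).
MODELS = {
    "German": LetterNGramModel(
        ["schmidt", "schneider", "fischer", "weber", "wagner", "schulz", "hoffmann"]
    ),
    "French": LetterNGramModel(
        ["dubois", "moreau", "laurent", "lefebvre", "rousseau", "fournier", "girard"]
    ),
}

if __name__ == "__main__":
    hypothesis, scores = identify_language("Schmidt", MODELS)
    print(hypothesis, scores)
```

One model is trained per candidate language beforehand, and a new name is assigned to the language whose model gives the highest log-likelihood, mirroring the procedure in the paragraph above.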