The field of mobile communication has seen rapid growth in recent years. Due to growth in the geographic coverage and bandwidth of various wireless networks, a wide variety of portable electronic devices, which include cellular telephones, smart phones, tablets, portable media players, and notebook computing devices, have enabled users to communicate and access data networks from a variety of locations. These portable electronic devices support a wide variety of communication types including audio, video, and text-based communication. Portable electronic devices that are used for text-based communication typically include a display screen, such as an LCD or OLED screen, which can display text for reading.
The popularity of text-based communications has surged in recent years. Various text communication systems include, but are not limited to, the Short Message Service (SMS), various social networking services, which include Facebook and Twitter, instant messaging services, and conventional electronic mail services. Many text messages sent using text communication services are of relatively short length. Some text messaging systems, such as SMS, have technical limitations that require messages to be shorter than a certain length, such as 160 characters. Even for messaging services that do not impose message length restrictions, the input facilities provided by many portable electronic devices, such as physical and virtual keyboards, tend to be cumbersome for inputting large amounts of text. Additionally, users of mobile messenger devices, such as adolescents, often compress messages using abbreviations or slang terms that are not recognized as canonical words in any language. For example, terms such as “BRB” stand for longer phrases such as “be right back.” Users may also employ non-standard spellings for standard words, such as replacing the word “cause” with the non-standard “kuz.” The alternative spellings and word forms differ from simple misspellings, and existing spell checking systems are not equipped to normalize the alternative word forms into standard words found in a dictionary. The slang terms and alternative spellings rely on the knowledge of other people receiving the text message to interpret an appropriate meaning from the text.
While the popularity of sending and receiving text messages has grown, many situations preclude the recipient from reading text messages in a timely manner. In one example, a driver of a motor vehicle may be distracted when attempting to read a text message while operating the vehicle. In other situations, a user of a portable electronic device may not be able to immediately hold the device and read messages from a screen on the device. Some users are also visually impaired and may have trouble reading text from a screen on a mobile device. To mitigate these problems, some portable electronic devices and other systems include a speech synthesis system. The speech synthesis system is configured to generate spoken versions of text messages so that the person receiving a text message does not have to read the message. The synthesized audio messages enable a person to hear the content of one or more text messages while preventing distraction when the person is performing another activity, such as operating a vehicle.
While speech synthesis systems are useful in reading back text for a known language, speech synthesis becomes more problematic when dealing with text messages that include slang terms, abbreviations, and other non-standard words used in text messages. The speech synthesis systems rely on a model that maps known words to an audio model for speech synthesis. When synthesizing unknown words, many speech synthesis systems fall back to imperfect phonetic approximations of words, or spell out words letter-by-letter. In these conditions, the output of the speech synthesis system does not follow the expected flow of normal speech, and the speech synthesis system can become a distraction. Other text processing systems, including language translation systems and natural language processing systems, may have similar problems when text messages include non-standard spellings and word forms.
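The fallback behavior described above can be illustrated with a minimal sketch. It assumes a toy lexicon with placeholder pronunciation strings; the names `LEXICON`, `pronounce`, and `synthesize` are hypothetical and do not correspond to any particular speech synthesis system.

```python
# Hypothetical toy lexicon mapping known words to placeholder pronunciations.
# A real speech synthesis system would use a far larger model; this is an assumption
# made only to illustrate the out-of-vocabulary fallback.
LEXICON = {
    "be": "B IY",
    "right": "R AY T",
    "back": "B AE K",
}


def pronounce(token):
    """Return a pronunciation string for one token.

    Known words are looked up in the lexicon. Unknown tokens fall back to
    being spelled out one character at a time.
    """
    key = token.lower()
    if key in LEXICON:
        return LEXICON[key]
    # Fallback for unknown tokens: read each letter or digit separately.
    return " ".join(ch.upper() for ch in key if ch.isalnum())


def synthesize(message):
    """Split a message on whitespace and pronounce each token in order."""
    return [pronounce(tok) for tok in message.split()]
```

In this sketch, `synthesize("be right back")` yields three word-level pronunciations, while `synthesize("BRB")` yields the single spelled-out entry `"B R B"`, which is the kind of output that interrupts the expected flow of normal speech.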
While existing dictionaries may provide translations for common slang terms and abbreviations, the variety of alternative spellings and constructions of standard words that are used in text messages is too broad to be accommodated by a dictionary compiled from standard sources. Additionally, portable electronic device users are continually forming new variations on existing words that would not be available in a standard dictionary. Moreover, the mapping from standard words to their non-standard variations is many-to-many; that is, a non-standard variation may correspond to different standard word forms and vice versa. Consequently, systems and methods for predicting variations of standard words to enable normalization of alternative word forms to standard dictionary words would be beneficial.
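The many-to-many relationship noted above can be sketched as a pair of lookup tables. Only the pairing of “kuz” with “cause” comes from the example given earlier; every other pair below, and the names `PAIRS` and `build_indexes`, are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical (variant, standard word) pairs. Only ("kuz", "cause") is taken
# from the text; the remaining pairs are illustrative assumptions.
PAIRS = [
    ("kuz", "cause"),
    ("kuz", "because"),
    ("cuz", "because"),
    ("cuz", "cause"),
    ("bcuz", "because"),
]


def build_indexes(pairs):
    """Build lookup tables in both directions from (variant, standard) pairs."""
    to_standard = defaultdict(set)  # variant -> candidate standard words
    to_variant = defaultdict(set)   # standard word -> observed variants
    for variant, standard in pairs:
        to_standard[variant].add(standard)
        to_variant[standard].add(variant)
    return to_standard, to_variant


to_standard, to_variant = build_indexes(PAIRS)
```

Because a single variant can map to more than one standard word, a fixed table of this kind cannot by itself select the intended word, and it cannot contain variations that users have not yet formed.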