In a language with a phonemic orthography, the graphemes (the characters used to write the language) correspond directly to phonemes (spoken sounds). Some languages have a more phonemic orthography than others. The English language, by contrast, is considered to have an irregular mapping of graphemes to phonemes.
Still, there are instances in which English is written using a phoneme-to-grapheme mapping. For instance, people whose mother tongue is a phonemic language may write in English using the same mapping of graphemes to phonemes as they would use in their mother tongue. As another example, texts, messages, and the like used in social media applications and similar applications on the Internet are increasingly written with phonemic spelling (e.g., spelled the way the word sounds, is pronounced or voiced), for instance, to shorten the time taken to type messages. An example of an English word with an irregular grapheme-to-phoneme mapping is the word “night”, which may appear written phonemically as “nite.”
Standard English language processing services, such as the Unstructured Information Management Architecture (UIMA) from International Business Machines Corporation (IBM)®, the Natural Language Toolkit (NLTK), and AlchemyAPI from IBM®, may have difficulty processing such messages or text, since many of the phonemically spelled words would be considered erroneous and would not map to words in the English language.
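The difficulty can be illustrated with a minimal sketch. It assumes that a processing service validates tokens against a fixed English word list. The tiny vocabulary, the function name `out_of_vocabulary`, and the sample message are all hypothetical and chosen only for illustration; the sketch does not represent how UIMA, NLTK, or AlchemyAPI is actually implemented.

```python
# Hypothetical, deliberately tiny stand-in for a standard English lexicon.
ENGLISH_VOCABULARY = {"see", "you", "late", "at", "night"}


def out_of_vocabulary(message, vocabulary=ENGLISH_VOCABULARY):
    """Return the tokens of `message` that are not found in `vocabulary`."""
    tokens = message.lower().split()  # naive whitespace tokenization
    return [token for token in tokens if token not in vocabulary]


# The phonemic spelling "nite" is not in the word list, so it is flagged
# as unknown even though a human reader recognizes it as "night".
print(out_of_vocabulary("see you late at nite"))   # prints ['nite']
print(out_of_vocabulary("see you late at night"))  # prints []
```

Under this assumption, a phonemically spelled token is treated as an error instead of being resolved to its standard spelling, so any downstream analysis loses the word.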