Computer software, for example, a text editor, enables users to enter characters of a writing system, such as the Roman characters used in writing English text, using input devices, e.g., keyboards, and, in response, displays the entered characters on a display device. In many writing systems, sequences of characters form words. For example, in writing English, sequences are formed by arranging characters, including consonants and vowels, adjacent to each other. The consonant or vowel that forms the first letter of a word is a first letter sequence of the word. When a second letter is added to the first letter, the combination of the first letter and the second letter forms a second sequence, and the first sequence is a subsequence (and more specifically, a prefix) of the second sequence. A third sequence is formed when a third letter is added to the second sequence, and so on.
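The way each letter sequence is a prefix of the next can be illustrated with a minimal sketch. The function name `prefixes` and the sample word are hypothetical and chosen only for illustration; they are not part of any described system.

```python
def prefixes(word):
    """Return every non-empty prefix of `word`, shortest first.

    Hypothetical helper: each entry is the sequence formed after one
    more letter has been added to the previous sequence.
    """
    return [word[:i] for i in range(1, len(word) + 1)]


sequences = prefixes("cat")
print(sequences)  # ['c', 'ca', 'cat']

# Each earlier sequence is a prefix of every later one.
for shorter, longer in zip(sequences, sequences[1:]):
    assert longer.startswith(shorter)
```

Here "c" is the first sequence, "ca" the second, and "cat" the third, and each one is a prefix of the sequences that follow it.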
Such letter sequences can be encoded as numerical values and stored in a computer memory. The most prevalent encoding system is Unicode, which assigns a unique number to every character. The Unicode system readily facilitates text entry in many Western-style languages in which words are formed by sequential entry of characters. Typically, the entry of the letters in a Unicode input sequence corresponds to the phonetic sound of the word. However, in certain languages, there are characters that, when entered, result in a particular sequence that does not conform to the writing sequence that a native speaker would use when writing the characters on paper. Indic languages, for example, include consonants, vowels, and combinations of them whose Unicode input sequences do not conform to their writing sequences. Thus, users who are fluent in such languages often must input Unicode characters in a sequence that is different from the one they would use when writing without the aid of a computer device, e.g., when using pen and paper.
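As a small illustration of encoding a letter sequence as numerical values, the following sketch lists the Unicode code point of each character in a string. The helper name `code_points` is hypothetical; the built-in `ord` is standard Python.

```python
def code_points(text):
    """Return the Unicode code point of each character, formatted as U+XXXX.

    Hypothetical helper for illustration only.
    """
    return [f"U+{ord(ch):04X}" for ch in text]


# For an English word, the stored order matches the order in which
# the letters are written and entered.
print(code_points("cat"))  # ['U+0063', 'U+0061', 'U+0074']
```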
For example, Indic characters include symbols representing consonants, vowels, and dependent markers that also represent vowels and other variations. A common combination is a consonant and a vowel. In an Indic language, rather than writing a character for the consonant immediately followed by a character for the vowel, a writer writes a single character that combines the consonant with a dependent vowel marker representing the vowel. There is, however, no Unicode input sequence that corresponds to how a fluent writer would actually write the single character that is the combination of the consonant and the dependent vowel marker. Thus, when using conventional text input software, the fluent writer typically must input, for certain characters, a Unicode sequence that corresponds to a sequence of characters that the fluent writer would not otherwise use when writing.
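One concrete instance of this mismatch can be sketched as follows, assuming Devanagari as a representative Indic script (the text above does not name a specific script). The consonant KA combined with the dependent vowel sign I is stored with the consonant first, although the vowel sign is rendered to the left of the consonant. The variable names are hypothetical; `unicodedata` is part of the Python standard library.

```python
import unicodedata

# Assumed example characters (Devanagari); not taken from the text above.
KA = "\u0915"            # DEVANAGARI LETTER KA (a consonant)
VOWEL_SIGN_I = "\u093F"  # DEVANAGARI VOWEL SIGN I (a dependent vowel marker)

# Unicode stores the syllable as consonant followed by vowel sign,
# even though the sign is displayed to the left of the consonant.
syllable = KA + VOWEL_SIGN_I

for ch in syllable:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))
```

The printed order (consonant, then vowel sign) is the stored Unicode order, while the rendered syllable places the vowel sign visually before the consonant, so the stored order and the visual order differ.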