Nowadays, the operating systems of computers support multiple languages. Typically, a large set of languages is installed as part of the Operating System (OS) installation, and additional languages may be installed as required. With multiple languages installed on a computer, a user can compose documents that contain more than one language. Recipients of these documents must have the same languages installed on their computers to read or edit the documents. Many messaging systems, such as instant messaging (IM) and email applications, also support multiple languages. Examples are IBM® Sametime® and IBM Lotus® Notes®.
The ease with which character data is input into computer systems goes largely unnoticed by today's software users and, for that matter, by most software developers as well. The task of inputting characters is trivial for many scripts that have a small number of alphabetic characters, as is the case with the Latin script. When a script has a small number of characters, each character can be directly assigned to an individual key on a keyboard. To input a character one simply depresses the appropriate key. This strategy breaks down, however, when scripts, such as Japanese, possess a large number of characters. The challenge of inputting scripts with numerous characters requires that the keyboard be used in a different fashion than most users are accustomed to. The methodology that has been created to input these scripts is called an Input Method Editor (IME).
An IME acts as an intermediary between a software application and a user and allows computer users to enter complex characters and symbols, such as Japanese characters, using a non-Japanese input device. Operating System software typically includes standard IMEs that are based on the most popular input methods used in each target market. These include: Japanese, Korean, Chinese (which is subdivided into Traditional and Simplified), Greek, and Hebrew, as well as other scripts, such as those which use the Arabic or Cyrillic alphabets.
IMEs may simply carry out transliteration, i.e., a mapping from one script system to another. For example, the user enters Latin characters via a Latin character keyboard or other input means, and the IME converts each character entered into a Cyrillic character. However, for more complex writing systems, the composition of text may comprise more steps.
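The character-by-character conversion described above can be sketched as a simple lookup. The table below is a hypothetical and deliberately incomplete illustration, not a standard transliteration scheme; the names `LATIN_TO_CYRILLIC` and `transliterate` are assumptions made for this example.

```python
# Hypothetical, incomplete Latin-to-Cyrillic table (illustration only).
LATIN_TO_CYRILLIC = {
    "a": "а", "b": "б", "d": "д", "e": "е", "k": "к",
    "m": "м", "o": "о", "r": "р", "t": "т", "v": "в",
}


def transliterate(text: str) -> str:
    """Map each entered Latin character to a Cyrillic character.

    Characters without an entry in the table are passed through unchanged.
    """
    return "".join(LATIN_TO_CYRILLIC.get(ch, ch) for ch in text.lower())


if __name__ == "__main__":
    print(transliterate("voda"))
```

A one-to-one table like this only covers the simplest case; practical schemes typically also need multi-character sequences, which is one reason more complex writing systems require additional composition steps.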
It is useful to take a look at one of the more complex writing systems, such as the Japanese writing system, before explaining how a user enters such characters using an IME. The entire Japanese written language comprises more than 50,000 characters, of which about 10,000 are in common use. The complexity of the characters and their large number require some organization to simplify reading and writing. The Japanese writing system is organized into two categories: Kana and Kanji.
Kana is a syllabary, that is, a set of written phonetic symbols, which can be used to represent the pronunciation of Kanji. The Kana syllabary itself is further broken down into two subsets, Katakana and Hiragana, both of which represent the same set of phonetic syllables. The Katakana characters are written in an angular form and are used to represent names and words that come from foreign languages other than Chinese and Korean. The Hiragana characters are written in a cursive form and are used to represent all native Japanese phonemes and words.
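Because Katakana and Hiragana represent the same set of syllables, converting between them is a fixed mapping. The following minimal sketch assumes text encoded in Unicode, where the basic Hiragana letters (U+3041 to U+3096) and the corresponding Katakana letters (U+30A1 to U+30F6) are laid out in parallel at a constant offset; the function name is hypothetical.

```python
# Assumption: Unicode text. Basic Hiragana occupy U+3041..U+3096 and the
# matching Katakana occupy U+30A1..U+30F6, a constant offset of 0x60.
HIRAGANA_START = 0x3041
HIRAGANA_END = 0x3096
KANA_OFFSET = 0x30A1 - 0x3041


def hiragana_to_katakana(text: str) -> str:
    """Replace each Hiragana character with the Katakana for the same syllable."""
    converted = []
    for ch in text:
        if HIRAGANA_START <= ord(ch) <= HIRAGANA_END:
            converted.append(chr(ord(ch) + KANA_OFFSET))
        else:
            converted.append(ch)  # leave non-Hiragana characters untouched
    return "".join(converted)


if __name__ == "__main__":
    print(hiragana_to_katakana("かな"))
```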
Kanji characters are non-phonetic characters that represent ideas or concepts and that originate from Chinese ideographs. Kanji characters are commonly referred to as ideographs and are composed of units known as radicals, together with other, non-radical units. For example, the radical ‘rain’ is used to construct the Kanji character for ‘cloud’. Radicals themselves are constructed from even smaller units, called strokes, which are lines that are drawn in one continuous motion.
Using an IME and a non-Japanese input device, the user composes each Japanese character in one of several ways: by radical, by stroke count, by phonetic representation, or by typing in the Japanese character's numeric encoding index.
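The last of these methods, entry by numeric encoding index, is the simplest to illustrate. The sketch below assumes that the index is a Unicode code point typed in hexadecimal; other encodings use different index values, and the function name is hypothetical.

```python
def char_from_code_point(hex_index: str) -> str:
    """Return the character whose Unicode code point is given in hexadecimal.

    Assumption for this sketch: the "numeric encoding index" is a Unicode
    code point. Raises ValueError if the input is not valid hexadecimal
    or lies outside the Unicode range.
    """
    return chr(int(hex_index, 16))


if __name__ == "__main__":
    print(char_from_code_point("96E8"))  # Kanji for 'rain'
    print(char_from_code_point("96F2"))  # Kanji for 'cloud'
```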
Japanese input devices have Hiragana characters on the keys, and combinations of Hiragana characters are matched to candidate Kanji characters. Because Japanese has many homonyms, a single combination often matches several Kanji, from which the user must choose. In Chinese input devices, the keys represent radicals. A Han character is selected in response to the entry of a plurality of radicals in a particular order. The user of a computer system identifies to the OS of the computer system the language setting of the input device which is to be used with the computer. The OS can then identify the particular characters represented by keys selected by the user.
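Both lookups described above can be sketched with small tables. The entries below are hypothetical and far from complete; a real IME relies on large dictionaries and ranking logic, and the table and function names are assumptions made for this example.

```python
# Hypothetical, tiny conversion tables (illustration only).
READING_TO_KANJI = {
    "かみ": ["紙", "神", "髪"],  # "kami": paper, god, hair
    "はし": ["橋", "箸", "端"],  # "hashi": bridge, chopsticks, edge
}

RADICALS_TO_HAN = {
    ("日", "月"): "明",  # sun + moon
    ("木", "木"): "林",  # tree + tree
}


def kanji_candidates(reading: str) -> list:
    """Return every Kanji candidate for a Hiragana reading (homonyms included)."""
    return list(READING_TO_KANJI.get(reading, []))


def compose_han(radicals) -> "str | None":
    """Return the Han character for an ordered radical sequence, or None."""
    return RADICALS_TO_HAN.get(tuple(radicals))


if __name__ == "__main__":
    print(kanji_candidates("かみ"))   # several homonyms for one reading
    print(compose_han(["日", "月"]))  # the order of the radicals matters
```

The tuple key in `RADICALS_TO_HAN` reflects that the radicals must be entered in a particular order: the same radicals in a different order do not select the same character.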
A problem exists in messaging systems when a user is working in a first language but receives a message in a second, different language. In this case, it makes sense for the user to reply to the sender in the language of the received message. Currently, the user has to change the IME settings manually before composing a reply message in the second language. This takes a number of steps, which makes responding slow and cumbersome for the user.
The present invention aims to address this problem.