Contemporary operating systems, such as Microsoft Corporation's Windows® 2000 or Windows® XP operating system, provide application programs with the ability to work with the Unicode character set, which is a single binary character set for essentially all languages that is consistent world-wide. With Unicode, a character as viewed from the user's perspective has only one value (or unique combination of glyphs' values) that represents it in a computer system, whereby application programs' user interfaces input and output the exact characters that are intended by the user and the application program, independent of any particular language settings. For example, the Latin character capital letter “D” is represented by 0044 hexadecimal, whereas the Greek capital letter “Δ” (delta) is represented by 0394 hexadecimal; each glyph has a unique Unicode value. This is true even for characters that look the same in a given font, e.g., the Latin character capital letter “A” is represented by 0041 hexadecimal, whereas the Greek capital letter “Α” (alpha) is represented by 0391 hexadecimal.
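The values cited above can be checked with a short Python sketch. It is offered only as an illustration; the helper name code_point_hex is hypothetical and not part of any system described here.

```python
import unicodedata

# The four characters named in the text, written as escapes so that the
# look-alike Latin and Greek letters cannot be confused in the source.
LATIN_D = "\u0044"
GREEK_DELTA = "\u0394"
LATIN_A = "\u0041"
GREEK_ALPHA = "\u0391"


def code_point_hex(ch: str) -> str:
    """Return the Unicode value of a single character as 4-digit hex."""
    return f"{ord(ch):04X}"


for ch in (LATIN_D, GREEK_DELTA, LATIN_A, GREEK_ALPHA):
    print(code_point_hex(ch), unicodedata.name(ch))

# Characters that may look identical in a font are still distinct values.
lookalikes_differ = LATIN_A != GREEK_ALPHA
```

The two capital letters that render alike compare as unequal because the comparison is made on their Unicode values, not on their appearance.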
For application programs that are not developed to work with the Unicode character set, referred to herein as script-dependent or codepage-based application programs, contemporary operating systems having natural language support provide a mechanism to convert the character set of the applications to Unicode and back, as set by the administrator of the system. For example, a computer system can be set to use a Latin-based character set (i.e., Latin-based script), Greek, Cyrillic, Arabic, Hebrew and so forth. Depending on the character set that is active, a character having a single value from the perspective of an application program may, when converted, have different values to the operating system and thus, for example, appear differently to the user at different times.
To set the operating system to use a particular character set, a system locale variable is provided that an administrator can set, which the system then uses to determine the current language setting. More particularly, in some operating systems that support natural languages, the current value of the system locale variable determines which (if any) codepage is active, wherein a codepage comprises an internal table that the operating system uses to map symbols (e.g., letters, numerals, punctuation characters, glyphs and so on) to numbers, such as Unicode values. Different codepages thus provide support for the character sets used by different languages, e.g., the codepage identified by the value (Windows®) 932 represents the Japanese (Kanji) character set (and also supports the hiragana and katakana character sets), while Windows® codepage 950 represents one of the Chinese character sets. As a result, the encoding value 95 hexadecimal will be converted to one Unicode character when one codepage is active, for example, and a different Unicode character when a different codepage is active. Note that the codepage is not necessarily per-language, as a given character set may support more than one language, e.g., the same Cyrillic codepage may be used for Russian and Serbian languages, which share the same script.
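The dependence of the converted value on the active codepage can be sketched in Python. This is an illustration under stated assumptions, not the operating system's mechanism: it uses Python's built-in codecs cp1252, cp1251 and cp1253 (Western European, Cyrillic and Greek) as stand-ins for codepage tables, and it uses the byte C4 hexadecimal rather than the value 95 hexadecimal mentioned above, because C4 is a complete single-byte character in all three of these tables. The function name decode_under is hypothetical.

```python
# One and the same encoding value, as a codepage-based program would store it.
RAW = b"\xC4"

# Stand-ins for the "active codepage"; only one would be active at a time.
CODEPAGES = ("cp1252", "cp1251", "cp1253")


def decode_under(codepage: str, raw: bytes) -> str:
    """Convert codepage-encoded bytes to Unicode using the given table."""
    return raw.decode(codepage)


results = {cp: decode_under(cp, RAW) for cp in CODEPAGES}

for cp, ch in results.items():
    print(f"{cp}: byte C4 -> U+{ord(ch):04X} {ch}")
```

Under these three tables the single byte maps to U+00C4, U+0414 and U+0394 respectively, which is the behavior the paragraph describes: the program's value is fixed, while the resulting Unicode character depends on which table is consulted.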
In general, the conversion to and from Unicode is thus accomplished on a system-wide basis via the operating system, which converts non-Unicode (script-dependent) characters to Unicode characters and vice-versa via the codepage table that is currently active, as determined by the system locale variable. In this manner, the system locale setting on a given machine enables programs that do not support Unicode to display menus and dialogs in their native language, provided that the necessary codepages and fonts are installed.
However, while such operating systems thus support different languages, there are a number of problems with this present language support mechanism. In particular, non-Unicode programs designed for one system locale setting (and its corresponding codepage) will not work as intended with another system locale setting (and its corresponding codepage, which is different). For example, with an incorrect system locale setting, such a non-Unicode, script-dependent program will display characters that are meaningless (incomprehensible) relative to its intended language.
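The mismatch described above can be reproduced in a few lines of Python. The sketch assumes a program written for a Cyrillic codepage (Python codec cp1251) running while a Western European codepage (cp1252) is active; the sample word and variable names are chosen only for illustration.

```python
# Text the program intends to show, and the codepage it was written for.
INTENDED = "Привет"            # Russian for "hello"
PROGRAM_CODEPAGE = "cp1251"    # Cyrillic
ACTIVE_CODEPAGE = "cp1252"     # Western European (the mismatched setting)

# The program stores its strings as codepage-encoded bytes.
stored_bytes = INTENDED.encode(PROGRAM_CODEPAGE)

# Conversion with the matching table recovers the intended text...
with_matching_table = stored_bytes.decode(PROGRAM_CODEPAGE)

# ...while conversion with the mismatched table yields unrelated characters.
with_mismatched_table = stored_bytes.decode(ACTIVE_CODEPAGE)

print(with_matching_table)
print(with_mismatched_table)
```

No error is raised in the mismatched case; every byte maps to some character, so the program simply shows text that is meaningless relative to its intended language.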
In many situations, the problem of having a mismatched active codepage for a given application program cannot be solved simply by changing the system locale setting. For one, in contemporary operating systems, not every user is authorized and/or capable of changing the system locale, as this requires administrator privileges and a certain level of familiarity with the operating system. Most corporations, educational institutions and other entities do not let users operate their systems as administrators. In such situations, an administrator must be found to change the system locale for a user, even if the user only wants the setting changed temporarily. Even with an administrator-level user or an administrator conveniently present, changing the system locale requires a reboot, which is inconvenient at best, and can also lead to lost data and other problems.
Another significant problem is that the system allows only one system locale to be set at a time. As a result, except for characters in the ASCII range (the lowest 128 values, 0 to 7F hexadecimal), which the codepages share, a user cannot properly run two programs that use different languages at the same time, e.g., the system can have the Japanese codepage active, or the Russian codepage active, but not both codepages active at the same time. Thus, one of these two programs will not function properly, depending on which codepage is active.
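The ASCII exception noted above can be illustrated with a brief Python sketch. It assumes Python's cp1251, cp1252 and cp1253 codecs as representative single-byte codepages; other codepages are not checked here.

```python
CODEPAGES = ("cp1251", "cp1252", "cp1253")

# Every value in the lowest 128 positions (0 to 7F hexadecimal).
low_bytes = bytes(range(0x80))
ascii_text = low_bytes.decode("ascii")

# The low range converts identically no matter which table is used.
shared_low_range = all(low_bytes.decode(cp) == ascii_text for cp in CODEPAGES)

# Above 7F hexadecimal the tables diverge, so only one program can be right.
high_byte = b"\xE1"
high_decodings = {cp: high_byte.decode(cp) for cp in CODEPAGES}

print("low range shared:", shared_low_range)
for cp, ch in high_decodings.items():
    print(f"{cp}: byte E1 -> U+{ord(ch):04X}")
```

This is why two programs that use only ASCII text can coexist under any of these settings, while programs that rely on values above 7F hexadecimal each need their own codepage to be the active one.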