Systems such as the Apple® Macintosh® or Microsoft® Windows™ overload the meaning of their character codes. For example, the code 0x53 might represent the Roman character `S` or the Greek character `Σ` depending upon the context. This overloading of character codes creates problems when clients need to edit text with multiple scripts and symbols. A script is a collection of character codes whose meanings are semantically related. Usually a language maps into a single script. For example, encodings for Roman, Greek and Hebrew are all separate scripts.
Almost all commercial systems overload the semantics of their character encoding. Some systems do so with code page architectures; others do it simply by changing a font, which is similar to specifying a code page. Code pages are common to IBM systems. The meaning of a character code depends upon the font or code page currently in use. Most systems cannot handle text from multiple scripts at all, and many of those that can are limited to at most two scripts. Most systems also cannot detect, when a character is entered, whether that character exists in the current font. Thus, someone might think that they were entering a `Σ`, but the character might display as an `S` depending upon the current font. This is because the system cannot determine a character's semantics from the character code alone. Some systems try to deal with the problem at the user interface level by associating a keyboard with a specific font. However, this type of solution does not work when characters are entered programmatically without the use of a keyboard.
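The overloading described above can be illustrated with a minimal sketch. The table names `CODE_TABLES`, `decode` and `possible_meanings` are hypothetical and are not taken from any of the systems mentioned; the two entries simply mirror the 0x53 example, assuming one Roman font and one symbol font.

```python
# Hypothetical per-font tables: the same code has a different meaning in each.
CODE_TABLES = {
    "roman": {0x53: "S"},
    "symbol": {0x53: "\u03A3"},  # Greek capital sigma
}


def decode(code, font):
    """Interpret a character code in the context of the given font."""
    return CODE_TABLES[font][code]


def possible_meanings(code):
    """Without a font, a code can only be narrowed to a set of candidates."""
    return {table[code] for table in CODE_TABLES.values() if code in table}


print(decode(0x53, "roman"))    # S
print(decode(0x53, "symbol"))   # Σ
print(sorted(possible_meanings(0x53)))
```

The last call returns two candidates for a single code, which is the ambiguity at issue: the code by itself does not say which character was meant.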
There is another problem caused by the overloading of character codes. When a client tries to apply a font style to a string of characters from different scripts, the characters may become garbled. For example, text might contain a simple equation like "Σ n/2". A user might want to change the font of the text to something a little bolder looking, like a Chicago font. The user would select the equation, change the font, and the result would look like "S n/2". The problem is even worse when the selection contains characters from many different scripts.
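The garbling can be reproduced in a small sketch, assuming text is stored as runs of (font, codes) and that each font has its own code table. The font names, the `FONT_TABLES` contents and the helper functions are illustrative assumptions only.

```python
# Hypothetical code tables; only the codes needed for the equation are listed.
ROMAN_CODES = {0x53: "S", 0x20: " ", 0x6E: "n", 0x2F: "/", 0x32: "2"}
FONT_TABLES = {
    "symbol": {0x53: "\u03A3", 0x20: " "},  # 0x53 is sigma in this font
    "times": ROMAN_CODES,
    "chicago": ROMAN_CODES,
}


def render(runs):
    """Render styled runs; each run is (font name, list of character codes)."""
    return "".join(
        FONT_TABLES[font].get(code, "?") for font, codes in runs for code in codes
    )


def set_font(runs, font):
    """Apply one font to the whole selection, as a user's font change would."""
    return [(font, codes) for _, codes in runs]


equation = [("symbol", [0x53]), ("times", list(b" n/2"))]
print(render(equation))                       # Σ n/2
print(render(set_font(equation, "chicago")))  # S n/2
```

The stored codes never change; only the font attached to them does, and that is enough to turn the sigma into an `S`.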
Both of these problems are inherent to systems that overload character codes. Systems that are based on a universal encoding, where all characters have unique codes, have the opportunity to work around these problems. However, to date, these systems have not effectively dealt with the problem. Systems that use a universal character encoding, such as a system produced by Xerox, generally indicate only that a character is missing from the current font. For example, the sigma would display as a missing glyph rather than being mapped to some random glyph. A glyph is a visual representation of a character code. Prior art code page based systems would map the character to a random glyph.
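For contrast, the missing-glyph behavior of a universal-encoding system might be sketched as follows. This is an assumption-laden illustration, not a description of the Xerox system: `FONT_COVERAGE`, `MISSING_GLYPH` and `display` are hypothetical names, and the coverage sets are invented for the example.

```python
MISSING_GLYPH = "\u25A1"  # white square, used here as a stand-in for "no glyph"

# Hypothetical coverage: which uniquely coded characters each font can draw.
FONT_COVERAGE = {
    "chicago": set("S n/2"),
    "symbol": {"\u03A3", " "},
}


def display(text, font):
    """Show each character, or a missing-glyph marker if the font lacks it."""
    coverage = FONT_COVERAGE[font]
    return "".join(ch if ch in coverage else MISSING_GLYPH for ch in text)


print(display("\u03A3 n/2", "chicago"))  # □ n/2
```

Because the sigma has its own code, the system can at least tell that the chosen font cannot draw it, but the reader still does not see the intended character.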