In a word processing document, each letter of the alphabet (or each symbol of an ideographic system, such as Japanese Kanji) is mapped to a different numeric value when stored in a file. The specific one-to-one mapping employed is referred to as a code page. The Roman alphabet and the common symbols associated with it comprise fewer than 200 different characters. Each of these characters can therefore be represented by a single byte, which holds an integer ranging from 0 through 255. Although there are accepted standards that address the mapping between these integers and the characters, the standards are not universally followed. For example, over 100 symbols are assigned to different numbers in the two code pages respectively used by the MICROSOFT™ WINDOWS™ operating system and by the SYSTEM 7™ operating system used by APPLE™ MACINTOSH™ computers.
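As a rough illustration of how two code pages can assign the same byte value to different characters, the following Python sketch compares two codecs from the standard library. It assumes that Python's `cp1252` and `mac_roman` codecs are reasonable stand-ins for the two code pages mentioned above; the function name `differing_bytes` is hypothetical and used only for this example.

```python
def differing_bytes(page_a="cp1252", page_b="mac_roman"):
    """Return the byte values that the two code pages map to different characters."""
    diffs = []
    for value in range(256):
        raw = bytes([value])
        # errors="replace" yields U+FFFD for byte values a code page leaves undefined.
        char_a = raw.decode(page_a, errors="replace")
        char_b = raw.decode(page_b, errors="replace")
        if char_a != char_b:
            diffs.append(value)
    return diffs


# The same stored byte is read as a different character under each code page.
sample = bytes([0xE9])
windows_char = sample.decode("cp1252")     # é
mac_char = sample.decode("mac_roman")      # È

diffs = differing_bytes()
print(f"{len(diffs)} byte values differ between the two code pages")
```

Both codecs agree on the lower half of the byte range (the ASCII characters), so every disagreement found by this sketch lies in the range 128 through 255.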
When text is pasted into a document, it is relatively simple to translate text encoded using the code page of another (foreign) system into the code page of the native word processing system, so that the imported text can be displayed and edited. However, some of the symbols in the foreign code page may have no corresponding symbol in the native code page. Between the WINDOWS™ and MACINTOSH™ operating systems, there are about 15 characters that have no code page equivalent.
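The translation step, and the characters that fall outside it, can be sketched as follows. This is a minimal example that again assumes Python's `mac_roman` and `cp1252` codecs approximate the foreign and native code pages; the helper names `translate` and `untranslatable` are hypothetical. The number of characters it reports depends on the codec tables Python ships and need not match the figure quoted above.

```python
def untranslatable(foreign="mac_roman", native="cp1252"):
    """List characters of the foreign code page that the native code page cannot encode."""
    missing = []
    for value in range(256):
        try:
            char = bytes([value]).decode(foreign)
        except UnicodeDecodeError:
            continue  # byte value is undefined in the foreign code page
        try:
            char.encode(native)
        except UnicodeEncodeError:
            missing.append(char)
    return missing


def translate(data, foreign="mac_roman", native="cp1252"):
    """Re-encode foreign bytes in the native code page.

    Raises UnicodeEncodeError if a character has no native equivalent.
    """
    return data.decode(foreign).encode(native)


mac_to_windows = untranslatable("mac_roman", "cp1252")
windows_to_mac = untranslatable("cp1252", "mac_roman")
print("No WINDOWS equivalent:", "".join(mac_to_windows))
print("No MACINTOSH equivalent:", "".join(windows_to_mac))
```

For a character present in both code pages, `translate` only changes the byte value; for example, the byte for "é" in `mac_roman` (0x8E) becomes the `cp1252` byte 0xE9.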
If text containing any character that has no code page equivalent is pasted into a document by a word processing system, that character will either not be displayed or will be replaced by an unintelligible character. If the document's text is then encoded using the native code page and saved, characters that have no equivalent representation in the native code page will be lost or changed. Subsequently, if the document that includes the imported text is exported back to the operating system on which the imported text was created, the document will not include the characters that were not properly translated. Conventional word processing systems do not employ any method for handling imported text that ensures the foreign code page representation of characters is correctly retained. As a result, the prior art is unable to properly move text encoded with different code pages back and forth between word processing systems running on different types of computers.
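The loss described above can be reproduced with a short round trip. This sketch assumes Python's `mac_roman` and `cp1252` codecs as the foreign and native code pages, and uses `errors="replace"` as one possible stand-in for a system that substitutes a placeholder for characters it cannot represent; the function name `round_trip` is hypothetical.

```python
def round_trip(original, foreign="mac_roman", native="cp1252"):
    """Import foreign bytes, save them in the native code page, then export them back."""
    text = original.decode(foreign)
    # Saving in the native code page: unmappable characters become b"?".
    stored = text.encode(native, errors="replace")
    # Exporting back to the foreign system: the placeholder is all that remains.
    return stored.decode(native).encode(foreign, errors="replace")


# "≠" exists in mac_roman but has no cp1252 equivalent.
original = "a \u2260 b".encode("mac_roman")
returned = round_trip(original)

print(original)              # b'a \xad b'
print(returned)              # b'a ? b'
print(returned == original)  # False
```

Text made up only of characters shared by both code pages survives the same round trip unchanged, which is why the problem is easy to overlook until one of the unmatched characters appears.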