1. Technical Field
The present invention relates generally to a computer-implemented method, data processing system, and computer program product for converting and rendering text. More specifically, the present invention relates to detecting the encoding of text, converting the text from at least one encoding to another, and then rendering the text as characters readable by users.
2. Description of the Related Art
A character is a written element of a language. Many characters correspond to specific sounds in a language. A character can be, for example, a letter, a punctuation mark, a digit, or a mathematical symbol. A code set, or coded character set, is a set of rules that defines a character set and the one-to-one relationship between each character and its bit pattern. A code set thus defines the bit patterns that a data processing system uses to identify characters. Examples of code sets are ISO-8859-1, UTF-8, UTF-16, UTF-32, GB 18030, and Big5.
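The relationship between a character and its bit pattern can be illustrated with a short sketch. The following Python fragment is illustrative only: the helper name bit_patterns, the sample character, and the use of Python's built-in codecs as stand-ins for the code sets named above are assumptions made for this example and are not part of any particular system.

```python
# Illustrative sketch: show that one character has a different bit pattern
# (byte sequence) under each code set. The helper name and the sample
# character are assumptions chosen for this example.

def bit_patterns(ch, code_sets):
    """Return a dict mapping each code set name to the hex bytes for ch."""
    patterns = {}
    for name in code_sets:
        try:
            patterns[name] = ch.encode(name).hex()
        except UnicodeEncodeError:
            # The character is not a member of this code set's character set.
            patterns[name] = None
    return patterns


CODE_SETS = ["iso-8859-1", "utf-8", "utf-16-be", "utf-32-be", "gb18030", "big5"]

# U+4E2D, a common Chinese character, used here as the sample.
result = bit_patterns("\u4e2d", CODE_SETS)

for code_set, pattern in result.items():
    print(f"{code_set:12s} {pattern}")
```

In this sketch the same character yields a different byte sequence under each code set that contains it, and no byte sequence at all under ISO-8859-1, whose character set does not include the character.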
Users can use a conventional computer to read the names and contents of files and directories. However, because the names and contents can be created and encoded by different users under different code set environments, the names and contents may be encoded in two or more code sets. Conventional file system navigational tools, file content viewers, and editors are designed to apply a single code set, that is, a single mapping from the native bits of the file to the pixel representation of each character. As a consequence, conventional file system navigational tools render correctly on a screen only those names of files and directories that are encoded in that single code set, and file content viewers and editors render correctly on a screen only those contents that are encoded in that single code set.
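The limitation described above can be reproduced with a minimal sketch. In the following Python fragment, the function name render_name, the sample file name, and the choice of UTF-8 as the tool's single code set are hypothetical and are chosen only to illustrate the problem; the fragment does not model any particular file system tool.

```python
# Illustrative sketch: a tool that assumes one code set for every file name.
# render_name, the sample name, and the chosen code sets are assumptions.

def render_name(raw: bytes, tool_code_set: str) -> str:
    """Decode raw file-name bytes the way a single-code-set tool would.

    errors="replace" substitutes U+FFFD for bytes that are invalid in the
    tool's code set, similar to a placeholder glyph shown on screen.
    """
    return raw.decode(tool_code_set, errors="replace")


ORIGINAL = "\u4e2d\u6587.txt"  # sample name: two Chinese characters plus ".txt"

# The same name as stored by users working under three different code sets.
stored = {
    "utf-8": ORIGINAL.encode("utf-8"),
    "big5": ORIGINAL.encode("big5"),
    "gb18030": ORIGINAL.encode("gb18030"),
}

TOOL_CODE_SET = "utf-8"  # the single code set the hypothetical tool applies

shown = {src: render_name(raw, TOOL_CODE_SET) for src, raw in stored.items()}

for src, text in shown.items():
    status = "correct" if text == ORIGINAL else "garbled"
    print(f"created as {src:8s} rendered as {text!r} ({status})")
```

Only the name whose bytes were produced under the tool's own code set is rendered as the user wrote it; the other two names are rendered with replacement characters even though their bytes are valid in the code sets under which they were created.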