1. Field of the Invention
The present invention relates generally to character string processing apparatuses, character string processing methods, and image-forming apparatuses, and more particularly to a character string processing apparatus, a character string processing method, and an image-forming apparatus that convert a character string encoded by an encoding method (a character code set) to a character string encoded by another encoding method.
2. Description of the Related Art
Encoding methods assign a character code to each character or sign (hereinafter referred to simply as characters) so that the characters can be handled on a computer. Normally, different encoding methods are used depending on the language or the computer system. The Internet, for instance, commonly employs UTF-8 or UTF-16, which are encoding forms of Unicode, as a standard encoding method so as to support the world's major languages. Further, character string processing apparatuses and image-forming apparatuses employ an encoding method such as Shift-JIS or Latin1.
The character string processing apparatuses, which normally can use a plurality of encoding methods, convert a character string encoded by one encoding method to a character string encoded by another encoding method as required. The image-forming apparatuses are provided, in their user interface, with only the small number of encoding methods that are necessary and sufficient for the language of their purchaser, so as to save the capacity of a font ROM.
Conventionally, a character string processing apparatus or an image-forming apparatus connected to a network such as the Internet, when receiving a request including a character string represented in, for instance, Unicode (such as a request to change a document name) from the network side, converts the character string to a character string encoded by an encoding method used in internal processing.
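By way of illustration only, the conventional conversion described above can be sketched as follows. The sketch assumes that the request arrives as UTF-8 bytes and that Shift-JIS is the encoding method used in internal processing; the function name `to_internal` and the sample document name are hypothetical and do not appear in the cited art.

```python
def to_internal(request_bytes: bytes, internal_encoding: str = "shift_jis") -> bytes:
    """Decode a UTF-8 request payload and re-encode it for internal processing.

    Assumption: the payload is valid UTF-8 and every character in it is
    representable in the internal encoding; otherwise encode() raises
    UnicodeEncodeError.
    """
    text = request_bytes.decode("utf-8")
    return text.encode(internal_encoding)


# A hypothetical document name received from the network side.
request = "報告書".encode("utf-8")
internal = to_internal(request)
```

In this sketch the three characters occupy nine bytes in UTF-8 and six bytes in Shift-JIS, and the conversion succeeds only because all three characters exist in both character sets.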
Japanese Translation of PCT International Application No. 11-512543 discloses a technique for converting a character string encoded by one encoding method to a character string encoded by another encoding method.
Normally, the character string processing apparatuses and the image-forming apparatuses can use a plurality of encoding methods, and accordingly, are required to select an encoding method to use.
Representable character sets, however, differ among encoding methods. Therefore, a character that is representable by an encoding method before conversion is not necessarily representable by another encoding method after the conversion. Accordingly, there is a problem in that it may not be possible to convert all character strings completely, depending on the combination of the encoding methods before and after the conversion.
For instance, a character set representable by an encoding method employed in the internal processing of a character string processing apparatus or an image-forming apparatus, such as Shift-JIS or Latin1, does not necessarily include all of the characters of a character set representable in Unicode. Accordingly, even if a character is representable in Unicode, the character is not necessarily representable by an encoding method employed in internal processing.
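The difference in representable character sets can be observed with a minimal sketch such as the following, which lists the characters of a Unicode string that a given encoding method cannot represent. The helper name `unrepresentable` and the sample string are assumptions made for illustration; the codec names are those of the Python standard library.

```python
from __future__ import annotations


def unrepresentable(text: str, encoding: str) -> list[str]:
    """Return the characters of `text` that `encoding` cannot represent."""
    missing = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            missing.append(ch)
    return missing


sample = "Café 文書"
missing_latin1 = unrepresentable(sample, "latin-1")    # the two kanji
missing_sjis = unrepresentable(sample, "shift_jis")    # the accented letter
missing_utf8 = unrepresentable(sample, "utf-8")        # nothing is missing
```

Here Latin1 lacks the two kanji, Shift-JIS lacks the accented letter, and UTF-8 represents the whole string, so no single one of the two internal encoding methods converts the string completely.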
Thus, the selection of an encoding method is important to the conventional character string processing apparatuses and image-forming apparatuses because the number of inconvertible characters differs depending on which encoding method is selected as the target when converting a character string encoded by another encoding method. Further, to the conventional character string processing apparatuses and image-forming apparatuses, the handling of characters that have failed to be converted is also important in the case of, for instance, collating the converted character string. Furthermore, some encoding methods assign different character codes to a single character, and the handling of such an exceptional character is also important.
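The two concerns raised above, namely the number of inconvertible characters per encoding method and the handling of characters that fail to convert, can be illustrated by the following sketch. It is not the method of any cited reference: choosing the candidate with the fewest failures and substituting a replacement character are assumptions made for illustration, and the names `count_inconvertible`, `pick_encoding`, and `convert_with_marker` are hypothetical.

```python
def count_inconvertible(text: str, encoding: str) -> int:
    """Count the characters of `text` that `encoding` cannot represent."""
    count = 0
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            count += 1
    return count


def pick_encoding(text: str, candidates: list) -> str:
    """Pick the candidate with the fewest inconvertible characters.

    Ties go to the earliest candidate in the list.
    """
    return min(candidates, key=lambda enc: count_inconvertible(text, enc))


def convert_with_marker(text: str, encoding: str) -> bytes:
    """Convert `text`, substituting '?' for each inconvertible character."""
    return text.encode(encoding, errors="replace")


sample = "Café 文書"
failures = {enc: count_inconvertible(sample, enc) for enc in ("latin-1", "shift_jis")}
chosen = pick_encoding(sample, ["latin-1", "shift_jis"])
marked = convert_with_marker("Café", "shift_jis")
```

For this sample, Latin1 fails on two characters and Shift-JIS on one, so the sketch selects Shift-JIS. Note that a substituted `?` is indistinguishable from a genuine question mark in the converted string, which is one reason why collating converted character strings requires care.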