A recent information apparatus is capable of accessing a removable medium without going through a personal computer (hereinafter abbreviated as “PC”). For example, multifunction peripherals (MFPs) having a port for accessing files in a removable medium and a function of inputting and outputting images have appeared on the market.
Each of the files is stored in the removable medium, always with a file name assigned thereto. The file name consists of a plurality of characters. Each of the characters is represented by a character code of one or more bytes.
In general, the character code system used for UI (user interface) display or within an information apparatus (hereinafter referred to as “the internal code system”) is selected from the character code systems widely used in each country. In Japan, for example, SJIS (Shift JIS code) or EUC (Extended UNIX (registered trademark) Code) is generally used.
On the other hand, file names and other externally distributed codes are preferably handled in the same way anywhere in the world for the sake of compatibility, and therefore Unicode or ASCII code is generally used as an external code system. For example, the Microsoft FAT file system, which is used for handling files within a removable medium, also employs Unicode for long file names. For this reason, files on a FAT file system can basically be accessed properly in any country.
Thus, in general, an information apparatus selectively uses an external code system and an internal code system, and hence it is required to have a mutual conversion function for converting between the two character code systems so as to properly access a file within a removable medium. To support a language whose character code system cannot be converted by a fixed mechanical rule, this mutual conversion function requires a huge character code conversion table. Further, the information apparatus is required to set forbidden characters and interpret characters according to each internal code system.
Now, basic character code conversion processing will be described with reference to FIG. 14. FIG. 14 exemplifies a case where two characters “A (half size)” and “O (full size)” are transferred between an MFP main unit 101 and a removable medium 105. Further, in FIG. 14, it is assumed that SJIS is used within the MFP main unit 101 as an internal code system, and Unicode (UTF-16) is used for the removable medium 105 as an external code system.
The character “A (half size)” on the removable medium 105 is represented by the 2 bytes “0x0041”, as denoted by reference numeral 107. In order to use this character code in the MFP main unit 101, it is required to perform external code-to-internal code conversion (see reference numeral 102). In this case, the MFP main unit 101 refers to an internal code/external code conversion table 103 to thereby convert the Unicode (UTF-16) external code “0x0041”, denoted by reference numeral 107, into the SJIS internal code “0x41”, denoted by reference numeral 106.
Similarly, the MFP main unit 101 refers to the internal code/external code conversion table 103 to thereby convert a Unicode character “O (full size): 0xFF2F” into an SJIS character “O (full size): 0x826E” (see reference numerals 109 and 108).
Further, the conversion between the internal code system and the external code system is required to be performed in a reversible manner, and in the case of writing a file processed in the MFP main unit 101 into the removable medium 105, the MFP main unit 101 converts internal codes representative of the name of the file into external codes representative of the same (see reference numeral 104).
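The two-way conversion described above can be illustrated with a minimal sketch. The table below is a hypothetical two-entry excerpt containing only the two characters of FIG. 14; the names `EXTERNAL_TO_INTERNAL`, `to_internal`, and `to_external` are assumptions made for illustration and do not appear in the original. The result is cross-checked against Python's built-in Shift JIS codec.

```python
# Hypothetical two-entry excerpt of an internal code/external code
# conversion table. Keys are Unicode (UTF-16) code units; values are
# SJIS codes. A real table would hold thousands of entries.
EXTERNAL_TO_INTERNAL = {
    0x0041: 0x41,    # "A" (half size)
    0xFF2F: 0x826E,  # "O" (full size)
}

# The inverse mapping supports the internal-to-external direction,
# which is what makes the conversion reversible.
INTERNAL_TO_EXTERNAL = {sjis: uni for uni, sjis in EXTERNAL_TO_INTERNAL.items()}


def to_internal(units):
    """Convert a list of UTF-16 code units into a list of SJIS codes."""
    return [EXTERNAL_TO_INTERNAL[u] for u in units]


def to_external(codes):
    """Convert a list of SJIS codes back into UTF-16 code units."""
    return [INTERNAL_TO_EXTERNAL[c] for c in codes]


external_name = [0x0041, 0xFF2F]            # as read from the removable medium
internal_name = to_internal(external_name)  # as used inside the apparatus
round_trip = to_external(internal_name)     # as written back to the medium

# Cross-check against Python's built-in Shift JIS codec.
codec_bytes = "A\uFF2F".encode("shift_jis")

print([hex(c) for c in internal_name])
print(round_trip == external_name)
print(codec_bytes.hex())
```

A dictionary keeps the sketch short; it says nothing about how an actual apparatus would lay out its table in memory.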
Next, problems that arise when the above-described character code conversion processing is not executed will be described. For example, “0x0041” representative of “A (half size)” contains the byte “0x00”, and hence a character string including this character comes to include “0x00”. In the handling of character strings in general information processing, “0x00” is used as a termination code. Therefore, a character string including this code may be cut off at an intermediate point.
On the other hand, in the case of “0xFF2F” representative of the character “O (full size)”, it includes “0x2F” as a second byte. In SJIS, “0x2F” in one byte represents “/”, and hence when a character string including this character code is used as a file name, the file name can be erroneously recognized as a different file name. To have a character string broken or erroneously recognized means that it is impossible for the information apparatus to properly access a file based on a file name.
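Both failure modes can be reproduced with a short sketch. It assumes that the UTF-16 bytes are stored in big-endian order, matching the notation “0x0041” and “0xFF2F” used above, and it places “O (full size)” before “A (half size)” so that the termination code falls in the middle of the string. The helper names `c_string_view` and `path_components` are hypothetical.

```python
def c_string_view(data: bytes) -> bytes:
    """Return the bytes that a 0x00-terminated string routine would see."""
    return data.split(b"\x00", 1)[0]


def path_components(data: bytes) -> list:
    """Split on 0x2F ("/") the way a byte-oriented path parser would."""
    return data.split(b"/")


name = "\uFF2FA"  # "O" (full size) followed by "A" (half size)

# Unconverted external codes, assumed big-endian: FF 2F 00 41
raw = name.encode("utf-16-be")

# The 0x00 inside "0x0041" ends the string before the "A" is reached.
truncated = c_string_view(raw)

# The 0x2F inside "0xFF2F" is taken for a path separator.
parts = path_components(raw)

# After conversion to SJIS (82 6E 41), neither byte value occurs.
converted = name.encode("shift_jis")

print(raw.hex(), truncated.hex(), parts, converted.hex())
```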
Therefore, particularly for properly handling multi-byte character strings in an information apparatus, execution of the character code conversion processing and the use of the character code conversion table are indispensable. However, the internal code system differs from language to language, and hence character code conversion tables are needed which correspond in number to the number of languages to be supported by the information apparatus.
Among information apparatuses, PCs, which have abundant memory resources available, are capable of storing in advance, as part of an OS (Operating System), character code conversion tables for each of various languages. However, it is practically impossible for MFPs, which have scarce memory resources available, to store character code conversion tables for each of various languages.
To cope with this problem, a technique disclosed in PTL (Patent Literature) 1 mentioned below has been developed. According to PTL 1, character codes extracted from input data are converted into character codes of another character code system by referring to a code conversion table, and are then output.
According to the invention disclosed in PTL 1, in the case of handling an undefined character code, i.e. a character code that cannot be converted using the code conversion table, the original character code is converted into a substitute code and then output, while the original character code and its output position are stored in a table. Thus, the invention disclosed in PTL 1 eliminates the need to prepare code conversion tables associated with all languages (character code systems), thereby contributing to resource saving.
However, in the method used in PTL 1, as the number of undefined character codes increases, the substitute code is used more frequently, which increases the probability that file names are duplicated. This can eventually cause files to appear that cannot be properly accessed.
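The duplication problem can be seen in a minimal sketch of substitute-code conversion. This is not the implementation of PTL 1; it is an illustration under stated assumptions: the conversion table is assumed to cover only ASCII, the substitute code is assumed to be “_”, and the function name `convert_with_substitute` is hypothetical.

```python
SUBSTITUTE = "_"  # assumed substitute code


def convert_with_substitute(name: str):
    """Replace characters missing from the (assumed ASCII-only) table.

    Returns the converted name and a side table of (position, original
    character) pairs, loosely following the description given above.
    """
    out = []
    side_table = []
    for pos, ch in enumerate(name):
        if ch.isascii():
            out.append(ch)  # defined in the table: keep as is
        else:
            out.append(SUBSTITUTE)  # undefined: emit the substitute code
            side_table.append((pos, ch))
    return "".join(out), side_table


# Two distinct file names made of characters outside the assumed table.
name_a, table_a = convert_with_substitute("\u5199\u771f.jpg")
name_b, table_b = convert_with_substitute("\u5730\u56f3.jpg")

print(name_a, name_b, name_a == name_b)
```

The side tables still differ, but the converted names are identical, so a lookup by converted name alone can no longer tell the two files apart.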