When character code conversion is performed on compressed data, the conversion is generally performed in two passes: expansion processing first, and then character code conversion processing (for example, refer to Japanese Laid-open Patent Publication No. 2003-30030). This process requires a storage area for holding the result of the expansion processing.
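The two-pass flow described above can be sketched as follows. This is a minimal illustration only, assuming Python's standard `zlib` module as the expansion step and Shift_JIS to UTF-8 as the conversion; the function name `convert_two_pass` and the sample text are hypothetical and do not come from the cited publication.

```python
import zlib


def convert_two_pass(compressed: bytes, src_encoding: str, dst_encoding: str) -> bytes:
    """Expand the whole input first, then convert its character code."""
    # Pass 1: expansion processing. The entire expanded result is held
    # in memory before any conversion starts.
    expanded = zlib.decompress(compressed)
    # Pass 2: character code conversion over the entire expanded buffer.
    return expanded.decode(src_encoding).encode(dst_encoding)


# Hypothetical sample data: Shift_JIS text compressed with zlib.
text = "日本語のテキスト"
compressed = zlib.compress(text.encode("shift_jis"))
converted = convert_two_pass(compressed, "shift_jis", "utf-8")
```

The intermediate `expanded` buffer is the storage area referred to above: it must hold the full expansion result before the second pass can begin.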
As a compression and expansion algorithm, ZIP, which uses LZ77, is widely used. In ZIP, a sliding window is used to determine the longest matching character string for a character string to be compressed, and compressed data is generated accordingly. The sliding window is likewise used to determine the longest matching character string for the compressed data to be expanded, and expanded data is generated accordingly. In both cases, the longest matching character string is determined by using the sliding window in units of bytes.
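As an illustration of byte-unit longest matching over a sliding window, the following is a simplified LZ77-style sketch. It is not the actual ZIP (DEFLATE) format: the token representation, the window size, and the names `longest_match`, `lz77_compress`, and `lz77_expand` are assumptions made for this example.

```python
def longest_match(data: bytes, pos: int, window: int, max_len: int):
    """Search the sliding window before pos for the longest byte match."""
    best_off, best_len = 0, 0
    for cand in range(max(0, pos - window), pos):
        length = 0
        while (length < max_len and pos + length < len(data)
               and data[cand + length] == data[pos + length]):
            length += 1
        if length > best_len:
            best_off, best_len = pos - cand, length
    return best_off, best_len


def lz77_compress(data: bytes, window: int = 4096, max_len: int = 18, min_len: int = 3):
    """Emit literal bytes (int) or (offset, length) back-references."""
    tokens, pos = [], 0
    while pos < len(data):
        off, length = longest_match(data, pos, window, max_len)
        if length >= min_len:
            tokens.append((off, length))
            pos += length
        else:
            tokens.append(data[pos])
            pos += 1
    return tokens


def lz77_expand(tokens) -> bytes:
    """Rebuild the data by copying from the already expanded output."""
    out = bytearray()
    for tok in tokens:
        if isinstance(tok, tuple):
            off, length = tok
            for _ in range(length):
                out.append(out[-off])  # byte-by-byte copy allows overlapping matches
        else:
            out.append(tok)
    return bytes(out)


data = b"abcabcabcabc"
tokens = lz77_compress(data)
restored = lz77_expand(tokens)
```

Because matching and copying operate on individual bytes, a match boundary may fall in the middle of a multi-byte character, so the expansion output is not aligned to character units.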
There is also a known technology of generating compressed data by using a static dictionary to convert a character string to be compressed into compression codes, each of which is assigned in the static dictionary to a Japanese word or to a Chinese, Japanese, or Korean (CJK) character.
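A minimal sketch of static-dictionary compression is shown below. The dictionary contents, the code values, and the greedy longest-entry-first lookup are assumptions for illustration; the source does not specify how entries are matched or how codes are assigned.

```python
# Hypothetical static dictionary: words and single CJK characters
# mapped to compression codes.
STATIC_DICT = {"東京": 0x01, "大阪": 0x02, "日": 0x03, "本": 0x04}


def compress_with_static_dict(text: str, dictionary: dict) -> list:
    """Replace each registered word or character with its compression code."""
    max_key = max(len(key) for key in dictionary)
    codes, pos = [], 0
    while pos < len(text):
        # Try the longest candidate first so that words win over single characters.
        for n in range(min(max_key, len(text) - pos), 0, -1):
            piece = text[pos:pos + n]
            if piece in dictionary:
                codes.append(dictionary[piece])
                pos += n
                break
        else:
            raise KeyError(f"not registered in the static dictionary: {text[pos]!r}")
    return codes


codes = compress_with_static_dict("日本東京", STATIC_DICT)
```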
Patent Document 1: Japanese Laid-open Patent Publication No. 2003-30030
However, when character code conversion is performed on data obtained by expanding compressed data, the unit of data output from the expansion processing of the compressed data differs from the unit of data on which the character code conversion processing is to be performed. For this reason, the entire compressed data is first expanded by the expansion processing, and the character code conversion processing is then performed on the expanded data as a separate process. Consequently, as one example, there is a problem in that the storage area is wastefully used. As another example, there is a problem in that the processing takes too much time.
For example, Japanese words or CJK characters in a specific character code system are registered in the static dictionary used in the conventional technology. In compression processing, the Japanese words and CJK characters registered in the static dictionary are converted into the compression codes assigned thereto. In such a case, as illustrated in FIG. 1, the expansion processing uses an expansion tree corresponding to the static dictionary to expand the entire compressed data, and the entire expanded data is stored in the storage area. In the character code conversion processing, the character code of the entire expanded data stored in the storage area is converted, to generate converted data. Because the expansion result of the entire compressed data is stored in the storage area, the storage area is wastefully used. In addition, the expansion processing and the character code conversion processing together take too much time.
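The conventional flow described above can be sketched as follows. This is an illustration under stated assumptions: a flat code-to-bytes table stands in for the expansion tree, Shift_JIS stands in for the specific character code system, and UTF-8 is the conversion target. None of these names or values come from FIG. 1.

```python
# Hypothetical table standing in for the expansion tree: each compression
# code maps to the Shift_JIS bytes of its registered word or character.
EXPANSION_TABLE = {
    0x01: "東京".encode("shift_jis"),
    0x02: "大阪".encode("shift_jis"),
    0x03: "日".encode("shift_jis"),
    0x04: "本".encode("shift_jis"),
}


def expand_all(codes) -> bytes:
    """Expansion processing: expand every code into one intermediate buffer."""
    buffer = bytearray()
    for code in codes:
        buffer += EXPANSION_TABLE[code]
    return bytes(buffer)


def convert_all(expanded: bytes, src: str = "shift_jis", dst: str = "utf-8") -> bytes:
    """Character code conversion processing over the entire expanded data."""
    return expanded.decode(src).encode(dst)


codes = [0x03, 0x04, 0x01]
expanded = expand_all(codes)       # entire expansion result held in storage
converted = convert_all(expanded)  # separate second pass over that storage
```

In this sketch the intermediate `expanded` buffer grows with the size of the whole expanded data, and the data is traversed once to expand it and again to convert it, which corresponds to the storage and processing-time problems stated above.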