Conventional text data can be replaced with predetermined codes on the basis of a code assignment table of the ASCII code and Unicode. FIG. 30 is a drawing for explaining a conventional code assignment table based on the ASCII code and Unicode. As illustrated in FIG. 30, predetermined control characters are set in 00h to 1Fh in the code assignment table, and a one-byte code (hereinafter, “1-byte code”) is assigned to each of the control characters. Alphanumeric characters are set in 20h to 7Fh in the code assignment table, and a 1-byte code is assigned to each of the alphanumeric characters. Further, CJK characters are set in 80h to FFh in the code assignment table, and a three-byte code (hereinafter, “3-byte code”) is assigned to each of the CJK characters.
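The ranges described for FIG. 30 can be illustrated with a minimal sketch. The function name `code_length_from_first_byte` is hypothetical, and Python's UTF-8 encoder is used only as a stand-in, on the assumption that it behaves like the conventional table for the sample characters; the text does not specify the table's actual 3-byte codes.

```python
def code_length_from_first_byte(first_byte):
    """Return the code length in bytes implied by the ranges described for FIG. 30."""
    if not 0x00 <= first_byte <= 0xFF:
        raise ValueError("first_byte must be in 00h to FFh")
    if first_byte <= 0x1F:
        return 1  # 00h to 1Fh: control characters, 1-byte code
    if first_byte <= 0x7F:
        return 1  # 20h to 7Fh: alphanumeric characters, 1-byte code
    return 3      # 80h to FFh: CJK characters, 3-byte code


# Assumption: UTF-8 stands in for the conventional table.
# "\u6f22" is a CJK ideograph used purely as sample data.
for ch in ["\n", "A", "\u6f22"]:
    encoded = ch.encode("utf-8")
    print(f"{encoded[0]:02X}h", len(encoded), code_length_from_first_byte(encoded[0]))
```

For each sample character, the loop prints the leading byte, the actual encoded length, and the length predicted from the leading byte alone.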
In this regard, in Japanese Laid-open Patent Publication No. 07-287716 (hereinafter, “conventional example 1”), a technique is described by which, when there is a free region in the range from 00h to 1Fh to which control characters are assigned in a code assignment table, words and the like are registered into the free region, so that an encoding process is performed by using the code assignment table arranged in that manner. Further, in Japanese Laid-open Patent Publication No. 11-143877 (hereinafter, “conventional example 2”), another technique is described by which, in a region for the English capital letters in a code assignment table, other characters are set in place of the English capital letters, so that an encoding process is performed by using the code assignment table arranged in this manner.
Patent Document 1: Japanese Laid-open Patent Publication No. 07-287716
Patent Document 2: Japanese Laid-open Patent Publication No. 11-143877
However, the conventional examples described above have the problem that it is not possible to assign short bytecodes to general symbols and to words of which the frequency of appearance is high.
For example, with conventional examples 1 and 2 above, it is possible to assign short bytecodes to the characters and words of which the frequency of appearance is high, by assigning them to the free region for the control characters or the like, only when the people who transmit and receive the text data to and from each other share the unused control characters or the English capital letters, as well as the code assignment table therefor.
In contrast, when variable-length codes are assigned to words and general symbols included in general text data, depending on the frequency of appearance thereof, the code length of approximately 40 types of words and general symbols is in the range of five to eight bits, whereas the code length of approximately 8,000 types of words and general symbols is in the range of nine to sixteen bits. Thus, by assigning a 1-byte code to each of 32 or more types of words and general symbols and assigning a 2-byte code to each of 8,192 or more types of words and general symbols, depending on the frequency of appearance thereof, it is possible to implement a compressing process that can achieve a high compression ratio. However, according to conventional examples 1 and 2, it is not possible to assign codes to a large number of words and general symbols.
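The frequency-based assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of either conventional example: the slot counts 32 and 8,192 come from the text, while the function name `assign_code_lengths`, the ranking rule, and the treatment of overflow tokens are hypothetical. The sketch determines only the code length of each token, not the byte values themselves.

```python
from collections import Counter

ONE_BYTE_SLOTS = 32    # from the text: 1-byte codes for 32 or more types
TWO_BYTE_SLOTS = 8192  # from the text: 2-byte codes for 8,192 or more types


def assign_code_lengths(tokens):
    """Map each distinct token to a code length in bytes by frequency rank.

    Assumption: the most frequent tokens fill the 1-byte slots first, the
    next ones fill the 2-byte slots, and the rest get None (no short code).
    """
    ranked = [tok for tok, _ in Counter(tokens).most_common()]
    lengths = {}
    for rank, tok in enumerate(ranked):
        if rank < ONE_BYTE_SLOTS:
            lengths[tok] = 1
        elif rank < ONE_BYTE_SLOTS + TWO_BYTE_SLOTS:
            lengths[tok] = 2
        else:
            lengths[tok] = None
    return lengths


# Hypothetical sample data: one frequent word and 40 words that appear once.
sample = ["the"] * 5 + [f"w{i}" for i in range(40)]
lengths = assign_code_lengths(sample)
print(lengths["the"], sum(1 for v in lengths.values() if v == 1))
```

With 41 distinct tokens in the sample, the 32 highest-ranked tokens receive a 1-byte code and the remaining 9 receive a 2-byte code.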