A 2^N-branch nodeless Huffman tree having a compression code length specified by an integer multiple of 2, 3, or 4 is conventionally generated by development from a node aggregate acting as a generation source of 4-, 8-, and 16-branch Huffman trees (see, e.g., Japanese Laid-Open Patent Publication No. 2010-093414).
However, because the compression code length of a 2^N-branch nodeless Huffman tree can only be specified as an integer multiple of 2, 3, or 4, the exponent N of the maximum branch number 2^N is likewise restricted to an integer multiple of 2, 3, or 4. The maximum branch number 2^N is therefore 4 (=2^2), 8 (=2^3), 16 (=2^4), 64 (=2^6), 256 (=2^8), 512 (=2^9), 1024 (=2^10), 4096 (=2^12), or 16384 (=2^14). A value whose exponent is not an integer multiple of 2, 3, or 4, such as 2048 (=2^11) or 8192 (=2^13), cannot be specified as the maximum branch number 2^N.
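The restriction on the exponent can be checked with a short Python sketch. The function name `is_allowed_exponent` and the range of exponents examined (2 through 14, matching the values listed above) are assumptions made for illustration and are not part of the cited publication.

```python
def is_allowed_exponent(n: int) -> bool:
    """Return True if n is a positive integer multiple of 2, 3, or 4."""
    return n > 0 and any(n % k == 0 for k in (2, 3, 4))


# Exponents 2..14 cover the branch numbers discussed in the text.
exponents = range(2, 15)
allowed = [2 ** n for n in exponents if is_allowed_exponent(n)]
excluded = [2 ** n for n in exponents if not is_allowed_exponent(n)]

print("allowed: ", allowed)
print("excluded:", excluded)
```

Within this range the excluded list contains 2048 and 8192, the two values the text names as unavailable.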
On the other hand, if the number of types of single characters, basic words, and reserved words (hereinafter referred to as character data) making up text data is greater than 2^m (where m is an integer multiple of 2, 3, or 4) and less than or equal to 2^(m+1) (where m+1 is not an integer multiple of 2, 3, or 4), the exponent N of the maximum branch number 2^N must satisfy N≥m+2. For example, if the number of types of the character data (hereinafter referred to as the number of character data types) is greater than 2^10 and less than or equal to 2^11, the maximum branch number 2^N must be 2^12. If the number of character data types is greater than 2^12 and less than or equal to 2^13, the maximum branch number 2^N must be 2^14. Consequently, the 2^N-branch nodeless Huffman tree becomes larger than the number of character data types requires, which leaves room for improvement.
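The two examples can be reproduced with a minimal Python sketch that searches for the smallest permitted exponent. The function name `min_allowed_exponent`, the starting exponent of 2 (a 4-branch tree), and the sample counts 2000 and 5000 are illustrative assumptions, not values taken from the source.

```python
def min_allowed_exponent(num_types: int) -> int:
    """Smallest N, a multiple of 2, 3, or 4, such that 2**N >= num_types.

    Assumes the smallest tree considered is the 4-branch tree (N = 2).
    """
    n = 2
    while 2 ** n < num_types or not any(n % k == 0 for k in (2, 3, 4)):
        n += 1
    return n


# Hypothetical counts of character data types.
for count in (1000, 2000, 5000):
    n = min_allowed_exponent(count)
    print(f"{count} types -> N = {n}, maximum branch number = {2 ** n}")
```

For a count between 2^10 and 2^11 the search skips N = 11 and settles on N = 12, so the tree has 4096 branches where 2048 would have sufficed, that is, twice as many.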