1. Field of the Invention
The present invention relates in general to computers, and provides exemplary embodiments for converting bit lengths into codes. Particularly, the present invention relates to mechanisms for converting plural bit lengths, assigned respectively to plural strings, into plural codes having the respective bit lengths.
2. Description of the Related Art
Deflate compression is the compression method underlying data compression formats widely used in computers at present, such as zlib and gzip (GNU zip). In this method, data is compressed by use of a coding technique called Huffman coding. In Huffman coding, byte-based characters repeatedly appearing in data are assigned variable-length codes in accordance with the frequencies of appearance of the respective characters. Efficient coding is achieved by assigning codes of shorter bit lengths to more frequently appearing characters, and codes of longer bit lengths to less frequently appearing characters.
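The relationship between frequency of appearance and bit length can be illustrated with a minimal sketch. The function name `huffman_code_lengths` and the sample input are hypothetical and are given for illustration only; the sketch shows textbook Huffman construction, not the procedure of any particular deflate implementation, and it does not enforce the upper limit on code length that deflate imposes.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return a dict mapping each symbol in data to a Huffman code length."""
    freq = Counter(data)
    if len(freq) == 1:
        # A single distinct symbol still needs one bit.
        return {next(iter(freq)): 1}
    # Heap entries: (weight, tie_breaker, symbols contained in the subtree).
    heap = [(w, i, [s]) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freq, 0)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two lightest subtrees; every symbol inside them
        # moves one level deeper, so its code grows by one bit.
        w1, _, s1 = heapq.heappop(heap)
        w2, _, s2 = heapq.heappop(heap)
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (w1 + w2, tie, s1 + s2))
        tie += 1
    return lengths


lengths = huffman_code_lengths("aaaabbc")
```

In this sample, the most frequent character "a" receives a bit length of 1, while the less frequent "b" and "c" each receive a bit length of 2.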
In Huffman coding, a Huffman table that retains the codes assigned to the respective characters is produced, and added to the compressed data. Then, the compressed data is decoded with reference to this Huffman table. However, in the case where a Huffman table is thus added to compressed data, the compression ratio deteriorates if the size of this Huffman table is large. In order to improve the compression ratio, the Huffman table itself is compressed in deflate compression.
Specifically, in deflate compression, a Huffman table does not retain the codes respectively assigned to characters, but retains the lengths (bit lengths) of the codes assigned to the characters. If the ASCII code sequence is employed to define the order in which codes are assigned to the characters in a group with the same bit length (that is, alphabetical order is employed when the characters to be coded are letters of the alphabet), the codes assigned to the characters can be uniquely determined based on the bit lengths. For example, consider a case where the bit lengths assigned to A, C and D are all “3”. In this case, once the code of A is determined as “100”, the codes of C and D can be determined as “101” and “110”, each by incrementing the immediately preceding code by “1”.
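The derivation of codes from bit lengths described above can be sketched as follows. This is a minimal illustration modeled on the code-assignment procedure given in the DEFLATE specification (RFC 1951); the function name `assign_canonical_codes` is hypothetical, and the bit lengths assumed for B, E, F and G are not taken from the example above but are chosen so that the codes of A, C and D come out as “100”, “101” and “110”.

```python
def assign_canonical_codes(bit_lengths):
    """Map each symbol to a code string, given its bit length (0 = unused)."""
    max_len = max(bit_lengths.values(), default=0)
    # Count how many symbols use each bit length.
    count = [0] * (max_len + 1)
    for length in bit_lengths.values():
        if length > 0:
            count[length] += 1
    # Compute the smallest code value for each bit length.
    next_code = [0] * (max_len + 1)
    code = 0
    for length in range(1, max_len + 1):
        code = (code + count[length - 1]) << 1
        next_code[length] = code
    # Within one bit length, assign consecutive codes in symbol order,
    # incrementing the immediately preceding code by 1.
    codes = {}
    for symbol in sorted(bit_lengths):
        length = bit_lengths[symbol]
        if length > 0:
            codes[symbol] = format(next_code[length], "0{}b".format(length))
            next_code[length] += 1
    return codes


# Assumed bit lengths; only A, C and D correspond to the example in the text.
example_lengths = {"A": 3, "B": 2, "C": 3, "D": 3, "E": 2, "F": 4, "G": 4}
codes = assign_canonical_codes(example_lengths)
```

Under these assumed bit lengths, B and E receive “00” and “01”, the group of length 3 starts at “100”, and F and G receive “1110” and “1111”. Because the codes follow from the bit lengths alone, only the bit lengths need to be stored in the table.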
Heretofore, several techniques relating to decoding of data coded by use of a Huffman table have been proposed. A first technique described in the literature proceeds as follows. First, it is determined whether codes to be decoded are coded by using a standard Huffman table or a nonstandard Huffman table. Then, if the codes are determined as coded by using the standard Huffman table, a standard decoding circuit decodes the codes by using the standard Huffman table, whereas, if the codes are determined as coded by using the nonstandard Huffman table, software processing means decodes the codes by using the nonstandard Huffman table. A second technique described in the literature provides a method for performing Huffman decoding in reduced time by testing for the length of valid Huffman codes in a compressed data stream, and using an offset corresponding to a test criterion.