Data compression techniques include Huffman coding, LZ77 coding, and LZ78 coding. Improvements on LZ77 coding include LZSS coding. Also available are compression algorithms used in the ZIP format and the LZH format, in which data compressed through LZSS coding is further compressed through Huffman coding.
In LZ77 coding or LZSS coding, a dictionary is generated using a method referred to as a sliding dictionary method. In the sliding dictionary method, a buffer area called a sliding window is used. Character strings to be encoded are stored in the sliding window in the order in which they are read. When the sliding window becomes full, the oldest character string is discarded first.
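The first-in, first-out behavior of the sliding window can be illustrated with a minimal Python sketch. The function name `fill_window` and the one-character-at-a-time granularity are assumptions made for illustration only; they are not taken from any particular LZ77 or LZSS implementation.

```python
from collections import deque


def fill_window(text, window_size):
    """Feed `text` into a fixed-size window and return the final contents.

    Hypothetical helper: a deque with `maxlen` silently drops the oldest
    element whenever a new one is appended to a full window.
    """
    window = deque(maxlen=window_size)
    for ch in text:  # characters enter in reading order
        window.append(ch)
    return "".join(window)


if __name__ == "__main__":
    # With a 4-character window, only the 4 most recent characters remain.
    print(fill_window("abcdef", 4))
```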
The area of the sliding window is partitioned into a reference region and an encoding region. A character string stored in the reference region is used as a dictionary, and a character string stored in the encoding region is encoded. The character string thus encoded is then stored in the reference region. In encoding, the reference region is searched for the character string that has the longest match with the leading character string in the encoding region (the longest match character string). The character string in the encoding region is encoded into values indicating a distance (address) from the front of the sliding window to the front of the longest match character string and a length of the longest match character string. A high compression rate is thus achieved. If the length of the longest match character string is shorter than three characters, the leading character of the encoding region is output in binary notation (such as American Standard Code for Information Interchange (ASCII) code).
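The encoding procedure described above can be sketched as follows. This is a simplified illustration rather than the encoder of any actual format: the names `lzss_encode` and `lzss_decode`, the region sizes, the token tuples, and the brute-force search are all assumptions, and matches are restricted to lie entirely inside the reference region. The address is counted from the front of the reference region, which is taken here as the front of the sliding window.

```python
def lzss_encode(data, ref_size=16, enc_size=8, min_match=3):
    """Encode a string into literal tokens and (address, length) tokens."""
    tokens = []
    pos = 0
    while pos < len(data):
        ref_start = max(0, pos - ref_size)
        reference = data[ref_start:pos]        # reference region (dictionary)
        lookahead = data[pos:pos + enc_size]   # encoding region
        best_addr, best_len = 0, 0
        # Brute-force search for the longest match in the reference region.
        for addr in range(len(reference)):
            length = 0
            while (length < len(lookahead)
                   and addr + length < len(reference)
                   and reference[addr + length] == lookahead[length]):
                length += 1
            if length > best_len:
                best_addr, best_len = addr, length
        if best_len >= min_match:
            tokens.append(("match", best_addr, best_len))
            pos += best_len
        else:
            # Match shorter than three characters: emit the leading character.
            tokens.append(("literal", data[pos]))
            pos += 1
    return tokens


def lzss_decode(tokens, ref_size=16):
    """Invert lzss_encode, assuming the same ref_size was used."""
    out = []
    for token in tokens:
        if token[0] == "literal":
            out.append(token[1])
        else:
            _, addr, length = token
            ref_start = max(0, len(out) - ref_size)
            out.extend(out[ref_start + addr:ref_start + addr + length])
    return "".join(out)


if __name__ == "__main__":
    encoded = lzss_encode("abcabcabc")
    print(encoded)
    print(lzss_decode(encoded))
```

For the input `"abcabcabc"`, the first three characters are emitted as literals because the reference region does not yet contain a match of three or more characters; the remaining six characters are emitted as two (address, length) tokens.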
In ZIP or LZH, the encoded leading address, the encoded length, and the binary notation of a character are compressed using a Huffman tree. The compression rate is thereby further increased.
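As an illustration of this second stage, the following Python sketch builds a Huffman code for a list of integers such as match lengths. It is a generic textbook construction under stated assumptions, not the exact scheme of ZIP or LZH; the function name `huffman_code` and the sample values are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Return a dict mapping each distinct symbol to a prefix-free bit string."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct symbol still needs a one-bit code.
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie_breaker, partial code table).
    # The unique tie_breaker keeps the dicts from ever being compared.
    heap = [(count, index, {sym: ""})
            for index, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count1, _, table1 = heapq.heappop(heap)
        count2, _, table2 = heapq.heappop(heap)
        # Merge the two least frequent subtrees, prefixing one bit to each.
        merged = {sym: "0" + code for sym, code in table1.items()}
        merged.update({sym: "1" + code for sym, code in table2.items()})
        heapq.heappush(heap, (count1 + count2, tie, merged))
        tie += 1
    return heap[0][2]


if __name__ == "__main__":
    # Hypothetical match lengths produced by a sliding dictionary encoder.
    lengths = [3, 3, 3, 3, 5, 5, 7]
    print(huffman_code(lengths))
```

Frequently occurring integers receive shorter codes, so the gain from this stage depends on how skewed the distribution of addresses and lengths is.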
A technique available in Huffman coding converts the Huffman tree into a nodeless tree, thereby increasing the compression efficiency of character codes and allowing the compression process and the decompression process to be performed at high speed.
The integers representing an address and a length that result from encoding through the related-art sliding dictionary method are not optimized for the encoding method subsequently used to encode those integers. Even if an integer resulting from encoding through the sliding dictionary method is encoded using Huffman coding, a sufficient compression rate is not obtained.