This invention relates to compression of data.
In dictionary-based compression techniques, a codebook or dictionary is generated during compression; the codebook assigns a unique code to each sequence of uncompressed data. Generally, the codebook must be stored along with the compressed data; otherwise a decompressor would have no way of knowing what the codes represent. In contrast, in LZW (Lempel-Ziv-Welch) compression, the compressor and decompressor build identical codebooks as data is processed sequentially, thus avoiding the need to store or transmit a codebook. The compressor outputs a pattern code only after it has found the pattern more than once. The first time the compressor processes a sequence of data, it places that sequence in its codebook and outputs the sequence without any encoding. The decompressor receives this sequence and places it in its own codebook. When the compressor sees a pattern repeated, it outputs the code that its codebook assigns to the pattern. The decompressor can recognize the code because it has built an identical codebook from the previous sequences of data.
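The codebook synchronization described above can be illustrated with a minimal sketch of the common textbook form of LZW, written here in Python. It is an assumption-laden illustration, not the method of any particular utility: the function names `lzw_compress` and `lzw_decompress` are hypothetical, both codebooks are assumed to start with the 256 single-byte values (so a byte seen for the first time is emitted as its own value), and codes are kept as plain integers rather than packed into variable-width bit fields.

```python
from typing import List


def lzw_compress(data: bytes) -> List[int]:
    """Textbook LZW: emit a code for the longest sequence already in the codebook."""
    # Assumption: the codebook starts with every single-byte sequence (codes 0-255).
    codebook = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes: List[int] = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in codebook:
            # Keep extending the match while the sequence is known.
            current = candidate
        else:
            # Emit the code for the known prefix, then learn the new sequence.
            codes.append(codebook[current])
            codebook[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(codebook[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the same codebook from the code stream alone."""
    if not codes:
        return b""
    codebook = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = codebook[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in codebook:
            entry = codebook[code]
        elif code == next_code:
            # The compressor used an entry it had only just created; that entry
            # must be the previous sequence plus its own first byte.
            entry = previous + previous[:1]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out += entry
        # Mirror the entry the compressor added at this step.
        codebook[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return bytes(out)
```

For example, `lzw_compress(b"ABABABA")` yields `[65, 66, 256, 258]`: the first `A` and `B` pass through as their byte values, while the repeated `AB` and `ABA` are replaced by codes that the decompressor reconstructs on its own, with no codebook stored or transmitted.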
The LZW compression technique works well on a variety of data. The Unix compress utility and the personal computer ARC utility are based on LZW compression. Additionally, the Graphics Interchange Format (GIF), which is a popular palettized image compression format used on the World Wide Web (WWW), uses the LZW algorithm.