Data compression algorithms are often used to store data efficiently and/or to transfer data between networked computing devices. Two common lossless compression algorithms are Lempel-Ziv 78 (LZ78) and Lempel-Ziv-Welch (LZW). Both the LZ78 compression algorithm and the LZW compression algorithm are dictionary coders that use previously received characters in an input stream to encode subsequent characters in the input stream.
The LZ78 algorithm starts with an empty table of a fixed size (e.g., capable of holding 4096 string entries). As new characters and strings are encountered in the input stream, new string entries are added to the table, each string entry having a unique index value. When those same characters and/or strings are encountered later in the input stream, the characters/strings are replaced with the index value for a matching string entry. In LZ78, since the table has a fixed size, the index values also have a fixed size. For example, if the table has a fixed size of 4096 entries, each index value is 12 bits in length, regardless of the number of entries currently in the table. In LZ78, once the table fills up, all entries in the table are deleted. String entries can then continue to be added to the table until it again fills up. This process introduces an inefficiency, in that the table needs to be rebuilt from scratch each time it becomes full.
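The fixed-size table and its reset behavior can be illustrated with a minimal sketch. This is not a reference implementation: the function names `lz78_encode` and `lz78_decode`, the `max_entries` parameter, the reservation of index 0 for "no matching prefix", and the empty literal on a trailing pair are all assumptions made for illustration.

```python
def lz78_encode(data, max_entries=4096):
    """Encode a string as (index, literal) pairs.

    Assumption: index 0 is reserved for "no prefix", so the table
    holds at most max_entries - 1 strings and every index fits in
    a fixed width of (max_entries - 1).bit_length() bits.
    """
    table = {}      # string -> index
    output = []
    current = ""    # longest match found so far
    for ch in data:
        candidate = current + ch
        if candidate in table:
            current = candidate          # keep extending the match
            continue
        # Emit the index of the longest known prefix plus the new literal.
        output.append((table.get(current, 0), ch))
        if len(table) >= max_entries - 1:
            table.clear()                # table is full: delete all entries
        table[candidate] = len(table) + 1
        current = ""
    if current:
        # Input ended inside a known string: emit its index, no literal.
        output.append((table[current], ""))
    return output


def lz78_decode(pairs, max_entries=4096):
    """Rebuild the input by mirroring the encoder's table updates."""
    table = {}      # index -> string
    pieces = []
    for index, literal in pairs:
        entry = (table[index] if index else "") + literal
        pieces.append(entry)
        if literal:                      # a trailing pair adds no entry
            if len(table) >= max_entries - 1:
                table.clear()
            table[len(table) + 1] = entry
    return "".join(pieces)


pairs = lz78_encode("abababab")
print(pairs)
print(lz78_decode(pairs))
```

With the default of 4096 slots, every index fits in 12 bits whether the table holds one entry or is nearly full. Passing a small `max_entries` value forces frequent resets and makes the cost of rebuilding the table visible in the longer output.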
The LZW algorithm starts with a table that is preconfigured with a separate string entry for each of the 256 possible 8-bit character values (e.g., an extended American Standard Code for Information Interchange (ASCII) character set). In the LZW algorithm, as new strings are encountered in the input stream, the size of the table grows, and new string entries are added to the table, each string entry having a unique index value. When those same characters and/or strings are encountered later in the input stream, the characters/strings are replaced with the index value for a matching string entry. As the size of the table grows, the number of bits needed to uniquely represent an index value increases. Because the table has a preconfigured initial size of 256 string entries, the index values have a minimum size of 8 bits. Once the table grows to a predetermined size (e.g., 65,535 string entries), the string entries in the table are deleted, and the table starts over at its preconfigured size with its preconfigured string entries. As in the LZ78 algorithm, this process introduces an inefficiency, in that the table needs to be rebuilt from scratch each time it becomes full.
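The growth of the index width and the reset at a predetermined size can be sketched as follows. This is an encoder-only illustration under stated assumptions: the name `lzw_encode`, the `max_entries` parameter, and the choice to report, alongside each index, the number of bits needed for the largest index in the table at that moment are all hypothetical. A practical codec would also have to keep that width synchronized with its decoder, which this sketch does not attempt.

```python
def initial_table():
    # Preconfigured table: one entry per possible byte value (256 entries).
    return {bytes([i]): i for i in range(256)}


def index_width(table):
    # Bits needed to represent the largest index currently in the table.
    return (len(table) - 1).bit_length()


def lzw_encode(data, max_entries=65535):
    """Encode bytes as a list of (index, width_in_bits) pairs.

    Assumption: once the table holds max_entries strings, the next new
    string triggers a reset to the 256 preconfigured entries.
    """
    table = initial_table()
    output = []
    current = b""
    for byte in data:
        symbol = bytes([byte])
        candidate = current + symbol
        if candidate in table:
            current = candidate          # keep extending the match
            continue
        # Emit the index of the longest match at the current width.
        output.append((table[current], index_width(table)))
        if len(table) >= max_entries:
            table = initial_table()      # table is full: start over
        else:
            table[candidate] = len(table)
        current = symbol                 # the unmatched byte starts the next match
    if current:
        output.append((table[current], index_width(table)))
    return output


print(lzw_encode(b"abababab"))
```

The first index is emitted while the table still holds only its 256 preconfigured entries, so it needs 8 bits. As soon as entry 256 is added, every later index in this short example needs 9 bits.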
Both the LZ78 and the LZW algorithms generate an output stream that consists of alternating index values and literal values of characters from the input stream. An example output stream would have, in sequential order, an index value, a literal value, an index value, a literal value, and so on. This interleaving makes it difficult to apply additional compression techniques to the output stream to further compress it, or to perform other post-processing of the output stream.