As the amount of data in use increases, so do the storage space needed to hold the data and the bandwidth needed to transfer it. Compression is commonly used to reduce storage and/or transfer bandwidth requirements. For compression of text, the Lempel-Ziv-Welch (LZW) algorithm is very commonly used. LZW traditionally works by generating a fixed-size string translation table that maps codes to strings. The algorithm is assumed to be understood, and will not be described in detail herein.
LZW traditionally constructs the string translation table, or table of strings, from strings that have been encountered in the input data stream. In place of a string of characters that matches a table entry, the compressor outputs the index of that entry, which serves as the code. The indexes are output using only as many bits as required (for example, if there are 500 strings in the table, each index uses only 9 bits, since 9 bits can distinguish up to 512 entries). Unfortunately, LZW has a maximum table size, and once the maximum number of entries has been used, the entire contents of the table are discarded. The compressor then starts over, reloading the table with strings as they are encountered.
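The traditional behavior described above can be illustrated with a minimal sketch. It is not taken from any particular LZW implementation: the names `bits_needed`, `lzw_compress`, and `max_entries` are hypothetical, the table is assumed to start with the 256 single-byte strings, and details such as signaling the table discard to a decompressor and packing the codes into a bit stream are omitted.

```python
def bits_needed(table_size: int) -> int:
    """Smallest number of bits able to represent indexes 0..table_size-1."""
    return max(1, (table_size - 1).bit_length())


def initial_table() -> dict:
    """Assumed starting table: one entry per single-byte string."""
    return {bytes([i]): i for i in range(256)}


def lzw_compress(data: bytes, max_entries: int = 4096):
    """Return (codes, resets).

    codes  -- list of (index, bit_width) pairs, one per emitted code
    resets -- number of times the full table was discarded
    """
    if max_entries <= 256:
        raise ValueError("max_entries must exceed the 256 initial entries")

    table = initial_table()
    codes = []
    resets = 0
    current = b""

    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            # Keep extending the match while the table knows the string.
            current = candidate
            continue

        # Emit the index of the longest known string, using only as many
        # bits as the current table size requires.
        codes.append((table[current], bits_needed(len(table))))

        # Traditional behavior: once the table is full, discard everything
        # and start over, regardless of how useful the entries were.
        if len(table) >= max_entries:
            table = initial_table()
            resets += 1

        table[candidate] = len(table)
        current = bytes([byte])

    if current:
        codes.append((table[current], bits_needed(len(table))))
    return codes, resets
```

With a table of 500 strings, `bits_needed(500)` returns 9, matching the example above. Compressing `b"abababab"` with a deliberately tiny limit of `max_entries=258` forces two discards in this sketch and emits seven codes instead of the five emitted with the default limit, because the entry for "ab" is thrown away and has to be learned again.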
Especially for large input streams, discarding the string translation table may be inefficient. The table may include a number of strings that have occurred multiple times in the input stream and may occur many more times in the remainder of the input stream. However, current LZW implementations have no way to distinguish one entry from another, and all entries are discarded regardless of potential importance.