The use of data compression or "coding" schemes to increase the storage capacity of storage media (e.g., tape drives, hard drives) is well known in the art, and can result in significant increases in data storage capacity. However, the efficiency with which data may be compressed depends on the specifics of the compression scheme employed and on the type of data compressed. Depending on data entropy, certain data types may be incompressible, or only inefficiently compressible, by a given compression scheme, and the compressed output may occupy more memory space than the same data in an uncompressed format (i.e., data expansion). For example, in many implementations of Lempel-Ziv 1 coding, including IBM's adaptive lossless data compression (ALDC) and LZS (QIC 122), highly random data can expand in size by up to 12.5% (e.g., from 60,000 bytes uncompressed to 67,500 bytes compressed).
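The 12.5% figure above can be checked with a short calculation. The sketch below assumes the expansion comes from a one-bit flag prepended to each uncompressed (literal) byte, as is commonly described for LZS-style encodings; the function name `worst_case_compressed_size` and the `flag_bits` parameter are hypothetical and serve only this illustration. End markers and other framing overhead are ignored.

```python
def worst_case_compressed_size(n_bytes: int, flag_bits: int = 1) -> int:
    """Output size in bytes if every input byte is emitted as a literal.

    Assumption (not stated in the text above): each literal costs
    flag_bits + 8 bits, and no matches are ever found.
    """
    total_bits = n_bytes * (8 + flag_bits)
    return (total_bits + 7) // 8  # round up to whole bytes


uncompressed = 60_000
compressed = worst_case_compressed_size(uncompressed)
expansion = (compressed - uncompressed) / uncompressed

print(compressed)          # 67500
print(f"{expansion:.1%}")  # 12.5%
```

Under this assumption, each 8-bit byte becomes 9 bits, a ratio of 9/8, which matches the 60,000-byte to 67,500-byte example.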
When data expansion occurs during data compression, the very purpose of performing data compression (e.g., to increase the storage capacity of a storage medium) is subverted. Accordingly, a need exists for a method and apparatus for reducing the expansion of data during data compression.