Current methods of compressing data involve trade-offs between speed, compression ratio, and complexity; the data structures required for compression methods can be complex and require substantial computational and storage overhead. Some compression methods require that an entire collection of data be available in a single block and that the data remain stable while compression is being performed. Other compression methods can produce output which requires greater storage space than would the uncompressed data, thereby rendering the compression effort wasted. Still other compression processes result in compromises between accurate data representation and compression ratio.
One method of compressing data is run-length encoding, which does not present the problem of compressed data size exceeding input data size. However, many forms of input data will not result in significant compression because they have long sections with no repeated values. For example, a ramp will not compress well with a run-length encoding scheme. Further, noise can seriously interfere with run-length encoded compression when data resolution is greater than 10 bits, unless the algorithm is modified to treat nearly identical values as equal when forming runs. Such modification results in a non-identical reconstruction of the original data when the compressed data is decompressed.
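The behavior described above can be illustrated with a minimal sketch of run-length encoding. The function names `rle_encode` and `rle_decode`, the `tolerance` parameter, and the sample data are assumptions made for illustration only; the sketch is not taken from any particular implementation.

```python
from typing import List, Tuple


def rle_encode(values: List[int], tolerance: int = 0) -> List[Tuple[int, int]]:
    """Encode values as (value, run_length) pairs.

    tolerance is a hypothetical parameter: with tolerance > 0, a value that
    differs from the first value of the current run by at most tolerance is
    counted as part of that run, which makes the encoding lossy.
    """
    runs: List[Tuple[int, int]] = []
    for v in values:
        if runs and abs(v - runs[-1][0]) <= tolerance:
            value, count = runs[-1]
            runs[-1] = (value, count + 1)  # extend the current run
        else:
            runs.append((v, 1))  # start a new run
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> List[int]:
    """Expand (value, run_length) pairs back into a flat list."""
    out: List[int] = []
    for value, count in runs:
        out.extend([value] * count)
    return out


flat = [7] * 8                                    # constant signal: one run
ramp = list(range(8))                             # ramp: no repeated values
noisy = [100, 101, 100, 99, 100, 101, 100, 100]   # constant signal plus noise
```

In this sketch, `flat` collapses to a single run, while `ramp` yields one run per sample and so gains nothing. Encoding `noisy` with `tolerance=0` is lossless but produces almost as many runs as samples; with `tolerance=1` it collapses to a single run, but decoding then returns eight copies of 100 rather than the original samples.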
Another compression method is the quadtree, in which a square two-dimensional array is recursively compressed by (1) if the array size is one, representing the array as a single non-compressed value, (2) otherwise, if all values in the array are near-same, representing the array as a single compressed value, and (3) otherwise, dividing the array into four subarrays and compressing each of the subarrays. Drawbacks to this method include requiring the array to be stable during compression and requiring the entire array to be available in a single block of memory. Further, the data structures involved are complex and require computational and storage overhead. Some arrays will compress to a form requiring greater storage space than the original uncompressed array, and a compromise must be made between compression ratio and faithful reproduction of the input data upon decompression.
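The three-step recursion above can be sketched as follows. This is a simplified illustration under stated assumptions: the array side is a power of two, "near-same" is taken to mean that the spread of values does not exceed a hypothetical `tolerance` parameter, and a uniform block is represented by its first value. The names `quadtree_compress` and `quadtree_expand` are likewise illustrative.

```python
from typing import List, Union

Grid = List[List[int]]
Node = Union[int, list]  # a leaf value, or a list of four child nodes


def quadtree_compress(grid: Grid, tolerance: int = 0) -> Node:
    """Recursively compress a square grid whose side is a power of two."""
    n = len(grid)
    flat = [v for row in grid for v in row]
    if n == 1:
        return flat[0]                      # (1) single value, stored as-is
    if max(flat) - min(flat) <= tolerance:
        return flat[0]                      # (2) near-same block, one value
    h = n // 2                              # (3) split into four quadrants
    quadrants = [
        [row[:h] for row in grid[:h]],      # upper left
        [row[h:] for row in grid[:h]],      # upper right
        [row[:h] for row in grid[h:]],      # lower left
        [row[h:] for row in grid[h:]],      # lower right
    ]
    return [quadtree_compress(q, tolerance) for q in quadrants]


def quadtree_expand(node: Node, n: int) -> Grid:
    """Rebuild an n-by-n grid from a tree produced by quadtree_compress."""
    if not isinstance(node, list):
        return [[node] * n for _ in range(n)]
    h = n // 2
    ul, ur, ll, lr = (quadtree_expand(child, h) for child in node)
    top = [a + b for a, b in zip(ul, ur)]
    bottom = [a + b for a, b in zip(ll, lr)]
    return top + bottom


grid = [
    [1, 1, 2, 3],
    [1, 1, 4, 5],
    [6, 6, 6, 6],
    [6, 6, 6, 6],
]
exact_tree = quadtree_compress(grid)               # tolerance 0: lossless
lossy_tree = quadtree_compress(grid, tolerance=3)  # merges the 2..5 block
```

Note that the whole grid must be in memory and unchanged while the recursion runs, and that a block with no uniform quadrants (such as the upper-right quadrant here) is stored as a nested list that is no smaller than the values it holds. Raising `tolerance` shrinks the tree but the expanded grid no longer matches the input.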
The Lempel-Ziv-Welch (LZW) algorithm, as described in "A Technique for High-Performance Data Compression" in IEEE Computer, June 1984, pages 8-19, compresses data which will tend to have repeated sequences. The algorithm creates a dictionary while processing data, creating single-word codes for repeated sequences. A drawback to such an algorithm is that the data word size of the compressed sample must exceed the data word size of the original non-compressed data. Further, system memory requirements may be large, as memory sufficient to hold the generated dictionary and a LIFO stack for decompression will be required. As with other methods, input data which is near-random will compress poorly.
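The dictionary-building behavior described above can be sketched as follows. This is a simplified illustration, not the exact procedure of the cited article: it assumes 8-bit input words, an unbounded dictionary, and output codes held as plain integers rather than packed into fixed-width words, and the decompressor rebuilds sequences directly rather than through a LIFO stack. The function names are illustrative.

```python
from typing import Dict, List


def lzw_compress(data: bytes) -> List[int]:
    """Emit one integer code per longest sequence already in the dictionary."""
    table: Dict[bytes, int] = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes: List[int] = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate              # keep extending the match
        else:
            codes.append(table[current])     # emit code for the known prefix
            table[candidate] = next_code     # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: List[int]) -> bytes:
    """Rebuild the dictionary from the code stream and recover the input."""
    if not codes:
        return b""
    table: Dict[int, bytes] = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            entry = previous + previous[:1]  # code refers to the entry being built
        else:
            raise ValueError("invalid LZW code")
        out.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(out)


repetitive = b"AB" * 8              # repeated sequences compress well
distinct = bytes(range(256))        # no repeated sequences at all
repetitive_codes = lzw_compress(repetitive)
distinct_codes = lzw_compress(distinct)
```

In this sketch the repetitive input yields fewer codes than input bytes, but some of those codes are 256 or greater, so each output word needs more than the 8 bits of an input word. The input with no repeated sequences yields exactly one code per byte, so no reduction is achieved, while the dictionary still grows by one entry for nearly every byte processed.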
Thus, it would be desirable to have a compression method which will faithfully compress near-random or high frequency input data and will identically reconstruct the input data when decompressed, while still being simple to implement, with reasonable compression ratios and low memory space requirements.