1. Field of the Invention
This invention relates to systems and methods for compression of data.
2. Background of the Invention
Lossless data compression is a class of data compression algorithms that allow the original data to be perfectly reconstructed from the compressed data. By contrast, lossy data compression permits reconstruction of only an approximation of the original data, though in exchange it usually allows for improved compression rates.
DEFLATE is a lossless data compression algorithm that uses a combination of the LZ77 algorithm and Huffman coding. It was originally defined by Phil Katz for version 2 of his PKZIP archiving tool and was later specified in RFC 1951. DEFLATE is widely used, for example in GZIP compressed files, PNG (Portable Network Graphics) image files, and the ZIP file format for which Katz originally designed it.
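As a brief illustration of the lossless property, the following Python sketch round-trips a buffer through a raw DEFLATE stream using the standard zlib module. The sample input and the compression level are arbitrary choices made for this example only.

```python
import zlib

# Arbitrary sample input chosen for illustration.
original = b"the quick brown fox jumps over the lazy dog. " * 20

# A negative wbits value selects a raw DEFLATE stream (RFC 1951)
# without the zlib or gzip header and trailer.
compressor = zlib.compressobj(6, zlib.DEFLATED, -15)
compressed = compressor.compress(original) + compressor.flush()

# Decompression must use the same raw-stream setting.
restored = zlib.decompress(compressed, -15)

print(len(original), len(compressed), restored == original)
```

The restored bytes are identical to the input, which is what distinguishes a lossless scheme from a lossy one.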
DEFLATE compression is very well understood, and the source code for programs such as GZIP is readily available to the public. However, the compression algorithm is relatively complex to implement in hardware for high-bandwidth applications, given that GZIP compression is based on the LZ77 algorithm and Huffman coding.
The objective of compression is to replace data later in a data stream with copy commands that refer to identical data appearing earlier in the data stream. As such, all such compression implementations require a search history buffer and some type of compare-length function to determine the longest length that can be used for the copy command. One efficient implementation of match searching in the previous history is to follow a hash chain, which is built on a hash map of three-byte strings.
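The hash-chain search described above can be sketched in Python as follows. This is a simplified illustration and not the implementation of any particular compressor: the three-byte string itself serves as the map key instead of a computed hash value, and the names find_matches, match_length, replay and the MAX_CHAIN limit are hypothetical. The 258-byte maximum length and 32768-byte window are the limits defined for DEFLATE in RFC 1951.

```python
from collections import defaultdict

MIN_MATCH = 3     # key length in bytes, per the three-byte strings in the text
MAX_MATCH = 258   # maximum DEFLATE copy length
WINDOW = 32768    # maximum DEFLATE backward distance
MAX_CHAIN = 32    # hypothetical cap on candidates examined per position

def match_length(data, earlier, current, limit):
    """Compare-length function: count equal bytes, up to limit."""
    n = 0
    while n < limit and data[earlier + n] == data[current + n]:
        n += 1
    return n

def find_matches(data):
    """Return a list of ('literal', byte) and ('copy', distance, length)."""
    chains = defaultdict(list)  # three-byte key -> earlier positions, oldest first
    commands = []
    pos = 0
    while pos < len(data):
        best_len, best_dist = 0, 0
        if pos + MIN_MATCH <= len(data):
            key = data[pos:pos + MIN_MATCH]
            limit = min(MAX_MATCH, len(data) - pos)
            # Walk the chain from the most recent candidate backwards.
            for cand in reversed(chains[key][-MAX_CHAIN:]):
                if pos - cand > WINDOW:
                    break
                length = match_length(data, cand, pos, limit)
                if length > best_len:
                    best_len, best_dist = length, pos - cand
        if best_len >= MIN_MATCH:
            commands.append(("copy", best_dist, best_len))
            step = best_len
        else:
            commands.append(("literal", data[pos]))
            step = 1
        # Record every position just consumed so later data can refer to it.
        for p in range(pos, pos + step):
            if p + MIN_MATCH <= len(data):
                chains[data[p:p + MIN_MATCH]].append(p)
        pos += step
    return commands

def replay(commands):
    """Rebuild the original bytes from the command list."""
    out = bytearray()
    for cmd in commands:
        if cmd[0] == "literal":
            out.append(cmd[1])
        else:
            _, dist, length = cmd
            for _ in range(length):
                out.append(out[-dist])  # byte-wise copy permits overlap
    return bytes(out)

sample = b"abcabcabcabcXYZabc"
sample_commands = find_matches(sample)
print(sample_commands)
```

A hardware implementation would typically bound the chain walk and the compare length to meet timing, which is the role the hypothetical MAX_CHAIN plays in this sketch.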
Among the many lossless compression algorithms, DEFLATE compression achieves a good trade-off between hardware complexity and compression rate. For example, in GZIP, the data is hashed and compared to generate a copy or literal command per the Lempel-Ziv algorithm or a comparable algorithm. Once the statistics for the data to be compressed are gathered, the commands are Huffman encoded and the resulting compressed data is sent out.
However, some host data is uncompressible by nature. For instance, video and audio data are often already compressed using lossy compression algorithms, encrypted data is not compressible, and data that has already been compressed is barely compressible, if at all.
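The effect can be observed with a short Python experiment using the standard zlib module. In this sketch, random bytes from os.urandom stand in for encrypted data (an assumption made for illustration), and the word list, seed, and helper name ratio are hypothetical.

```python
import os
import random
import zlib

# Build compressible sample text from a small, hypothetical vocabulary.
rng = random.Random(0)
words = [b"data", b"stream", b"hash", b"copy",
         b"literal", b"block", b"huffman", b"match"]
text = b" ".join(rng.choice(words) for _ in range(4000))

random_like = os.urandom(len(text))    # stand-in for encrypted data
already_compressed = zlib.compress(text)

def ratio(data):
    """Compressed size divided by input size; values near 1.0 mean no gain."""
    return len(zlib.compress(data)) / len(data)

for name, data in [("text", text),
                   ("random", random_like),
                   ("already compressed", already_compressed)]:
    print(f"{name}: {ratio(data):.3f}")
```

For the random and already-compressed inputs the ratio stays close to 1.0, so the effort spent on hashing, matching, and Huffman encoding yields essentially no reduction in size.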
The methods disclosed herein provide an improved approach for compressing data, such as using the DEFLATE algorithm, by detecting uncompressible data based on attributes of the data itself.
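This passage does not state which attributes of the data are examined. Purely as a hypothetical illustration of the general idea, and not as the method disclosed herein, the following Python sketch estimates the Shannon entropy of the byte histogram of a block and flags the block as likely uncompressible when the estimate approaches 8 bits per byte. The function names and the 7.5-bit threshold are assumptions made for this example.

```python
import math
import os
from collections import Counter

def byte_entropy(data):
    """Shannon entropy of the byte histogram, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_uncompressible(data, threshold=7.5):
    """Hypothetical heuristic: high byte entropy suggests little to gain."""
    return byte_entropy(data) >= threshold

plain = b"The methods disclosed herein provide an improved approach. " * 100
noise = os.urandom(65536)  # stand-in for encrypted or compressed data

print(looks_uncompressible(plain), looks_uncompressible(noise))
```

A byte histogram ignores ordering, so it cannot see repetition across longer strings; it is shown here only because it is cheap to compute before any matching or encoding is attempted.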