I. Field of the Disclosure
The technology of the disclosure relates generally to data compression techniques, and more particularly to Lempel-Ziv (LZ)-based data compression techniques.
II. Background
Data compression is used in processor-based systems to reduce data size. Reducing data size can reduce memory needed to store a given amount of data. Reducing memory size can thus also lower the area needed for memory circuits in an integrated circuit and lower costs of the integrated circuit as a result. Reducing memory size can also reduce static energy by reducing on-chip retention power needed to retain data in memory. Data compression can also reduce the amount of data transfer bandwidth needed for reading data from and writing data to memory, thus increasing performance and reducing dynamic energy expended in transferring data.
One technique used to compress data is Lempel-Ziv (LZ)-based compression. LZ-based compression is a lossless data compression technique. FIG. 1 illustrates an example of an input data block 10 before compression, and after being compressed into an output data block 12 using an LZ-based compression technique. As shown in FIG. 1, the LZ-based compression captures a repeated data pattern 14R (e.g., AABC) of data 14 in the input data block 10 to be compressed. The captured repeated data pattern 14R is replaced in the output data block 12 with a smaller length and distance block 16 (e.g., [4, 12]) that indicates the length of the repeated data pattern 14R and the distance from the current point to where the repeated data pattern 14R appeared elsewhere in the input data block 10. In this manner, when the output data block 12 is decompressed, the length and distance block 16 can be replaced with the true pattern (e.g., AABC) by copying data of the indicated length from the indicated distance, to recreate the uncompressed input data block 10.
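By way of illustration only, the length and distance replacement described above can be sketched as follows. This is a minimal, unoptimized LZ77-style example and not the technique of any particular embodiment. The function names lz_compress and lz_decompress, the minimum match length of three symbols, the window size, and the sample data are assumptions chosen for the example.

```python
def lz_compress(data, min_match=3, window=4096):
    """Greedy LZ77-style encoder (illustrative sketch).

    Emits a list of tokens: a single-character string for a literal,
    or a (length, distance) tuple for a repeated pattern.
    """
    out = []
    i = 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search earlier data (within the window) for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            out.append((best_len, best_dist))  # length and distance block
            i += best_len
        else:
            out.append(data[i])  # literal symbol
            i += 1
    return out


def lz_decompress(tokens):
    """Rebuild the original data by copying `length` symbols from `distance` back."""
    buf = []
    for tok in tokens:
        if isinstance(tok, tuple):
            length, distance = tok
            for _ in range(length):
                buf.append(buf[-distance])
        else:
            buf.append(tok)
    return "".join(buf)


# Hypothetical input: the pattern AABC repeats 12 positions after its first use.
data = "AABC" + "DEFGHIJK" + "AABC"
tokens = lz_compress(data)
restored = lz_decompress(tokens)
```

In this sketch, the second occurrence of AABC is emitted as the tuple (4, 12), matching the [4, 12] form of the length and distance block 16 discussed above, and decompression restores the input exactly.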
Huffman coding can be applied on top of LZ-based compression to achieve an improved compression ratio. In Huffman coding, the length and distance blocks in an LZ-based compression output data block can be reduced in size by replacing the length and distance blocks with a specific prefix code that is not otherwise used in the original data to reduce the encoding of distances. Data patterns that are more frequently repeated are assigned the smallest prefix codes, with less frequently repeated data patterns being assigned larger prefix codes, and so on, to reduce the size of the prefix codes coded in the output data. For example, a Huffman prefix code may range from a value of ‘1’, which can be stored in one bit, to a very large number (e.g., 4096) that requires multiple bits to be stored. In Huffman coding, a Huffman tree is generated from the exact frequencies of the distances in the length and distance blocks in the compressed data to store the correlated assigned prefix codes used in encoding the distances. The Huffman tree may be large in size if the number of unique distances present in the LZ-based length and distance blocks is large. When decompressing data that was compressed and Huffman-coded, the Huffman tree is consulted to decode the prefix codes encoded in the length and distance blocks back to distances to provide the original LZ-based length and distance blocks. The LZ-based length and distance blocks are then decoded during decompression to recreate the original uncompressed data in the output data.
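By way of illustration only, the assignment of prefix codes to distances according to their frequencies can be sketched as follows. This is a generic Huffman construction and not the coding of any particular embodiment. The function names huffman_codes and decode_bits, the bit-string representation of the codes, and the sample distances are assumptions chosen for the example.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(symbols):
    """Build a prefix code (symbol -> bit string) from exact symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    tiebreak = count()  # unique counter so the heap never compares dicts
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two least frequent subtrees, extending their codes by one bit.
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]


def decode_bits(bits, codes):
    """Map a bit string back to symbols by consulting the code table."""
    inverse = {c: s for s, c in codes.items()}
    out, cur = [], ""
    for b in bits:
        cur += b
        if cur in inverse:
            out.append(inverse[cur])
            cur = ""
    return out


# Hypothetical distances taken from length and distance blocks.
distances = [12] * 5 + [40] * 2 + [7, 300]
codes = huffman_codes(distances)
bits = "".join(codes[d] for d in distances)
decoded = decode_bits(bits, codes)
```

In this sketch, the most frequent distance (12) receives a one-bit code while the rarest distances (7 and 300) receive three-bit codes. The code table grows with the number of unique distances, which reflects the tree size concern noted above.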
While Huffman coding can result in further reduction in data size of LZ-based compressed data, Huffman coding can be costly in terms of power consumption and latency. Additional latency is involved in assigning and storing prefix codes for length and distance blocks during data compression in the Huffman tree, and retrieving length and distance blocks from prefix codes from the Huffman tree during data decompression.