A Lempel-Ziv compression technique searches for recurring data patterns in a stream of bytes. However, attempting a match at every byte position of the stream is slow. A conventional approach to improve the compression throughput uses chains of hash values. Each hash chain links the positions of sequences that share the same hash value, so the compression technique examines only those positions when looking for potential matches. A symbol run present in the stream, in which every position produces the same hash value, generates a long hash chain that slows the compression.
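The cost of a symbol run can be illustrated with a minimal sketch of a chain-based match search. The sketch is an assumption-laden illustration rather than any particular compressor: the names `MIN_MATCH`, `index_prefix` and `longest_match` are hypothetical, the first three bytes of a sequence stand in for its hash value, and each chain is held as a simple list instead of linked pointers.

```python
from collections import defaultdict

MIN_MATCH = 3  # assumed minimum match length; the 3-byte prefix stands in for a hash value


def index_prefix(data: bytes, end: int) -> dict:
    """Record every position before `end` under the key of its first MIN_MATCH bytes."""
    chains = defaultdict(list)
    for i in range(min(end, len(data) - MIN_MATCH + 1)):
        chains[data[i:i + MIN_MATCH]].append(i)
    return chains


def longest_match(data: bytes, pos: int, chains: dict) -> tuple:
    """Return (best_length, best_distance, candidates_examined) for data[pos:]."""
    key = data[pos:pos + MIN_MATCH]
    best_len, best_dist, examined = 0, 0, 0
    # Walk the earlier positions that share the same key, most recent first.
    for cand in reversed(chains.get(key, [])):
        examined += 1
        length = 0
        # Extend the match byte by byte; overlapping matches are allowed.
        while pos + length < len(data) and data[cand + length] == data[pos + length]:
            length += 1
        if length > best_len:
            best_len, best_dist = length, pos - cand
    return best_len, best_dist, examined


# A run of one symbol: every earlier position lands on the same chain.
run = b"a" * 64
print(longest_match(run, 60, index_prefix(run, 60)))      # (4, 1, 60)

# Varied data: the chain for the same search holds a single candidate.
mixed = bytes(range(60)) + bytes([0, 1, 2, 3])
print(longest_match(mixed, 60, index_prefix(mixed, 60)))  # (4, 60, 1)
```

Both searches find a match of the same length, but the search inside the run examines 60 candidates while the search in the varied data examines one.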
Referring to FIG. 1, a diagram of a portion of a conventional hash chain 10 is shown. The diagram illustrates a portion of the conventional hash chain 10 relative to sequential locations 0-C. Each of the locations 1-4, 8 and 9 contains a given hash value (white). Locations 0, 5-7 and A-C contain different hash values (shaded). The conventional hash chain 10 is created by setting a pointer in each of the given hash value locations 1-4, 8 and 9 to the nearest previous given hash value location. Within a byte run, consecutive locations produce the same hash value, so each location points to the location immediately before it. Therefore, long byte runs create long hash chains having many pointers for the compression technique to consider.
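The pointer arrangement described for FIG. 1 can be reproduced with a short sketch. The function names `build_chain_pointers` and `chain_from` are hypothetical, and the labels used for the shaded locations are placeholders that are assumed to be distinct from one another, since the figure only states that they differ from the given hash value.

```python
def build_chain_pointers(hash_values):
    """For each location, point to the nearest previous location with the same hash value."""
    last_seen = {}  # hash value -> most recent location holding it
    prev = []       # prev[i] is the earlier location, or None at the end of a chain
    for loc, h in enumerate(hash_values):
        prev.append(last_seen.get(h))
        last_seen[h] = loc
    return prev


def chain_from(prev, loc):
    """Follow the pointers back from `loc` and list every location visited."""
    visited = []
    while loc is not None:
        visited.append(loc)
        loc = prev[loc]
    return visited


# Locations 0x0-0xC as in FIG. 1: "G" marks the given hash value; the other
# labels are placeholders, assumed distinct because the figure does not specify them.
fig1 = ["h0", "G", "G", "G", "G", "h5", "h6", "h7", "G", "G", "hA", "hB", "hC"]
prev = build_chain_pointers(fig1)
print(chain_from(prev, 9))  # [9, 8, 4, 3, 2, 1]
```

Starting from location 9, the walk visits every given hash value location, including all four consecutive locations 1-4, which is the behavior that grows with the length of a byte run.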
It would be desirable to implement a method to shorten hash chains in Lempel-Ziv compression of data with repetitive symbols.