Software data decompression is an important software operation used in many computing applications, including both server and client applications. Many common lossless compression formats are based on the LZ77 compression algorithm. Data compressed using LZ77-based algorithms typically include a stream of symbols. Each symbol may include literal data that is to be copied to the output or a reference to repeat data that has already been decompressed. Compared to other lossless compression algorithms such as DEFLATE, LZ77-based algorithms typically achieve lower compression levels but provider higher performance, particularly for decompression. One typical LZ77-based format is “Snappy,” developed by Google Inc. and used by the Apache Hadoop™ project and others. Other LZ77-based formats include LZO and LZF.
Typical implementations of decompression algorithms include numerous conditional branches used to categorize input symbols. The outcome of those conditional branches is dependent on the input data. Branch prediction hardware for typical processors may have difficulty correctly predicting the outcome of those conditional branches. Branch misprediction penalties may reduce achievable decompression performance.