Server computers are used to run instances of host applications such as databases, file servers and block servers, for example and without limitation. Host application data may be maintained for the server computers by a data storage system such as a storage array. The storage array may include a plurality of interconnected computing nodes that manage access to a plurality of drives such as HDDs (Hard Disk Drives) and SSDs (Solid State Drives) on which the host application data is stored. The host applications access the host application data by sending IOs (input-output commands such as reads and writes) to the storage array. A single storage array may maintain host application data for multiple different host applications running on one or more clusters of servers.
Some host application data may be compressed by the computing nodes before being stored on the managed drives. Compression is a way of encoding information so that it requires fewer bytes of storage space. Typical lossless compression algorithms identify and reduce statistical redundancy in order to encode without information loss. Known lossless data compression algorithms include but are not limited to RLE (run-length encoding), Huffman coding, PPM (prediction by partial matching), and LZxx (various Lempel-Ziv techniques). A data set is typically processed serially in order to perform compression. For example, some compression algorithms recognize recurring patterns in a sequence by using a sliding window to compare a pattern currently in the window with previously found patterns. Such reliance on prior knowledge, i.e. the previously found patterns, tends to hinder implementation of parallelized compression of a data sequence. For example, instances of a lossless data compression algorithm running on parallel processor cores cannot independently process different portions of a sequence in order to compress the entire sequence as a single compressed data set, because each instance lacks the patterns found in the portions processed by the other instances. The sequence can be separated into multiple sub-sequences that are each independently compressed, but the overall compression ratio of the sequence may decrease relative to serial processing because redundancy that spans sub-sequence boundaries is not exploited.
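The trade-off described above can be illustrated with a minimal sketch. It is not taken from any storage array implementation; it assumes Python's standard zlib module (a DEFLATE implementation, i.e. one LZ-family technique combined with Huffman coding) as a stand-in for the compression algorithm and a thread pool as a stand-in for parallel processor cores. The function names, the block size, and the number of sub-sequences are hypothetical and chosen only for illustration. The sample sequence is built so that its only redundancy spans sub-sequence boundaries, which is the case in which independent compression loses the most.

```python
import random
import zlib
from concurrent.futures import ThreadPoolExecutor


def make_sample(block_size=8192, repeats=4, seed=0):
    """Build a sequence made of one pseudo-random block repeated several times.

    Each block is incompressible on its own, so the only redundancy
    is the repetition of the block across the sequence.
    """
    rng = random.Random(seed)
    block = bytes(rng.getrandbits(8) for _ in range(block_size))
    return block * repeats


def compress_serial(data):
    """Compress the entire sequence as a single compressed data set."""
    return zlib.compress(data, 9)


def split(data, parts):
    """Separate the sequence into roughly equal sub-sequences."""
    size = -(-len(data) // parts)  # ceiling division
    return [data[i:i + size] for i in range(0, len(data), size)]


def compress_chunks(data, parts):
    """Compress each sub-sequence independently, one worker per sub-sequence.

    No worker sees the patterns found by any other worker.
    """
    with ThreadPoolExecutor(max_workers=parts) as pool:
        return list(pool.map(lambda c: zlib.compress(c, 9), split(data, parts)))


def decompress_chunks(chunks):
    """Recover the original sequence from independently compressed chunks."""
    return b"".join(zlib.decompress(c) for c in chunks)


if __name__ == "__main__":
    data = make_sample()
    serial = compress_serial(data)
    chunks = compress_chunks(data, 4)
    print("original bytes:          ", len(data))
    print("serial compressed bytes: ", len(serial))
    print("chunked compressed bytes:", sum(len(c) for c in chunks))
```

With this sample, the serial pass can encode the later copies of the block as references to the first copy, whereas each independently compressed sub-sequence contains only one copy and therefore has nothing to refer back to. Real host application data would typically show a smaller gap than this deliberately extreme case.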