Compression programs routinely restrict the data that is compressed together to segments called windows. The process of doing this is called windowing. String-based compression techniques such as Lempel-Ziv or Burrows-Wheeler often use fixed-size windows suitable for in-core processing. Entropy-encoding techniques such as Huffman or arithmetic compression normally do not require windowing except to bound code lengths or to avoid reading large files multiple times. However, these compressors can benefit from windowing when the statistical model changes from one region of a file to another. For example, consider a data file composed of four letters in which two letters appear exclusively in the first half of the file while the other two letters appear exclusively in the second half. If all letters appear with the same frequency, a Huffman compressor would normally encode each letter with two bits. On the other hand, each letter can be encoded with a single bit if each half of the file is treated separately. Adaptive techniques such as adaptive Huffman or splay-tree coding do encode data with shifting models, but they often produce inferior codes and incur larger costs in both compression and decompression times than static Huffman.
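The four-letter example above can be checked numerically. The following Python sketch is illustrative only and is not the method referred to in this document: it builds a static Huffman code for a small sample string, first over the whole string and then over each half separately, and compares the encoded sizes. The helper names `huffman_code_lengths` and `encoded_bits` and the sample data are assumptions introduced for illustration, and the cost of storing the code tables is ignored.

```python
import heapq
from collections import Counter


def huffman_code_lengths(data):
    """Return a dict mapping each symbol in data to its Huffman code length."""
    freq = Counter(data)
    if len(freq) == 1:
        # A lone symbol still needs one bit per occurrence.
        return {symbol: 1 for symbol in freq}
    # Heap entries: (weight, unique tie-breaker, symbols in this subtree).
    heap = [(count, i, [symbol]) for i, (symbol, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    lengths = {symbol: 0 for symbol in freq}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, syms1 = heapq.heappop(heap)
        w2, _, syms2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for symbol in syms1 + syms2:
            lengths[symbol] += 1
        heapq.heappush(heap, (w1 + w2, next_id, syms1 + syms2))
        next_id += 1
    return lengths


def encoded_bits(data):
    """Total payload bits when data is coded with its own static Huffman code."""
    lengths = huffman_code_lengths(data)
    return sum(lengths[symbol] for symbol in data)


# Sample file: 'a' and 'b' only in the first half, 'c' and 'd' only in the second.
first_half = "ab" * 8
second_half = "cd" * 8
whole_file = first_half + second_half

whole_bits = encoded_bits(whole_file)
split_bits = encoded_bits(first_half) + encoded_bits(second_half)

print("one window: ", whole_bits, "bits")
print("two windows:", split_bits, "bits")
```

With one window, each of the 32 letters costs two bits; with two windows, each letter costs one bit, so the payload is halved. In practice the saving must be weighed against the overhead of storing one code table per window, which this sketch does not model.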
Therefore, a need exists for a method for efficient window partition identification in entropy encoding, e.g., with performance much better than O(s^3) time.