This disclosure relates to data processing and storage, and more specifically, to implementing a non-binary context mixing compressor/decompressor in a data storage system, such as a flash memory system.
NAND flash memory is an electrically programmable and erasable non-volatile memory technology that stores one or more bits of data per memory cell as a charge on the floating gate of a transistor or a similar charge trap structure. In a typical implementation, a NAND flash memory array is organized in blocks (also referred to as “erase blocks”) of physical memory, each of which includes multiple physical pages each in turn containing a multiplicity of memory cells. By virtue of the arrangement of the word and bit lines utilized to access memory cells, flash memory arrays can generally be programmed on a page basis, but are erased on a block basis.
As is known in the art, blocks of NAND flash memory must be erased prior to being programmed with new data. A block of NAND flash memory cells is erased by applying a high positive erase voltage pulse to the p-well bulk area of the selected block and by biasing to ground all of the word lines of the memory cells to be erased. Application of the erase pulse promotes tunneling of electrons off of the floating gates of the memory cells biased to ground to give them a net positive charge and thus transition the voltage thresholds of the memory cells toward the erased state. Each erase pulse is generally followed by an erase verify operation that reads the erase block to determine whether the erase operation was successful, for example, by verifying that less than a threshold number of memory cells in the erase block have been unsuccessfully erased. In general, erase pulses continue to be applied to the erase block until the erase verify operation succeeds or until a predetermined number of erase pulses have been used (i.e., the erase pulse budget is exhausted).
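The erase and erase verify sequence described above can be pictured as a bounded retry loop. The following is only a minimal sketch under stated assumptions: the names `erase_block`, `apply_erase_pulse`, `count_unerased_cells`, `max_failed_cells`, and `pulse_budget` are hypothetical, and the `SimulatedBlock` class is a toy model in which each pulse erases half of the remaining cells, not a description of actual device behavior.

```python
from typing import Callable


def erase_block(apply_erase_pulse: Callable[[], None],
                count_unerased_cells: Callable[[], int],
                max_failed_cells: int,
                pulse_budget: int) -> bool:
    """Apply erase pulses until erase verify passes or the budget runs out.

    Returns True if fewer than max_failed_cells cells remain unerased
    after some pulse, and False if the erase pulse budget is exhausted.
    """
    for _ in range(pulse_budget):
        apply_erase_pulse()
        # Erase verify: read the block and count cells not yet erased.
        if count_unerased_cells() < max_failed_cells:
            return True
    return False


class SimulatedBlock:
    """Toy model (assumption): each pulse erases half of the remaining cells."""

    def __init__(self, programmed_cells: int) -> None:
        self.programmed_cells = programmed_cells
        self.pulses = 0

    def pulse(self) -> None:
        self.programmed_cells //= 2
        self.pulses += 1

    def unerased(self) -> int:
        return self.programmed_cells


block = SimulatedBlock(programmed_cells=1000)
succeeded = erase_block(block.pulse, block.unerased,
                        max_failed_cells=10, pulse_budget=8)
```

In this sketch the callbacks stand in for the hardware operations, so the same loop structure can be exercised against any simulated or real block interface.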
A NAND flash memory cell can be programmed by applying a positive high program voltage to the word line of the memory cell to be programmed and by applying an intermediate pass voltage to the memory cells in the same string in which programming is to be inhibited. Application of the program voltage causes tunneling of electrons onto the floating gate to change its state from an initial erased state to a programmed state having a net negative charge. Following programming, the programmed page is typically read in a read verify operation to ensure that the program operation was successful, for example, by verifying that less than a threshold number of memory cells in the programmed page contain bit errors. In general, program and read verify operations are applied to the page until the read verify operation succeeds or until a predetermined number of programming pulses have been used (i.e., the program pulse budget is exhausted).
PAQ is a series of lossless data compression archivers that, through collaborative development, have topped rankings on several benchmarks measuring compression ratio (CR). In general, the various PAQ versions implement a context mixing algorithm. Context mixing is related to prediction by partial matching (PPM) in that the compressor/decompressor is divided into a predictor and an arithmetic encoder/decoder, but differs in that the next-symbol prediction is computed using a weighted combination of probability estimates from a large number of models conditioned on different contexts. Unlike PPM, a context in PAQ does not need to be contiguous.
In general, all PAQ versions, while differing in the details of the models and in how the predictions are combined and post-processed, predict and compress one bit at a time. Once the next-bit probability is determined, the next bit is encoded by arithmetic coding. In PAQ1 through PAQ3, each prediction is represented as a pair of bit counts that are combined by weighted summation, with greater weights given to longer contexts. In PAQ4 through PAQ6, the predictions are combined as in PAQ1 through PAQ3; however, the weights assigned to each model are adjusted to favor more accurate models. In PAQ7 and later PAQ versions, each model outputs a probability (rather than a pair of counts), and the model probabilities are combined using a neural network mixer.
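One way to picture the mixing of per-model bit probabilities with weights that adapt toward the more accurate models is a single-layer logistic mixer. The following is only a minimal sketch under stated assumptions: the class name `LogisticMixer`, the learning rate, the zero initial weights, and the probability clamping bound are illustrative choices, and floating-point arithmetic is used for readability. It is not the code of any particular PAQ version.

```python
import math
from typing import List


def stretch(p: float) -> float:
    """Map a probability in (0, 1) to the logistic (log-odds) domain."""
    return math.log(p / (1.0 - p))


def squash(x: float) -> float:
    """Inverse of stretch: map a log-odds value back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


class LogisticMixer:
    """Combine per-model probabilities that the next bit is 1 (sketch)."""

    def __init__(self, n_models: int, learning_rate: float = 0.02) -> None:
        self.weights = [0.0] * n_models      # assumption: start at zero
        self.learning_rate = learning_rate   # assumption: illustrative value
        self._inputs = [0.0] * n_models
        self._p = 0.5

    def predict(self, model_probs: List[float]) -> float:
        # Clamp inputs so stretch() never sees exactly 0 or 1.
        self._inputs = [stretch(min(max(p, 1e-6), 1.0 - 1e-6))
                        for p in model_probs]
        dot = sum(w * x for w, x in zip(self.weights, self._inputs))
        self._p = squash(dot)
        return self._p

    def update(self, bit: int) -> None:
        # Gradient step on coding cost: weights of models whose prediction
        # agreed with the observed bit are increased.
        err = bit - self._p
        self.weights = [w + self.learning_rate * err * x
                        for w, x in zip(self.weights, self._inputs)]


# Usage: model 0 is informative, model 1 always outputs 0.5 (no information).
mixer = LogisticMixer(n_models=2)
for i in range(200):
    bit = i % 2
    mixer.predict([0.9 if bit else 0.1, 0.5])
    mixer.update(bit)
```

After training on this toy input, the weight of the informative model grows while the weight of the uninformative model stays at zero, which is the sense in which mixing favors more accurate models. The mixed probability would then be passed to the arithmetic coder to encode the bit.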
Unfortunately, while context mixing compression algorithms top almost all known compression benchmarks, they tend to be very slow due to the large number of context models implemented, the complexity of neural computation, and their binary nature (for example, the PAQ8l algorithm has a throughput of around 20 kB/s).