The new generation of storage arrays is moving from Hard Disk Drive (HDD) storage media to Solid State Drive (SSD) storage media and nonvolatile memory (NVM), such as Nonvolatile Memory Express (NVMe) flash backend devices. Because SSD devices can be as much as 20 times more expensive than HDD storage media, there is a market incentive to minimize the footprint of data on the flash media, that is, on the NVM devices.
A reduction in the size of the dataset that is ultimately stored may be achieved by data reduction methods such as data compression or data deduplication. Data compression and data deduplication are fundamentally different processes and, as such, are generally best suited to different respective types of workloads. In addition, the extent to which a dataset footprint can be reduced by one or the other of these techniques depends at least in part on the structure of the dataset.
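By way of illustration only, the dependence of each technique on the structure of the dataset can be sketched as follows. The sketch is not taken from any particular storage array: the 4096-byte fixed block size, the use of zlib for compression, the use of SHA-256 block fingerprints for deduplication, and all function names are assumptions chosen for the example, and deduplication metadata overhead is ignored.

```python
import hashlib
import random
import zlib

BLOCK_SIZE = 4096  # assumed fixed block size, in bytes (hypothetical)


def compressed_size(data: bytes) -> int:
    """Size of the dataset after zlib (DEFLATE) compression."""
    return len(zlib.compress(data, 6))


def deduplicated_size(data: bytes, block_size: int = BLOCK_SIZE) -> int:
    """Size after keeping one copy of each distinct fixed-size block.

    Blocks are identified by their SHA-256 digest; index metadata is ignored.
    """
    unique = {}
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        unique.setdefault(hashlib.sha256(block).digest(), len(block))
    return sum(unique.values())


def make_duplicate_heavy(num_blocks: int = 256, distinct: int = 16,
                         seed: int = 0) -> bytes:
    """Random (incompressible) blocks, each repeated many times.

    Copies of a block lie distinct * BLOCK_SIZE bytes apart, which is
    beyond the 32 KiB DEFLATE window, so zlib cannot exploit them.
    """
    rng = random.Random(seed)
    pool = [bytes(rng.getrandbits(8) for _ in range(BLOCK_SIZE))
            for _ in range(distinct)]
    return b"".join(pool[i % distinct] for i in range(num_blocks))


def make_unique_compressible(num_blocks: int = 256) -> bytes:
    """Blocks that are all distinct but internally highly repetitive."""
    blocks = []
    for i in range(num_blocks):
        pattern = b"block-%06d;" % i
        repeats = BLOCK_SIZE // len(pattern) + 1
        blocks.append((pattern * repeats)[:BLOCK_SIZE])
    return b"".join(blocks)


if __name__ == "__main__":
    datasets = (("duplicate-heavy", make_duplicate_heavy()),
                ("unique-compressible", make_unique_compressible()))
    for name, data in datasets:
        print(name, "raw:", len(data),
              "compressed:", compressed_size(data),
              "deduplicated:", deduplicated_size(data))
```

Under these assumptions, the first dataset is reduced substantially by deduplication and hardly at all by compression, while the second is reduced substantially by compression and not at all by deduplication.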
Moreover, while some reduction of the data footprint may be achieved with compression and deduplication, these techniques are uncorrelated with each other, and a combined implementation would likely result in a smaller data footprint reduction and/or in fewer storage input/output operations per second (IOPS) and higher latency or poorer rate of transfer (RT) performance. This is also a problem for customers, who expect much higher performance from their much more expensive flash-based storage arrays. As such, there is a conflict between the data reduction expected of flash-based storage arrays and the performance expected of them, given the higher IOPS performance of flash as compared to HDD moving media.
Thus, conventional dataset footprint reduction scenarios typically involve either data compression or data deduplication, but not both. As such, conventional approaches to reducing the size of a dataset footprint have proven inadequate at least because they fail to take advantage of the relative strengths of both techniques. Moreover, even if the data compression and data deduplication techniques were to be combined somehow, various problems would be likely to occur as a result.