Flash memory arrays are replacing disk storage devices in many applications because they respond more rapidly to client requests for reading and writing data and can perform a much higher number of input/output (I/O) operations per second. However, at present, the hardware cost of flash memory is greater than that of disk, and flash memory is perceived to have a wear-out problem, at least if not properly managed.
The effective storage capacity of the flash memory system may be increased substantially by the use of deduplication and data compression techniques. However, each of these techniques consumes computational resources and may increase the latency of the storage system in ingesting and acknowledging write operations and in responding to read operations. In addition, such techniques may need to be harmonized with other data center operations such as replication, snapshots, cloning, and the like, including reconfiguration of the storage space allocated to the user based on changing workload characteristics.
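The capacity gain described above can be illustrated with a minimal sketch. It is not the method of any particular storage system: the class name `DedupStore`, the fixed 4096-byte block size, the SHA-256 fingerprint, and the use of zlib compression are all assumptions chosen for illustration. Each unique block is stored once in compressed form, and the ratio of logical bytes written to physical bytes stored approximates the increase in effective capacity.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096  # assumed block size in bytes, chosen for illustration


class DedupStore:
    """Toy content-addressed block store (hypothetical, for illustration only)."""

    def __init__(self):
        self.blocks = {}        # fingerprint -> compressed block payload
        self.logical_bytes = 0  # total bytes written by clients

    def write(self, data: bytes) -> list:
        """Split data into fixed-size blocks, store each unique block once,
        and return the list of fingerprints needed to read the data back."""
        fingerprints = []
        for offset in range(0, len(data), BLOCK_SIZE):
            block = data[offset:offset + BLOCK_SIZE]
            fp = hashlib.sha256(block).hexdigest()
            if fp not in self.blocks:
                # Hashing and compression both cost CPU time on the write path.
                self.blocks[fp] = zlib.compress(block)
            fingerprints.append(fp)
            self.logical_bytes += len(block)
        return fingerprints

    def read(self, fingerprints) -> bytes:
        """Reassemble the data; decompression adds work to the read path."""
        return b"".join(zlib.decompress(self.blocks[fp]) for fp in fingerprints)

    def physical_bytes(self) -> int:
        """Bytes actually occupied by the compressed unique blocks."""
        return sum(len(payload) for payload in self.blocks.values())

    def reduction_ratio(self) -> float:
        """Logical bytes written divided by physical bytes stored."""
        return self.logical_bytes / self.physical_bytes()
```

Writing three identical blocks followed by one distinct block, for example, stores only two compressed payloads while the client sees four blocks of capacity consumed. The hashing and compression steps in `write` and the decompression in `read` are where the added latency mentioned above would arise.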
User data may be presented in block format or in file format, as each of these formats may be extant in the user environment, either to take advantage of particular user software programs or to support user applications where the data format is chosen for efficiency in processing or data handling by the user.
At present, deduplication is performed in-line, by post-processing, or during a data backup process. Typically, deduplication is performed by only one of these three approaches in a particular storage system, so that the choice of the process to be performed is deterministic. In a multi-user environment, where the processing workloads, input and output latencies, and other time-dependent attributes of the workloads vary, such a fixed choice may result in inefficient use of the storage system and its data management features. This may result in a variable user experience.
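The trade-off between two of these approaches can be sketched as follows. This is a simplified illustration under stated assumptions, not a description of any actual product: the class names `InlineStore` and `PostProcessStore`, the helper `fingerprint`, and the SHA-256 fingerprint are hypothetical. The in-line variant pays the fingerprinting cost before a write is acknowledged, while the post-processing variant acknowledges immediately and temporarily holds duplicate data until a later pass.

```python
import hashlib


def fingerprint(block: bytes) -> str:
    """Content fingerprint of a block (SHA-256 is an assumed choice)."""
    return hashlib.sha256(block).hexdigest()


class InlineStore:
    """Deduplicates on the write path, before the write is acknowledged."""

    def __init__(self):
        self.unique = {}  # fingerprint -> block
        self.refs = []    # logical block order, as fingerprints

    def write(self, block: bytes) -> None:
        fp = fingerprint(block)            # cost paid before the acknowledgment
        self.unique.setdefault(fp, block)  # keep only the first copy
        self.refs.append(fp)

    def stored_blocks(self) -> int:
        return len(self.unique)


class PostProcessStore:
    """Acknowledges writes immediately and deduplicates in a later pass."""

    def __init__(self):
        self.staged = []  # raw blocks awaiting deduplication
        self.unique = {}  # fingerprint -> block
        self.refs = []    # logical block order, as fingerprints

    def write(self, block: bytes) -> None:
        self.staged.append(block)  # no hashing on the write path

    def deduplicate(self) -> None:
        """Background pass: fingerprint staged blocks and drop duplicates."""
        for block in self.staged:
            fp = fingerprint(block)
            self.unique.setdefault(fp, block)
            self.refs.append(fp)
        self.staged.clear()

    def stored_blocks(self) -> int:
        # Staged blocks occupy space until the background pass runs.
        return len(self.staged) + len(self.unique)
```

With the same sequence of writes, both stores converge to the same set of unique blocks, but the post-processing store occupies more space until `deduplicate` runs, and the in-line store does more work per write. A system fixed to one of these behaviors cannot adapt when workload characteristics change.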