Data optimization is the act of reducing the amount of data that is stored on a storage device (e.g., a disk) or transmitted across a network without compromising the fidelity and integrity of the original data. Data optimization often involves a combination of techniques for eliminating redundancy within and between persistently stored files. Data de-duplication (dedup) is one such technique, in which identical regions (a.k.a. chunks) of data in one or more files are stored as a single region. Compression is another such technique, in which data is encoded using fewer bits (or other information-bearing units) than the original data.
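As a minimal sketch of the two techniques just described, the following Python fragment splits data into fixed-size chunks, stores each distinct chunk only once, and compresses each stored chunk. The function name `dedup_and_compress`, the fixed 4096-byte chunk size, and the use of SHA-256 digests and zlib are illustrative assumptions, not details drawn from the description above.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed fixed chunk size; other chunking schemes are possible


def dedup_and_compress(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (store, recipe) for the given data.

    store  maps a chunk digest to the compressed chunk (each distinct chunk once).
    recipe lists the digests in the order needed to rebuild the original data.
    """
    store = {}
    recipe = []
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest not in store:
            # First occurrence of this chunk: compress it and keep one copy.
            store[digest] = zlib.compress(chunk)
        # Every occurrence, duplicate or not, is recorded by reference only.
        recipe.append(digest)
    return store, recipe
```

For instance, an input made of three identical 4096-byte chunks and one different chunk yields a four-entry recipe that refers to only two stored chunks.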
Once data is optimized, the data may be accessed by reversing the effects of the optimization (i.e., de-optimizing the optimized data), for example by performing an inverse dedup operation and/or a decompression operation with respect to the optimized data. However, de-optimization delays access to the data, and the delay grows with the amount of data to be de-optimized. Moreover, such latency may occur each time the data is accessed unless a de-optimized version of the data is stored for access on a storage device. Furthermore, de-optimization often consumes substantial resources (e.g., memory, central processing unit (CPU), disk I/O) of a device, which may negatively affect a main workload that is running on the device. Accordingly, frequent de-optimization may result in relatively inefficient utilization of the device's resources.
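The de-optimization described above can be sketched as follows, under stated assumptions. The helpers `optimize` and `deoptimize` and the class `CachedReader` are hypothetical names introduced for illustration, and zlib stands in for whatever compression scheme is actually used. The sketch shows that reversing the optimization restores the original bytes exactly, and that keeping a de-optimized copy avoids repeating the work on every access.

```python
import zlib


def optimize(chunks):
    """Hypothetical optimizer: keep each distinct chunk once, compressed."""
    unique = []    # compressed distinct chunks
    index_of = {}  # raw chunk -> position in `unique`
    recipe = []    # positions in `unique`, in original chunk order
    for chunk in chunks:
        if chunk not in index_of:
            index_of[chunk] = len(unique)
            unique.append(zlib.compress(chunk))
        recipe.append(index_of[chunk])
    return unique, recipe


def deoptimize(unique, recipe):
    """Inverse operation: decompress chunks and re-expand duplicates."""
    return b"".join(zlib.decompress(unique[i]) for i in recipe)


class CachedReader:
    """Keeps a de-optimized copy so that repeated reads skip de-optimization."""

    def __init__(self, unique, recipe):
        self._unique = unique
        self._recipe = recipe
        self._cache = None
        self.deoptimize_calls = 0  # counts how often the costly path runs

    def read(self):
        if self._cache is None:
            self._cache = deoptimize(self._unique, self._recipe)
            self.deoptimize_calls += 1
        return self._cache
```

The trade-off is the one the paragraph names: the cached copy removes the repeated latency and resource cost but occupies the storage that optimization was meant to save.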
For example, if data in a file is fully optimized, the latency that is associated with accessing the data may unduly degrade the performance of a device that accesses the data and/or a workload that is running on the device, especially if the data is frequently accessed. In another example, it may not be desirable to optimize some regions of a file and/or some types of data. However, the various regions of the file may not be visible to a device that attempts to optimize the file. Accordingly, the device may have no way to know whether the regions of the file are optimizable.