In a cache or storage system, a memory index may map a logical address to a physical location indicating where a block of data resides in the cache or storage system. This index may require a record (index entry) in memory for each block stored. When compression is used to reduce the storage space occupied by blocks of data, the number of blocks that can be stored increases, which in turn increases the memory required for indexing, because each cached block is referenced from the in-memory index. Using larger blocks reduces the memory requirement (fewer items to index), but if the client requests reads of smaller data units (i.e., much smaller than the indexed blocks), the entire large block must be read into memory and decompressed, which lowers I/Os per second (IOPS) and increases latency. As an example, consider indexing large blocks, such as 32 KB blocks, to reduce index memory requirements, while the client accesses random 4 KB sub-blocks. In such a scenario, the full 32 KB block needs to be read and decompressed for each random 4 KB read, thus reducing potential read performance.
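As an illustrative sketch of the index-memory tradeoff described above, the following Python snippet counts index entries for two candidate block sizes. All concrete values (1 TB capacity, a 32-byte index entry) are assumptions for illustration, not figures from the text:

```python
KB = 1024

def index_entries(capacity_bytes: int, block_size: int) -> int:
    """Number of in-memory index entries needed to cover the capacity."""
    return capacity_bytes // block_size

def index_memory(capacity_bytes: int, block_size: int,
                 entry_bytes: int = 32) -> int:
    """Approximate index memory, assuming a fixed cost per entry."""
    return index_entries(capacity_bytes, block_size) * entry_bytes

capacity = 1024 ** 4  # assume 1 TB of stored (compressed) data

small = index_memory(capacity, 4 * KB)    # index at 4 KB granularity
large = index_memory(capacity, 32 * KB)   # index at 32 KB granularity

# Indexing at 32 KB granularity needs 8x fewer entries (and bytes)
# than indexing at 4 KB granularity.
print(small // large)  # -> 8
```

The 8x reduction is simply the ratio of the two block sizes; whatever per-entry cost is assumed, larger indexed blocks shrink the index proportionally.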
One approach when using 32 KB blocks is to compress the entire 32 KB block (perhaps down to ~16 KB), insert the block into the cache or storage, and add an entry to the memory-resident index representing the cached/stored block. Due to the nature of how data is compressed, it is usually not possible to start reading in the middle of compressed data; in other words, bytes in the middle of a compressed block are not randomly accessible. For that reason, when a client attempts to read a 4 KB sub-block contained within an indexed and compressed 32 KB stored data block, the entire compressed 32 KB block (~16 KB after compression) needs to be read and decompressed in order to identify and return the requested 4 KB sub-block. A Solid State Drive (SSD) interface often supports reads at 4 KB granularity, so reading the compressed 32 KB block (~16 KB after compression) requires about four read operations. In general, when an indexed stored data block is larger than the size of a single read operation of the storage interface, multiple read operations are necessary (even if the requested data size could be returned in a single SSD read). Having to perform multiple reads (e.g., four reads of 4 KB each) to retrieve a much smaller requested amount of data (4 KB) results in longer latency and correspondingly fewer IOPS. A similar problem exists in hard drive systems if the compressed data is larger than a rotational track, which is on the order of kilobytes in size.
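A minimal Python sketch of this read amplification, under the assumptions above (32 KB indexed blocks, 4 KB SSD read granularity, zlib as a stand-in compressor; the helper name `read_subblock` is hypothetical):

```python
import math
import random
import zlib

KB = 1024
BLOCK_SIZE = 32 * KB  # indexed/compressed unit, per the example above
SSD_READ = 4 * KB     # assumed SSD read granularity

def read_subblock(compressed_block: bytes, offset: int, length: int):
    """Hypothetical helper: return a sub-block plus the SSD reads consumed.

    The whole compressed block must be fetched and decompressed, because
    bytes in the middle of a compressed stream are not randomly accessible.
    """
    ssd_reads = math.ceil(len(compressed_block) / SSD_READ)  # read it all
    block = zlib.decompress(compressed_block)                # decompress it all
    return block[offset:offset + length], ssd_reads

# Build a 32 KB block that compresses to roughly half its size:
# 16 KB of seeded pseudo-random bytes followed by 16 KB of zeros.
random.seed(0)
data = bytes(random.getrandbits(8) for _ in range(16 * KB)) + bytes(16 * KB)
compressed = zlib.compress(data)

# Returning a single 4 KB sub-block costs several 4 KB SSD reads
# instead of the one read the request itself would need.
sub, reads = read_subblock(compressed, 8 * KB, 4 * KB)
print(len(sub), reads)
```

The key point the sketch makes concrete is that `ssd_reads` scales with the compressed block size, not with the requested length, which is where the latency and IOPS penalty comes from.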