1. Technical Field
This disclosure relates generally to data storage systems and, more specifically, to cache management involving shared (e.g., deduplicated) data blocks.
2. Description of the Related Art
Stored data may be organized into data blocks, which in turn may be organized into files. Files in a file system may have no overlap, partial overlap, or total overlap in their contents. For example, some files have completely different contents from one another. At the other end of the spectrum, a first file may be a copy of a second file but have a different name, so that the two files have total overlap of contents. Although the first and second files may have different identities (e.g., /usr/file1 and /temp/file2), the contents referred to are ultimately the same. Between these extremes, two files may share some data blocks but not others.
A file system (or operating system) running on a computer may receive requests to access files. The file system or operating system may then access a storage device containing data for the files. The storage device will necessarily have a limited capacity to handle requests.
A first request for a data block that is shared may cause a request to be sent from a computer system to a storage device. Subsequently, another request to access that same shared data block (perhaps through a different file name) might also trigger a request to be sent from the computer system to the storage device. When a shared data block is requested multiple times (e.g., under different identities), multiple requests to storage devices may result. In some systems, thousands or even millions of requests for the same data may be sent to a storage device within a relatively short time period, thus burdening the capacity of the storage device.
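By way of illustration only, the following minimal sketch shows how reading files that share data blocks may result in repeated requests to a storage device. The block numbers, the FILE_BLOCKS mapping, the path /home/file3, and the StorageDevice class are hypothetical and are assumed solely for this example; they do not describe any particular file system or device.

```python
from collections import Counter

# Hypothetical layout: each file name maps to the IDs of the data blocks it references.
FILE_BLOCKS = {
    "/usr/file1": [101, 102, 103],
    "/temp/file2": [101, 102, 103],  # total overlap with /usr/file1
    "/home/file3": [103, 204],       # partial overlap (block 103 only)
}


class StorageDevice:
    """Stand-in for a storage device that counts the block requests it receives."""

    def __init__(self):
        self.requests = Counter()

    def read_block(self, block_id):
        # Every call models one request sent from the computer system to the device.
        self.requests[block_id] += 1
        return f"data-{block_id}"


def read_file(name, device):
    """Read a file by requesting each of its blocks from the device."""
    return [device.read_block(block_id) for block_id in FILE_BLOCKS[name]]


device = StorageDevice()
for name in FILE_BLOCKS:
    read_file(name, device)

total_requests = sum(device.requests.values())
distinct_blocks = len(device.requests)
print(total_requests, distinct_blocks)
```

In this sketch the device receives more requests than there are distinct data blocks, because each shared block is requested once per file name under which it is reached.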