A high-speed cache is an indispensable part of any storage system. A cache serves multiple purposes, including collecting multiple and duplicate writes together before eventually flushing them to disk, increasing disk input/output (“I/O”) bandwidth utilization, facilitating seek optimization through request sorting, and satisfying read requests from memory without dispatching disk I/O operations.
Caching is offered by many different operating systems. For instance, the LINUX operating system uses a disk cache in its block device and file system architecture. However, the implementation utilized in the LINUX operating system, and in many other operating systems, has several limitations in the framework of a specialized storage stack. The main limitation results from the use of a page cache, in which the unit of caching is a 4 KB page. The manner in which such caches are architected causes a sub-page write request to first read the corresponding 4 KB page, make the requested modification, and later flush the entire 4 KB page to disk (a read-modify-write cycle). Within a page, multiple sector writes may be collected before the final flush is performed. While the use of a page cache may improve CPU performance, it occasionally introduces a performance penalty rather than a boost because the read-modify-write cycle is forced on every sub-page write.
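The read-modify-write cycle described above can be illustrated with a minimal sketch. It is not the LINUX implementation; the names `FakeDisk` and `PageCache`, the 512-byte sector size, and the rule that a full-page write skips the initial read are all assumptions made for illustration.

```python
PAGE_SIZE = 4096   # 4 KB caching unit, as described in the text
SECTOR_SIZE = 512  # assumed sector size, for illustration only


class FakeDisk:
    """Hypothetical in-memory disk that counts page-sized transfers."""

    def __init__(self, num_pages):
        self.data = bytearray(num_pages * PAGE_SIZE)
        self.reads = 0
        self.writes = 0

    def read_page(self, page_no):
        self.reads += 1
        start = page_no * PAGE_SIZE
        return bytearray(self.data[start:start + PAGE_SIZE])

    def write_page(self, page_no, buf):
        self.writes += 1
        start = page_no * PAGE_SIZE
        self.data[start:start + PAGE_SIZE] = buf


class PageCache:
    """Hypothetical page-granular write-back cache."""

    def __init__(self, disk):
        self.disk = disk
        self.pages = {}     # page number -> cached page contents
        self.dirty = set()  # page numbers awaiting flush

    def write(self, offset, payload):
        page_no, within = divmod(offset, PAGE_SIZE)
        # This sketch only handles writes that stay inside one page.
        assert within + len(payload) <= PAGE_SIZE
        if page_no not in self.pages:
            if within == 0 and len(payload) == PAGE_SIZE:
                # Assumption: a full-page write needs no prior read.
                self.pages[page_no] = bytearray(PAGE_SIZE)
            else:
                # Sub-page write: the whole page is read first.
                self.pages[page_no] = self.disk.read_page(page_no)
        # Modify the cached copy of the page.
        self.pages[page_no][within:within + len(payload)] = payload
        self.dirty.add(page_no)

    def flush(self):
        # Write back every dirty page in full, regardless of how few
        # sectors inside it were actually modified.
        for page_no in sorted(self.dirty):
            self.disk.write_page(page_no, self.pages[page_no])
        self.dirty.clear()
```

In this sketch, two 512-byte writes to the same page followed by a flush cost one full-page read and one full-page write, so 8 KB of disk traffic is generated to store 1 KB of new data.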
Traditional disk caching also does little to improve the performance of data storage systems with advanced features such as snapshots. A snapshot is a read-only volume that is a point-in-time image of a data storage volume that can be created, mounted, deleted, and rolled back onto the data storage volume arbitrarily. Snapshots are utilized extensively in the data storage industry for security, backup, and archival purposes. Snapshots may also be utilized within data storage systems that utilize thin provisioning to allocate storage space on demand. If a system has active snapshots, a new write access that has a granularity that is less than the snapshot chunk granularity must necessarily result in a read-modify-write cycle, thereby greatly reducing system performance.
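The granularity condition above can be sketched as a simple range check. The function name `partial_chunks` and the 64 KB chunk size are hypothetical; the text does not specify a snapshot chunk size.

```python
CHUNK_SIZE = 64 * 1024  # assumed snapshot chunk size, for illustration only


def partial_chunks(offset, length, chunk_size=CHUNK_SIZE):
    """Return the indices of chunks that a write covers only partially.

    Under the behavior described in the text, each such chunk would
    require a read-modify-write cycle while snapshots are active.
    """
    if length <= 0:
        return []
    end = offset + length
    first = offset // chunk_size
    last = (end - 1) // chunk_size
    partial = []
    for chunk in range(first, last + 1):
        chunk_start = chunk * chunk_size
        chunk_end = chunk_start + chunk_size
        # The chunk is only partially covered if the write starts after
        # the chunk begins or stops before the chunk ends.
        if offset > chunk_start or end < chunk_end:
            partial.append(chunk)
    return partial
```

Under these assumptions, a 4 KB write at offset 4 KB touches chunk 0 only partially, whereas a write of exactly one aligned 64 KB chunk touches no chunk partially.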
Caching must also be carefully utilized in conjunction with redundant array of inexpensive disks (“RAID”) configurations. This is because coalescing of write operations by a cache before flushing may result in a flush write that is more than one RAID stripe long, resulting in multiple RAID accesses. This can also seriously degrade system performance.
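The effect of a coalesced flush crossing a stripe boundary can be sketched as follows. The function name `split_by_stripe` and the 256 KB stripe size are assumptions chosen for illustration; actual stripe sizes depend on the RAID configuration.

```python
STRIPE_SIZE = 256 * 1024  # assumed RAID stripe size, for illustration only


def split_by_stripe(offset, length, stripe_size=STRIPE_SIZE):
    """Split one flush write into per-stripe (offset, length) extents.

    Each returned extent lies within a single stripe, so the number of
    extents is the number of RAID accesses the flush would require.
    """
    extents = []
    end = offset + length
    while offset < end:
        # End of the stripe that contains the current offset.
        stripe_end = (offset // stripe_size + 1) * stripe_size
        extent_end = min(end, stripe_end)
        extents.append((offset, extent_end - offset))
        offset = extent_end
    return extents
```

With these assumed values, a coalesced 128 KB flush starting at offset 192 KB is split into two 64 KB extents, one on each side of the 256 KB stripe boundary, so a single flush becomes two RAID accesses.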
It is with respect to these considerations and others that the present invention has been made.