1. Field of the Invention
This invention relates to computer systems and, more particularly, to the management of updates within storage environments employing write-back caches.
2. Description of the Related Art
Many business organizations and governmental entities rely upon mission-critical applications that access large amounts of data, often a terabyte or more. Often such data is stored on many different storage devices, which may be heterogeneous in nature, including many different types of devices from many different manufacturers. In such storage environments, various storage devices may be configured with associated write-back caches to reduce the latency for update operations as seen by the requesting applications.
A write-back cache may be implemented using memory technology that supports efficient write operations (e.g., various kinds of Random Access Memory or RAM), especially when compared to the latencies associated with committing updates to an underlying storage device such as a disk. In many environments, a non-volatile write-back cache may be employed, e.g., using battery-backed RAM, so that data stored within the cache is recoverable in the event of certain kinds of failures, even if it has not been written to underlying storage. When an application requests that a given block of data within a storage device equipped with a write-back cache be updated, the updated or new version of the data may be stored within the write-back cache. For many kinds of write operations and for various storage devices, an indication that the write operation has completed may be provided to the requesting application as soon as the new version of the data block is written to the write-back cache, instead of waiting until the underlying storage device is updated. Subsequently, for example when the proportion of “dirty” data in the write-back cache (i.e., updated data that has not been committed to the underlying storage device) approaches a specified threshold level, updated versions of data blocks temporarily stored in the write-back cache may be written to the underlying storage devices.
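The write-back behavior described above may be illustrated with a minimal sketch. The class name, the slot-count capacity, and the flush-everything policy are assumptions chosen for illustration only; a dictionary stands in for the underlying storage device, and no particular product's implementation is implied.

```python
class WriteBackCache:
    """Toy write-back cache in front of a slow backing store (hypothetical)."""

    def __init__(self, backing, capacity=4, dirty_threshold=0.5):
        self.backing = backing                  # dict standing in for the disk
        self.capacity = capacity                # number of cache slots (assumed)
        self.dirty_threshold = dirty_threshold  # dirty fraction that triggers a flush
        self.dirty = {}                         # block id -> version not yet on disk

    def write(self, block_id, data):
        # Store the new version in the cache only.
        self.dirty[block_id] = data
        # Commit dirty blocks once their proportion reaches the threshold.
        if len(self.dirty) / self.capacity >= self.dirty_threshold:
            self.flush()
        # Completion is reported without waiting for the backing store.
        return "ack"

    def read(self, block_id):
        # Prefer the cached (newest) version, else fall back to the backing store.
        if block_id in self.dirty:
            return self.dirty[block_id]
        return self.backing.get(block_id)

    def flush(self):
        # Write every dirty block to the backing store, then mark the cache clean.
        self.backing.update(self.dirty)
        self.dirty.clear()
```

In this sketch, the first write to a four-slot cache with a threshold of 0.5 is acknowledged while the backing store is still unchanged; the second write brings the dirty proportion to the threshold and both blocks are committed.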
For some storage devices that support storage operations such as copy-on-write snapshots, however, the previous version of a data block being updated may be needed at a storage processing node other than the data consumer requesting the update, before the previous version is overwritten with the new version. The storage processing node may be any software or hardware entity, such as a physical or logical device, a thread or process, a module, a host or a server, where the previous version of the data block may be needed to perform a particular storage-related function. Storage processing nodes may be linked to the storage device, e.g., over a network, and may include, for example, other storage consumers, other storage devices, or metadata servers such as a volume server in a distributed virtualization environment. The previous version of a data block from one node of a distributed storage system may be required, for example, to complete a copy-on-write operation for a snapshot being maintained at another node of the distributed storage system. Similarly, in a distributed virtual RAID device, a previous version of a data block at one node of the device (and the previous version of a corresponding parity block) may be required to compute a new version of the parity block at another node of the device.
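The parity example above may be made concrete under the assumption of simple XOR parity (as in RAID-5 style layouts; the text does not specify a parity scheme). With XOR parity, the new parity block can be derived from the previous data block, the previous parity block, and the new data block alone, which is why the previous versions are needed. The function names below are hypothetical.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))


def updated_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """New parity = old parity XOR old data XOR new data (XOR parity assumed)."""
    return xor_blocks(xor_blocks(old_parity, old_data), new_data)
```

The result matches a full recomputation over all data blocks in the stripe, without reading the other data blocks.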
In order to prevent an unintended loss of the old version due to an overwrite, the previous version may typically be provided to the storage processing node prior to writing the new version of the data in the write-back cache in such storage environments. In some cases, an acknowledgment may also be required from the storage processing node indicating that the previous version has been received and acted on appropriately (e.g., that the processing that required the previous version of the data block has been completed, or at least that the previous version is no longer required). In many cases, especially when the write-back cache may be relatively small compared with the total amount of data being managed within the storage device, or (as is often the case) the cache prioritizes writes over reads, the previous version of the data block will not be within the cache of the storage device and will have to be read from the underlying backing storage device and delivered to the storage processing node (which may require transmission over a network). The latency of reading a data block from the backing storage device may itself be comparable to the latency of writing a data block to the backing storage device. Any additional processing or delay involved in sending the previous version to the storage processing node, and receiving any required acknowledgments, may further extend the time taken to complete the requested update, as seen by the data consumer. Such a synchronous read of a previous version of a data block during an update (i.e., reading a previous version of a data block prior to notifying an updating storage consumer that a requested update has completed) may thus result in a substantial update latency penalty to the updating consumer.
For example, the latency of writing into a RAM cache may be replaced by the latency of a read from disk followed by a write to the RAM cache, and potentially additional latencies associated with intervening operations such as reading parity or updating a log. The end-user or application requesting the update may be unaware that a previous version of the data block being updated is required elsewhere, which may add further to the perceived latency penalty. In such environments, it may be desirable to develop a technique that reduces data block update latency as seen by an updating storage consumer.
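The latency penalty of a synchronous read of the previous version may be sketched as follows. The cost values are arbitrary placeholder units, not measurements, and the function names, the dictionary-based cache and disk, and the list standing in for the storage processing node are all assumptions made for illustration; the acknowledgment is assumed to arrive within one network round trip.

```python
# Hypothetical latency costs in arbitrary units; placeholders, not measurements.
COST = {"disk_read": 100, "network_round_trip": 20, "cache_write": 1}


def plain_update(cache, block_id, new_data):
    """Update path when no previous version is needed: one cache write."""
    cache[block_id] = new_data
    return COST["cache_write"]


def update_with_synchronous_old_read(cache, disk, node_inbox, block_id, new_data):
    """Update path when the previous version must reach another node first."""
    latency = 0
    if block_id in cache:
        old = cache[block_id]           # previous version happens to be cached
    else:
        old = disk.get(block_id)        # cache miss: read from backing storage
        latency += COST["disk_read"]
    node_inbox.append((block_id, old))  # deliver previous version, await ack
    latency += COST["network_round_trip"]
    cache[block_id] = new_data          # only now store the new version
    latency += COST["cache_write"]
    return latency                      # latency seen by the updating consumer
```

With these placeholder costs, an update whose previous version is not cached is charged the disk read and the round trip in addition to the cache write, whereas the plain update is charged the cache write alone.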