1. Field of the Invention
This invention relates to distributed storage environments in general and, more particularly, to a method and apparatus for controlling cached write operations to storage arrays.
2. Description of the Related Art
Modern distributed shared storage environments may include multiple storage objects shared via one or more interconnection networks. The interconnection networks provide the infrastructure for accessing the various elements of a distributed shared storage environment. Within the storage environment, file system abstractions may be built on top of multiple storage objects. Additional software layers may provide for the creation and management of logical volumes on storage objects within the context of a storage file system. The distribution and sharing system for the storage objects may be referred to as a storage area network (SAN), which may include its own networking fabric and management systems.
The storage objects in a SAN may be physical disks assembled into storage arrays and may be configured to distribute data across multiple storage devices. Storage arrays may further be equipped with a persistent cache, which can treat a write operation as being completed when the data associated with the write operation (intended to be written to the storage array) has been written to the cache. The persistent cache may then independently flush the write operation to the storage array. By reducing the latency of the write operation at the storage array, the persistent cache may provide for increased overall performance (i.e., data throughput).
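The write-back behavior described above may be illustrated with a minimal sketch. The class names `BackingArray` and `WriteBackCache`, the block-number keys, and the `max_blocks` parameter are hypothetical and chosen only for illustration; persistence of the cache across power loss is assumed and not modeled.

```python
from collections import OrderedDict


class BackingArray:
    """Hypothetical stand-in for the storage array: maps block number to data."""

    def __init__(self):
        self.blocks = {}

    def write(self, block, data):
        self.blocks[block] = data


class WriteBackCache:
    """Acknowledges a write once the data is in the cache; flushes later."""

    def __init__(self, array):
        self.array = array
        self.dirty = OrderedDict()  # blocks written to cache but not yet flushed

    def write(self, block, data):
        # The write is treated as completed as soon as the cache holds the data.
        self.dirty[block] = data
        self.dirty.move_to_end(block)
        return True

    def flush(self, max_blocks=None):
        # Independently push dirty blocks to the array, oldest first.
        flushed = 0
        while self.dirty and (max_blocks is None or flushed < max_blocks):
            block, data = self.dirty.popitem(last=False)
            self.array.write(block, data)
            flushed += 1
        return flushed


array = BackingArray()
cache = WriteBackCache(array)
acked = cache.write(7, b"payload")        # acknowledged immediately
on_array_before_flush = 7 in array.blocks  # data has not reached the array yet
cache.flush()                              # data reaches the array only now
```

In this sketch the caller observes the write as complete before the array has stored anything, which is the source of the reduced write latency noted above.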
Furthermore, additional caches or buffers may be installed at multiple points in the storage environment between an application issuing write operations and the target physical storage arrays. For example, a server that includes the application issuing the write operations may also include a server cache that can be configured for buffering write operations (e.g., within the file system layer).
When the multiple caches (or buffers) in a distributed storage environment are neither synchronized nor coordinated, write operations may be distributed stochastically among the caches, such that the overall data throughput is poorly distributed in time or across the available memory capacity. Thus, a string of independent buffers or caches may not necessarily represent an efficient method of load balancing among the various components in the storage environment. For example, a flood of write operations at a particular moment could cause bottlenecks and even seizures (i.e., a storage array shutting down and not accepting any further I/O operations for a period of time), even though the storage system could theoretically have provided enough overall bandwidth to handle the load.
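The effect may be illustrated with a small deterministic model. Everything in the sketch below is an assumption made for illustration: the function `simulate`, its parameters, the tick-based timing, and the treatment of a full array cache as simply rejecting writes. It does not model any particular storage product.

```python
def simulate(arrivals, flush_every, array_capacity, drain_per_tick):
    """Return how many writes the array cache could not accept.

    arrivals        -- writes issued by the application at each tick
    flush_every     -- the server buffer flushes everything every N ticks
    array_capacity  -- maximum writes the array cache can hold
    drain_per_tick  -- writes the array can commit to disk per tick
    """
    server_buffer = 0
    array_cache = 0
    rejected = 0
    for tick, arriving in enumerate(arrivals, start=1):
        server_buffer += arriving
        if tick % flush_every == 0:
            # The server flushes without knowing how much room the array has.
            free = array_capacity - array_cache
            accepted = min(server_buffer, free)
            rejected += server_buffer - accepted
            array_cache += accepted
            server_buffer = 0
        # The array drains its cache at a fixed rate.
        array_cache -= min(array_cache, drain_per_tick)
    return rejected


# Steady load of 4 writes per tick; the array can drain 5 per tick,
# so the average bandwidth is sufficient in both cases.
arrivals = [4] * 20
bursty = simulate(arrivals, flush_every=5, array_capacity=10, drain_per_tick=5)
smooth = simulate(arrivals, flush_every=1, array_capacity=10, drain_per_tick=5)
```

Under these assumed numbers, the server buffer that flushes in bursts of 20 writes overruns the array cache on every flush, while the same load forwarded every tick is absorbed without rejection, although the array's drain rate exceeds the arrival rate in both runs.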