A virtualized cluster is a cluster of different storage nodes that together expose a single storage device. Input/Output operations (“I/Os”) sent to the cluster are internally re-routed to read and write data to the appropriate locations. In this regard, a virtualized cluster of storage nodes can be considered analogous to a collection of disks in a Redundant Array of Inexpensive Disks (“RAID”) configuration, since a virtualized cluster hides the internal details of the cluster's operation from initiators and presents a unified device instead.
In a virtualized cluster, preserving data integrity often requires providing a locking functionality. With locking protection, concurrent operations are guaranteed not to interfere with one another in ways that may corrupt data within the storage system. For example, when a table-based architecture is used to map I/O operations to physical sectors on a disk, the table must be locked prior to modification so as to avoid data loss. Such data loss may occur because of writes to invalid locations, reads from invalid locations, or overwriting new data with old data. Other data loss scenarios are also possible in the absence of locking.
Two kinds of locks are commonly used: read locks and write locks. A traditional locking mechanism uses a lock structure for each entity that may need to be locked. Such an entity may be a file, block, sector, stripe, etc. For example, a locking entity may be each 1 Gigabyte (“GB”) block on a 200 GB disk. In this example, 200 lock structures need to be created and maintained in main memory at all times. Whenever a read lock is requested on a particular gigabyte, the appropriate lock structure is accessed. If there are no outstanding writes, the read lock is granted and an appropriate variable is incremented in the structure to signal that there is one more outstanding reader on that particular 1 GB storage entity. If there are outstanding writes, the read lock request is queued until all writes have completed.
Similarly, if a write lock is requested on a particular gigabyte, the lock structure is accessed. If there are no outstanding reads or writes, then the write lock is granted and an appropriate variable is incremented in the structure to signal that there is one outstanding write in progress on the 1 GB entity. If any reads or writes are outstanding when the write lock is requested, the lock request is placed in a queue, where it will wait for all preceding requests to be completed.
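The traditional per-entity mechanism described above can be illustrated with a minimal sketch. The names used here (LockStructure, PerEntityLockTable, request, release) are hypothetical and chosen only for illustration. The sketch also assumes that a queued request counts as outstanding, so that a later read waits behind an earlier queued write; the description above does not specify this ordering.

```python
from collections import deque
from dataclasses import dataclass, field

GB = 1 << 30  # bytes per gigabyte


@dataclass
class LockStructure:
    """One lock structure per lockable entity (hypothetical layout)."""
    readers: int = 0   # outstanding read locks
    writers: int = 0   # outstanding write locks (0 or 1)
    waiting: deque = field(default_factory=deque)  # queued (mode, request_id)


def compatible(lock, mode):
    """A read needs no active writer; a write needs no active holder."""
    if mode == "read":
        return lock.writers == 0
    return lock.readers == 0 and lock.writers == 0


class PerEntityLockTable:
    def __init__(self, disk_bytes, entity_bytes=GB):
        self.entity_bytes = entity_bytes
        # Every structure is allocated up front, whether or not it is used.
        self.locks = [LockStructure() for _ in range(disk_bytes // entity_bytes)]

    def _lookup(self, offset):
        return self.locks[offset // self.entity_bytes]

    def request(self, offset, mode, request_id):
        """Return True if granted immediately, False if queued."""
        lock = self._lookup(offset)
        # Assumption: queued requests are also "outstanding" (FIFO order).
        if not lock.waiting and compatible(lock, mode):
            self._grant(lock, mode)
            return True
        lock.waiting.append((mode, request_id))
        return False

    def release(self, offset, mode):
        """Release one lock; return ids of queued requests granted as a result."""
        lock = self._lookup(offset)
        if mode == "read":
            lock.readers -= 1
        else:
            lock.writers -= 1
        granted = []
        while lock.waiting and compatible(lock, lock.waiting[0][0]):
            queued_mode, request_id = lock.waiting.popleft()
            self._grant(lock, queued_mode)
            granted.append(request_id)
        return granted

    @staticmethod
    def _grant(lock, mode):
        if mode == "read":
            lock.readers += 1
        else:
            lock.writers += 1
```

In this sketch, constructing the table for a 200 GB disk with 1 GB entities allocates all 200 structures immediately, which is the memory cost that the traditional approach incurs even when no locks are held.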
Unfortunately, when using this approach, a lock structure needs to be maintained for every possible entity that may need to be locked, regardless of whether any lock requests are pending on it. This typically places a substantial demand on memory and processing resources. Some implementations attempt to reduce the memory required by the lock system through the use of a collection of lock structures that are accessed through a hash queue or a similar data structure. While this may reduce memory utilization, it can increase complexity and computing requirements. Such an approach can also place a hard limit on the number of outstanding operations that may be performed in a given storage system and thus does not scale well.
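The pooled alternative and its hard limit can likewise be sketched under stated assumptions. The class PooledLockTable and its status strings are hypothetical, a Python dictionary stands in for the hash queue, and waiting queues are omitted for brevity. This is one possible reading of such a scheme, not a description of any particular implementation.

```python
class PooledLockTable:
    """A fixed pool of lock structures shared among all entities (illustrative)."""

    def __init__(self, pool_size):
        # Only pool_size structures ever exist, regardless of disk size.
        self.free = [{"readers": 0, "writers": 0} for _ in range(pool_size)]
        self.active = {}  # entity index -> structure; dict stands in for the hash queue

    def acquire(self, entity, mode):
        """Return 'granted', 'busy', or 'pool_exhausted'."""
        lock = self.active.get(entity)
        if lock is None:
            if not self.free:
                # Hard limit: no structure is left to track a new entity.
                return "pool_exhausted"
            lock = self.free.pop()
            self.active[entity] = lock
        if mode == "read":
            if lock["writers"]:
                return "busy"
            lock["readers"] += 1
        else:
            if lock["readers"] or lock["writers"]:
                return "busy"
            lock["writers"] += 1
        return "granted"

    def release(self, entity, mode):
        lock = self.active[entity]
        lock["readers" if mode == "read" else "writers"] -= 1
        if lock["readers"] == 0 and lock["writers"] == 0:
            # An idle structure is returned to the pool for reuse.
            del self.active[entity]
            self.free.append(lock)
```

With a pool of two structures, a third entity cannot be locked until one of the first two becomes idle. Memory use is bounded by the pool size, but so is the number of entities that can have outstanding operations at any one time.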
It is with respect to these considerations and others that the disclosure made herein is presented.