Shared storage systems involve a shared data storage and multiple computer systems which access the shared data storage, for example servers and/or computers (collectively called clients). Large scale storage systems operate with hundreds and/or thousands of clients, which share access to a shared data store comprising many storage arrays, wherein the shared data store is capable of storing many terabytes of data. Traditionally, a storage array is used to store large amounts of data on a large group of storage media (e.g. hard disk drives (HDD), solid state drives (SSD), and the like).
In a shared storage system, because multiple clients (e.g. servers) share the same data, circumstances arise wherein two different servers attempt to access the same data at the same time. Such a circumstance leads to write contentions, wherein one server may overwrite data that another server is using. Such an overwrite causes data inconsistencies and should be avoided. As such, in order to prevent write contentions, traditional systems employ client-based write locking methods.
Traditionally, shared storage systems implement a distributed, server-side write locking scheme, wherein the servers are the above described clients. In server-side write locking, a server may access a grain, which is a set of data records. The granularity describes the size of the grain, and the granularity of the name space may be statically or dynamically sized. When a server accesses the grain, the accessing server locks the grain by communicating its use of the grain to the other servers in the cluster. Then, when the accessing server is finished with the grain, the accessing server unlocks the grain by communicating its release of the grain's write lock to the other servers in the cluster. For example, if an accessing server accesses grain A, the accessing server communicates with all other servers in the system to inform them that grain A is write locked, and thus, the other servers are not allowed to access grain A. As a result, each of the other servers must have logic disallowing access to grain A, and each server must be capable of receiving and processing high-speed signaling from the accessing server giving notice that grain A is available for use again. Likewise, the same steps are followed by a different accessing server using a different grain (e.g. grain N). As such, the above described write locking method is distributed among the servers in the cluster and the write locking is handled on the server-side.
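By way of illustration only, the lock and release exchange described above may be sketched as follows. The sketch is a simplified in-process model and not an implementation of any particular system; the names `Server`, `receive`, `acquire`, `release` and `make_cluster` are hypothetical, and message delivery is assumed to be instantaneous and reliable.

```python
class Server:
    """Hypothetical cluster member that keeps its own copy of the lock table."""

    def __init__(self, name):
        self.name = name
        self.peers = []    # every other server in the cluster
        self.locked = {}   # grain id -> name of the server holding the write lock

    def receive(self, kind, grain, owner):
        # Process a lock or release notice sent by another server.
        if kind == "lock":
            self.locked[grain] = owner
        elif kind == "release" and self.locked.get(grain) == owner:
            del self.locked[grain]

    def acquire(self, grain):
        # Refuse access if this server believes the grain is already locked.
        if grain in self.locked:
            return False
        self.locked[grain] = self.name
        for peer in self.peers:
            peer.receive("lock", grain, self.name)   # inform all other servers
        return True

    def release(self, grain):
        # Only the server holding the lock may release it.
        if self.locked.get(grain) != self.name:
            return False
        del self.locked[grain]
        for peer in self.peers:
            peer.receive("release", grain, self.name)
        return True


def make_cluster(names):
    # Assumption: every server can reach every other server directly.
    servers = [Server(n) for n in names]
    for s in servers:
        s.peers = [p for p in servers if p is not s]
    return servers
```

In this sketch every acquire and every release costs one message per peer, so the signaling load grows with the size of the cluster, and the correctness of each server's lock table depends on every message arriving.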
Traditionally, distributed server-side write locking presents several problems. For example, each server is responsible for locking the data it accesses, which means that all of the servers must constantly communicate with each other in order to keep the write locks up to date. As a result, if any of the servers becomes noncommunicative for any amount of time and/or for any reason, the distributed server-side write locking method becomes unreliable.
For example, if server A accesses grain A and sends a communication to the other servers that grain A is write locked, then server A will believe that the grain is properly locked and begin operations using the data in grain A. However, if server B does not receive the write lock communication for any number of reasons (e.g. communications malfunction, data corruption, temporary loss of power, etc.), server B may not be aware of the write lock. As such, server B may unwittingly access grain A and overwrite the data therein while server A is performing operations on the same data. As a result, due to a communication error between server A and server B, a write contention may occur.
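The lost-notification scenario above may be reproduced in a minimal sketch. This is an illustrative model only; `Server`, `broadcast_lock`, `writers` and the `dropped` parameter are hypothetical names, and a dropped message is modeled simply by skipping delivery to the named peer.

```python
class Server:
    """Hypothetical server with a purely local view of the write locks."""

    def __init__(self, name):
        self.name = name
        self.locked = {}   # grain id -> lock owner, as this server believes it

    def may_write(self, grain):
        # A server writes if it sees no lock, or if it holds the lock itself.
        owner = self.locked.get(grain)
        return owner is None or owner == self.name


def broadcast_lock(sender, grain, servers, dropped=()):
    """Record the lock at the sender, then notify every peer not in `dropped`."""
    sender.locked[grain] = sender.name
    for peer in servers:
        if peer is sender or peer.name in dropped:
            continue   # the notice to a dropped peer never arrives
        peer.locked[grain] = sender.name


def writers(grain, servers):
    # Names of all servers that currently believe they may write the grain.
    return [s.name for s in servers if s.may_write(grain)]


cluster = [Server("A"), Server("B"), Server("C")]
broadcast_lock(cluster[0], "grain A", cluster, dropped={"B"})
print(writers("grain A", cluster))   # prints ['A', 'B']
```

Server A and server B both believe they may write grain A, which is the write contention described above; server A has no indication that its notice was lost.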
In another example of traditional methods, server A accesses grain A and sends a communication to the other servers that grain A is write locked. As a result, the other servers will not access grain A because they believe it to be write locked. Then, while grain A is write locked, server A goes offline for any number of reasons (e.g. loss of power, data corruption, communications malfunction, etc.). As a result, server A may be unable to communicate a release to unlock grain A for minutes, hours, days, etc., thereby causing grain A to be unnecessarily inaccessible to the other servers for an unacceptable amount of time. Because distributed server-side write locking depends on the operability of so many different servers and their communication paths, distributed server-side write locking is vulnerable to a large number of malfunctions originating from a large number of sources.
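The stale-lock scenario above may likewise be sketched. The model below is an assumption-laden simplification: `Cluster`, `lock`, `release` and the `online` flag are hypothetical, and an offline server is modeled only as one that cannot send lock or release notices.

```python
class Cluster:
    """Hypothetical cluster in which each server keeps its own lock table."""

    def __init__(self, names):
        self.online = {name: True for name in names}
        self.tables = {name: {} for name in names}   # server -> {grain: owner}

    def lock(self, server, grain):
        # An offline server cannot send notices; a locked grain is refused.
        if not self.online[server] or grain in self.tables[server]:
            return False
        for table in self.tables.values():
            table[grain] = server      # every server records the lock
        return True

    def release(self, server, grain):
        # Only an online owner can tell the other servers to drop the lock.
        if not self.online[server] or self.tables[server].get(grain) != server:
            return False
        for table in self.tables.values():
            table.pop(grain, None)
        return True


cluster = Cluster(["A", "B", "C"])
cluster.lock("A", "grain A")             # server A locks grain A
cluster.online["A"] = False              # server A goes offline while holding it
print(cluster.release("A", "grain A"))   # prints False: no release can be sent
print(cluster.lock("B", "grain A"))      # prints False: server B stays locked out
```

Until server A returns and sends its release, nothing in this scheme allows the other servers to distinguish a lock that is still in use from one that has been abandoned.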
In order to decrease the number of sources which may cause a write locking breakdown, other traditional methods centralize the write locking operations in a dedicated server. A single server-side write locking system, sometimes called a metadata manager system, controls write locking with a dedicated server. Each server request for a grain (e.g. grain A) is funneled through the dedicated write locking server; if the write locking server identifies grain A as unlocked, then the write locking server allows access to grain A, which is stored on a remotely located storage array. Likewise, if the write locking server determines that grain A is locked, then access to grain A is denied.
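A dedicated write locking server of the kind described above may be sketched as a single lock table through which every request passes. This is a minimal in-process illustration, not an implementation of any metadata manager product; `LockServer`, `request` and `release` are hypothetical names, and network transport is omitted.

```python
import threading


class LockServer:
    """Hypothetical dedicated server owning the only copy of the lock table."""

    def __init__(self):
        self._owners = {}                 # grain id -> client holding the lock
        self._mutex = threading.Lock()    # serializes all lock table updates

    def request(self, client, grain):
        # Grant access only if the grain is currently unlocked.
        with self._mutex:
            if grain in self._owners:
                return False              # grain is locked: access denied
            self._owners[grain] = client
            return True                   # grain was unlocked: access allowed

    def release(self, client, grain):
        # Only the client that holds the lock may release it.
        with self._mutex:
            if self._owners.get(grain) != client:
                return False
            del self._owners[grain]
            return True


manager = LockServer()
print(manager.request("server A", "grain A"))   # prints True
print(manager.request("server B", "grain A"))   # prints False
```

Because there is one authoritative table, a lost peer-to-peer notice can no longer leave two servers with conflicting views; the cost is that every request in the cluster is serialized through the one mutex-protected table.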
While this traditional method may minimize the number of vulnerable nodes within the write locking method, the dedicated server-side scheme is limited in scalability. For example, a dedicated server is limited in the number of requests it can process at any one time. As such, the more data requesting servers that are added to the cluster and need access to the shared data store, the more bottlenecks occur at the dedicated write locking server. Eventually, as more and more data accessing servers and more and more storage arrays are added to the cluster, the dedicated server will be unable to service all the data requests for the grains in the cluster open for write access. The write locking method will then be unable to keep up with the volume of write locking requests, and cluster operations will break down. As such, single server-side write locking systems are limited in scalability to about a dozen data accessing servers and a couple of storage arrays.
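The bottleneck described above may be illustrated with simple queue arithmetic. The sketch below is illustrative only: the function `backlog_after` and all of the figures passed to it are hypothetical and are not measurements of any real system. Each tick, every client submits a fixed number of lock requests and the dedicated server services at most a fixed number.

```python
def backlog_after(clients, requests_per_client, capacity, ticks):
    """Return the number of unserviced lock requests after `ticks` intervals.

    All arguments are hypothetical quantities chosen for illustration.
    """
    backlog = 0
    for _ in range(ticks):
        arrivals = clients * requests_per_client
        # The server clears up to `capacity` requests; the rest keep waiting.
        backlog = max(0, backlog + arrivals - capacity)
    return backlog


# Below capacity the queue stays empty; above it the queue grows every tick.
print(backlog_after(clients=10, requests_per_client=5, capacity=100, ticks=10))  # prints 0
print(backlog_after(clients=30, requests_per_client=5, capacity=100, ticks=10))  # prints 500
```

Once the aggregate request rate exceeds what the dedicated server can service, the backlog grows without bound rather than leveling off, which corresponds to the breakdown of cluster operations described above.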
In response to the bottleneck problem described above, alternative traditional approaches have extended the dedicated server-side write locking method to multiple dedicated servers. However, as the number of dedicated write locking servers grows, the number of vulnerable points within the system grows as well, as described in the distributed server-side locking method above. As such, if any one of the dedicated write locking servers goes offline for any reason, or any one of the communications between the dedicated write locking servers is lost for any reason, then the write locking method breaks down as described above in the distributed server-side locking method.