1. Field of the Invention
This invention relates to computer systems and, more particularly, to data storage subsystems within computer systems.
2. Description of the Related Art
Computer systems frequently include data storage subsystems for storing data. In particular, computer systems that include multiple clients interconnected by a network increasingly share one or more data storage subsystems over that network. The data storage subsystems may include or be further coupled to storage consisting of one or more disk storage devices, tape drives, or other storage media. A computer system may also include one or more servers in which metadata describing the contents of the included storage devices is maintained.
Data storage subsystems may store data with some redundancy to allow for recovery from storage errors. There are a variety of techniques to store data redundantly, including erasure coding techniques such as Reed-Solomon encodings and RAID (Redundant Array of Independent Disks) using a variety of layouts, such as RAID-1, RAID-5, or RAID-6. These RAID layouts may be implemented within an object-based file system in which each independent storage device is treated as a disk. Each client device may convey data to the storage devices via a network. Unfortunately, some way of arbitrating write access requests from multiple clients may be needed to avoid introducing inconsistencies into the redundant data. One arbitration approach is to require each client to obtain a lock before accessing a storage location. However, this approach requires that each client be responsible for, and trusted to perform, all of the functions involved in sequencing writes using the lock mechanism. For example, in the case of RAID-5 or RAID-6, these functions may include reading old data and old parity, computing new parity, logging the new data and new parity, and writing the new data and new parity to their respective storage locations, which together constitute part or all of a row in the RAID layout. In addition, a client may be required to retrieve information from a metadata server (MDS) for each write to an individual location in the RAID layout. The performance of these functions increases write latency and adds complexity and significant computational and storage overhead to each client.
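By way of illustration only, the client-side sequence described above for a RAID-5 write to a single location may be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the row is held in memory as a list of equal-length byte blocks, parity is the byte-wise XOR of the data blocks in the row, and the names xor_blocks, full_parity, and small_write are hypothetical. The locking, logging, and metadata lookup steps are omitted.

```python
from functools import reduce


def xor_blocks(*blocks: bytes) -> bytes:
    """Return the byte-wise XOR of one or more equal-length blocks."""
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("blocks must have equal length")
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def full_parity(data_blocks: list) -> bytes:
    """Compute RAID-5 style parity over every data block in a row."""
    return xor_blocks(*data_blocks)


def small_write(row: list, parity: bytes, index: int, new_data: bytes) -> bytes:
    """Update one data block in a row and return the new parity.

    Mirrors the steps named in the text: read old data and old parity,
    compute new parity, then write new data (new parity is returned so
    the caller can write it to the parity location).
    """
    old_data = row[index]                                # read old data
    new_parity = xor_blocks(parity, old_data, new_data)  # compute new parity
    row[index] = new_data                                # write new data
    return new_parity


# Hypothetical three-block row; contents are arbitrary example bytes.
row = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = full_parity(row)
parity = small_write(row, parity, 1, b"\xaa\xbb")

# The incrementally updated parity matches a full recomputation.
assert parity == full_parity(row)
```

In this sketch, a second client that interleaves its own read of old parity between the read and the write shown here would compute parity from stale values, which is the kind of inconsistency that the lock-based arbitration is meant to prevent.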
In addition to the above considerations, data storage subsystems are designed to minimize the loss of data that may occur when one or more devices fail. Although RAID layouts are intended to provide high availability and fault tolerance, there may be periods of increased vulnerability to device failure during complex write operations if clients are responsible for maintaining the redundancy. In view of the above, a more effective system and method for managing writes to data storage subsystems that accounts for these issues are desired.