1. Field of the Invention
This invention relates to data storage and more particularly relates to data storage using a front-end, distributed redundant array of independent drives (“RAID”).
2. Description of the Related Art
Traditional RAID systems are configured with a RAID controller that functions to receive data, calculate striping patterns for the data, divide the data into data segments, calculate a parity stripe, store the data on storage devices, update the data segments, etc. While some RAID controllers allow some functions to be distributed, the storage devices managed by the RAID controller do not communicate with clients directly for storing data striped in a RAID. Instead, storage requests and data for RAIDing pass through the RAID controller.
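By way of illustration only, the division of data into data segments and the calculation of a parity segment described above may be sketched as follows. The sketch assumes simple XOR parity over a single stripe with zero padding; the function names `make_stripe` and `rebuild_segment` are hypothetical and do not correspond to any particular RAID controller.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_stripe(data: bytes, n_data: int) -> tuple[list[bytes], bytes]:
    """Divide data into n_data equal segments and compute one XOR parity segment.

    Assumption: the final segment is zero-padded so all segments match in length.
    """
    seg_len = -(-len(data) // n_data)  # ceiling division
    padded = data.ljust(seg_len * n_data, b"\0")
    segments = [padded[i * seg_len:(i + 1) * seg_len] for i in range(n_data)]
    parity = reduce(xor_bytes, segments)
    return segments, parity


def rebuild_segment(segments: list[bytes], parity: bytes, lost: int) -> bytes:
    """Recover the segment at index `lost` from the surviving segments and parity."""
    survivors = [s for i, s in enumerate(segments) if i != lost]
    return reduce(xor_bytes, survivors, parity)


if __name__ == "__main__":
    segments, parity = make_stripe(b"ABCDEFGHIJ", 3)
    # Simulate losing the second data segment and recovering it from the rest.
    print(rebuild_segment(segments, parity, lost=1))
```

In a traditional arrangement, every step in this sketch would execute on the RAID controller, so every byte of client data passes through it before reaching a storage device.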
Requiring the RAID controller to touch all of the data to be stored in a RAID is inefficient because it creates a dataflow bottleneck. This is especially true during a read-modify-write process where bandwidth and performance of all of the drives in the RAID group are consumed while only a subset is actually updated. In addition, a region of the storage device designated for data managed by the RAID controller is typically dedicated to the RAID group and cannot be accessed independently. Access to a storage device by a client must typically be accomplished by partitioning the storage device. Where partitioning is used, partitions accessible for general storage are not used for RAID and partitions allocated to the RAID group are not accessible for general data storage. Schemes that oversubscribe partitions in order to globally optimize utilization are complex and more difficult to manage. In addition, storage space allocated for one RAID group cannot be accessed by more than one RAID controller unless one is designated as master and the other RAID controllers act as slaves for as long as the master RAID controller is active and functional.
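By way of illustration only, the parity arithmetic behind a read-modify-write update may be sketched as follows. The sketch assumes XOR parity and shows one well-known shortcut, in which the new parity is derived from the old parity, the old data segment, and the new data segment; the function names are hypothetical, and an actual RAID controller may instead read and rewrite the entire stripe.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def full_parity(segments: list[bytes]) -> bytes:
    """Recompute parity by reading every data segment in the stripe."""
    return reduce(xor_bytes, segments)


def update_parity(old_parity: bytes, old_segment: bytes, new_segment: bytes) -> bytes:
    """Read-modify-write shortcut: new parity = old parity XOR old data XOR new data."""
    return xor_bytes(xor_bytes(old_parity, old_segment), new_segment)


if __name__ == "__main__":
    stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
    parity = full_parity(stripe)

    # Overwrite only the second data segment.
    new_segment = b"\xff\x00"
    shortcut = update_parity(parity, stripe[1], new_segment)
    recomputed = full_parity([stripe[0], new_segment, stripe[2]])
    print(shortcut == recomputed)
```

Whichever form is used, the reads of old data and parity and the writes of new data and parity all flow through the RAID controller in a traditional system, which is the bottleneck described above.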
Typical RAID controllers also generate parity data segments outside of the storage devices of the RAID group. This can be inefficient because parity data segments are typically generated and then are sent to a storage device for storage, which requires computing capacity of the RAID controller. Tracking parity data segment location and updates must also be done at the RAID controller instead of autonomously at a storage device.
Where it is necessary to ensure that the data remains available if the separate RAID controller is offline, RAID controllers are typically cross-connected to the drives and to each other, and/or mirrored as complete sets, making data availability expensive and difficult to manage, and dramatically reducing the reliability of the storage subsystem.