Technical Field
The present disclosure relates generally to parallelization in a parity de-clustered and sliced disk Redundant Array of Independent Disks (RAID) architecture. More particularly, aspects of this disclosure relate to methods, non-transitory computer readable media, and devices for combining parity groups to achieve uniform load distribution and to maximize parallelization in a parity de-clustered and sliced disk RAID architecture.
Description of Related Art
In traditional RAID architecture, the RAID group generally (a) provides fault tolerance against disk failures and (b) ensures that full stripe writes and reads (which involve all disks) utilize all disk spindles and spread the load uniformly within that RAID group.
A parity drive is a hard drive used in RAID technology to provide fault tolerance. Parity is a calculated value which is used for reconstruction of data after a failure. Conventionally, while data is being written to a RAID volume, a calculation for parity is performed by conducting an exclusive OR (XOR) procedure on the data. The calculated parity is then written to the volume. If a portion of the RAID volume fails, the data on the failed portion can be recreated using the parity information and the remainder of the data. In parity de-clustered and sliced disk (PDSD) RAID architecture, however, the above two goals ((a) and (b)) cannot be encompassed within a single entity.
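The XOR parity calculation and the reconstruction of a failed portion described above can be illustrated with a minimal sketch. The function names `xor_parity` and `reconstruct`, the block contents, and the three-block stripe are assumptions chosen for illustration only; they are not part of the disclosure and do not represent any particular RAID implementation.

```python
from functools import reduce


def xor_parity(blocks):
    """Return the byte-wise XOR of equally sized data blocks (hypothetical helper)."""
    if not blocks:
        raise ValueError("at least one block is required")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def reconstruct(surviving_blocks, parity):
    """Recreate the single missing block from the survivors and the parity block."""
    # XOR is its own inverse, so XOR-ing the parity with every surviving
    # block leaves exactly the contents of the missing block.
    return xor_parity(list(surviving_blocks) + [parity])


# Illustrative stripe of three data blocks (made-up contents).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]
parity = xor_parity(data)

# Simulate losing the second block and rebuilding it.
recovered = reconstruct([data[0], data[2]], parity)
assert recovered == data[1]
```

This sketch tolerates the loss of exactly one block per stripe, which matches the single-parity case described above.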
First, fault tolerance is provided by a parity group (PG), which is made up of slices chosen from a subset of disks within the sliced disk group (SDG), an SDG being a collection of disks with similar physical properties. Parity groups may share disks with other parity groups and thus cannot be independent and completely parallel entities for inputs and outputs (IOs).
Second, the goal of utilizing all available disk spindles in the system and spreading the load uniformly across the disks can be achieved through the sliced disk group, since it is an independent entity (it does not share disks with other sliced disk groups) and a parallel entity for IOs, provided that uniform use and loading of all the disk spindles within the SDG are ensured.
In PDSD RAID architecture, each disk is divided into thousands of slices. Disks with similar properties (such as RPM, size, checksum style, media type, and operating protocol) are grouped to form an SDG. The number of disks in a sliced disk group is typically two to five times the number of disks in a traditional RAID group, which achieves better reconstruction throughput. Parity groups are created from slices chosen from a subset of disks within the SDG, and their layout is governed by a parity de-clustering algorithm. A parity group does not span all disks in a sliced disk group, so IOs across multiple parity groups are required to utilize all disk spindles of the SDG. At any instant, overall disk utilization and load per disk within the sliced disk group depend on the subset of parity groups servicing the IOs. In practice, random selection of parity groups causes uneven disk utilization or uneven load distribution and, because disks are shared among multiple parity groups, does not guarantee that all disks of a sliced disk group are spanned.
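The dependence of per-disk load on the subset of parity groups servicing IOs can be examined with a small simulation sketch. Everything here is an assumption for illustration: the helper names `build_parity_groups` and `disk_load`, the sizes (20 disks, parity groups 6 disks wide, 200 parity groups, 5 active at once), and in particular the random layout, which is only a stand-in for a parity de-clustering algorithm and not the layout used by any actual system.

```python
import random
from collections import Counter


def build_parity_groups(num_disks, width, num_groups, rng):
    """Hypothetical layout: each parity group takes one slice from each of
    `width` distinct disks. Random placement is a stand-in for a real
    parity de-clustering algorithm."""
    return [
        tuple(sorted(rng.sample(range(num_disks), width)))
        for _ in range(num_groups)
    ]


def disk_load(parity_groups, selected, num_disks):
    """Count how many of the selected parity groups touch each disk."""
    load = Counter()
    for pg_index in selected:
        load.update(parity_groups[pg_index])
    return [load[disk] for disk in range(num_disks)]


# Assumed sizes, chosen only to keep the example small.
NUM_DISKS, WIDTH, NUM_GROUPS, ACTIVE = 20, 6, 200, 5

rng = random.Random(0)  # fixed seed so repeated runs give the same result
groups = build_parity_groups(NUM_DISKS, WIDTH, NUM_GROUPS, rng)

# Randomly pick the parity groups that service IOs at this instant.
active = rng.sample(range(NUM_GROUPS), ACTIVE)
load = disk_load(groups, active, NUM_DISKS)

print("per-disk load:", load)
print("idle disks:", load.count(0), "max load:", max(load))
```

Each selected parity group contributes load to exactly `WIDTH` disks, so the total load is fixed at `ACTIVE * WIDTH`; only its distribution across the disks of the SDG depends on which parity groups are chosen.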
Thus, there is a need for a method and an apparatus for combining multiple parity groups under a sliced disk group to achieve uniform use and loading of all the disk spindles within the SDG.