With the accelerating growth of Internet and intranet communication, high-bandwidth applications (such as streaming video), and large information databases, the need for networked storage systems has increased dramatically. System performance, data protection, and cost have been some of the main concerns in designing networked storage systems. In the past, many systems have used fibre channel drives, because of their speed and reliability. However, fibre channel drives are very costly. Integrated drive electronics (IDE) drives are much cheaper in terms of dollars-per-gigabyte of storage; however, their reliability is inferior to that of fibre channel drives. Furthermore, IDE drives require cumbersome 40-pin cable connections and are not easily replaceable when a drive fails. Serial advanced technology attachment (SATA) drives that use the same receptor as their fibre channel counterparts are now available. These drives, therefore, have the speed required for acceptable system performance and are hot-swappable, which means that failed SATA drives are easily replaced with new ones. Furthermore, they provide more storage than do fibre channel drives and at a much lower cost. However, SATA drives still do not offer the same reliability as fibre channel drives. Thus, there is an industry push to develop high-capacity storage devices that are low cost and extremely reliable.
To improve data reliability, many computer systems implement a redundant array of independent disks (RAID) system, which is a disk system that includes a collection of multiple disk drives that are organized into a disk array and managed by a common array controller. The array controller presents the array to the user as one or more virtual disks. Disk arrays are the framework to which RAID functionality is added in functional levels in order to produce cost-effective, highly available, high-performance disk systems.
In RAID systems, the data is distributed over multiple disk drives to allow parallel operation, and thereby enhance disk access performance and provide fault tolerance against drive failures. Currently, a variety of RAID levels from RAID level 0 through RAID level 6 have been specified in the industry. RAID levels 1 through 5 provide a single-drive fault tolerance. That is, these RAID levels allow reconstruction of the original data if any one of the disk drives fails. It is quite possible, however, that more than one SATA drive may fail in a RAID system. For example, dual-drive failures are becoming more common as RAID systems incorporate an increasing number of less expensive disk drives.
To provide, in part, a dual-fault tolerance to such failures, the industry has specified a RAID level 6. The RAID 6 architecture is similar to RAID 5, but RAID 6 can overcome the failure of any two disk drives by using an additional parity block for each row (for a storage loss of 2/N, where N represents the number of networked drives). The first parity block (P) is calculated by the performance of an exclusive or (XOR) operation on a set of positionally assigned data sectors (i.e., rows of data sectors). Likewise, the second parity block (Q) is generated by the use of the XOR function on a set of positionally assigned data sectors (i.e., columns of data sectors). When a pair of disk drives fails, the conventional dual-fault tolerant RAID systems reconstruct the data of the failed drives by using the parity sets. RAID systems are well known in the art and are amply described, for example, in The RAIDbook, 6th Edition: A Storage System Technology Handbook, edited by Paul Massiglia (1997), which is incorporated herein by reference.
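The row parity block P described above can be illustrated with a minimal sketch. It assumes three hypothetical data blocks of three bytes each, one per drive; the block contents and the helper name `xor_blocks` are illustrative only and do not come from any cited system. The sketch covers only the P parity and the recovery of a single lost block; the second parity block Q and dual-drive recovery are not shown.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical row of data blocks, one block per drive (illustrative values).
row = [b"\x0f\xf0\xaa", b"\x33\x55\x01", b"\xc3\x3c\xff"]

# Row parity P is the XOR of every data block in the row.
p = xor_blocks(row)

# Simulate losing the block on the second drive, then rebuild it by
# XOR-ing the surviving data blocks with P.
survivors = [row[0], row[2], p]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, XOR-ing the surviving blocks with P cancels their contribution and leaves the missing block.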
An exemplary dual-parity algorithm is found in U.S. Pat. No. 6,453,428, entitled “Dual-drive Fault Tolerant Method and System for Assigning Data Chunks to Column Parity Sets.” The '428 patent describes a method and system for assigning data chunks to column parity sets in a dual-drive fault tolerant storage disk drive system that has N disk drives, where N is a prime number. Each of the N disk drives is organized into N chunks, such that the N disk drives are configured as one or more N×N arrays of chunks. Each array has chunks arranged in N rows from row 1 to row N and in N columns from column 1 to column N. Each row includes a plurality of data chunks for storing data, a column parity chunk for storing a column parity set, and a row parity chunk for storing a row parity set. These data chunks are assigned in a predetermined order. The data chunks in each row are assigned to the row parity set. Each column parity set is associated with a set of data chunks in the array, wherein row m is associated with column parity set Qm, where m is an integer that ranges from 1 to N. For row 1 of a selected N×N array, a first data chunk is assigned to a column parity set Qi, wherein i is an integer determined by rounding down (N/2). For each of the remaining data chunks in row 1, each data chunk is assigned to a column parity set Qj, wherein j is an integer one less than the column parity set for the preceding data chunk and wherein j wraps to N when j is equal to 0. For each of the remaining rows 2 to N of the selected array, a first logical data chunk is assigned to a column parity set Qk, wherein k is one greater than the column parity set for the first logical data chunk in a preceding row and wherein k wraps to 1 when k is equal to (N+1).
For each of the remaining data chunks in rows 2 to N, each data chunk is assigned to a column parity set Qn, wherein n is an integer one less than a column parity set for the preceding data chunk and wherein n wraps to N when n is equal to 0.
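The assignment rule summarized above can be sketched as follows. This is an illustrative reading of the description, not the '428 patent's implementation. It assumes each row holds N−2 data chunks (N chunks per row, less one row parity chunk and one column parity chunk), and it tracks only the logical order of data chunks, not their physical placement on the drives. The function name `column_parity_assignment` is hypothetical.

```python
def column_parity_assignment(n):
    """Return, for each of the n rows, the 1-based column parity set index
    assigned to each data chunk in logical order.

    Assumption: every row has n - 2 data chunks, since one chunk per row
    holds row parity and one holds column parity.
    """
    data_per_row = n - 2
    table = []
    first = n // 2  # row 1 starts at Qi, with i = floor(N/2)
    for _row in range(n):
        row = []
        q = first
        for _chunk in range(data_per_row):
            row.append(q)
            q -= 1          # each next chunk uses the set one less ...
            if q == 0:
                q = n       # ... wrapping from 0 back to N
        table.append(row)
        first += 1          # next row starts one set higher ...
        if first == n + 1:
            first = 1       # ... wrapping from N+1 back to 1
    return table


# Example with N = 5 (a prime number of drives).
table = column_parity_assignment(5)
for row_number, sets in enumerate(table, start=1):
    print(f"row {row_number}: " + " ".join(f"Q{q}" for q in sets))
```

Under these assumptions, with N = 5 each column parity set Q1 through Q5 receives exactly three data chunks, so the sets are evenly populated.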
The algorithm described in the '428 patent safeguards against the loss of data in the event of a dual-drive failure. However, the algorithm consumes excess XOR processing cycles, known as XOR bandwidth, that could otherwise be used for performing system storage tasks. Hence, although the '428 patent describes a suitable algorithm for calculating dual parity and for restoring data after a dual-drive failure, it fails to provide an optimized software system that is capable of performing the dual-parity algorithm without affecting system performance.
Furthermore, the algorithm described in the '428 patent is dependent on row and column parity, which requires a prime number of data drives to be present in the system. The requirement of a specific number of drives limits system design flexibility, which can lead to increased cost (e.g., a system that requires fourteen drives to meet storage needs would need seventeen drives, the next prime number of drives, in order to meet parity algorithm requirements). In some cases, phantom drives are used to fill in the missing number of drives in order to achieve a prime number of drives. These phantom drives are assumed to contain data equal to logical ‘0’ and are used during XOR calculations to create parity or restore lost data. This method uses excess system overhead and processing cycles on the phantom data, which lowers overall system performance.
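The cost of the prime-number constraint can be made concrete with a small calculation. The sketch below assumes, as in the fourteen-drive example above, that the constraint applies to the total drive count; the function names `is_prime` and `drives_needed` are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small drive counts."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def drives_needed(required):
    """Return (total, extra): the smallest prime drive count that is at
    least `required`, and how many extra (real or phantom) drives that adds."""
    total = required
    while not is_prime(total):
        total += 1
    return total, total - required


total, extra = drives_needed(14)
print(total, extra)
```

For a requirement of fourteen drives, this yields seventeen drives in total, three of which are either additional hardware or phantom drives that still participate in every XOR calculation.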
There is, therefore, a need for an effective means of calculating parity, such that the storage system is fault tolerant against a dual-drive failure, provides optimal system performance by optimizing XOR bandwidth, is capable of generating parity regardless of symbol position (i.e., not dependent on row, diagonal/column parity), and is not dependent on a prime number of drives, phantom drives, or phantom data.
In short, there is a need for an algorithm that compensates for dual-storage element failures in a networked storage system. There is a need for a dual parity calculating algorithm that is not dependent on symbol position, a prime number of drives, phantom drives, or phantom data. And, there is a need for a dual parity calculating algorithm that either runs once a priori or that may be used in real time, without adversely affecting system performance.