Incorporated by reference herein are Appendices A, B and C, which are submitted on a compact disc and contain computer program listings. The compact disc contains the following files:
Name of file: ApndxA.txt; date of creation: Nov. 4, 2002; size: 13 Kbytes;
Name of file: ApndxB.txt; date of creation: Nov. 15, 2002; size: 18 Kbytes; and
Name of file: ApndxC.txt; date of creation: Nov. 18, 2002; size: 22 Kbytes.
1. Field of the Invention
The present invention relates generally to data redundancy methods and apparatus. Various aspects relate more particularly to redundancy data generation, data restoration, data storage, redundancy adjustability, data communication, computer network operations, and code discovery techniques.
2. Description of the Related Art
With the explosive growth in the Internet and mission-critical applications, the importance of preserving data integrity and ensuring 24×7 continuous access to critical information cannot be overstated. Information is now recognized as a key organizational asset, essential to its operation and market competitiveness. Access to critical information on a continuous basis is a mandatory requirement for survival in the business world. Critical applications involving military operations, communications, audio-visual, medical diagnoses, ISP (Internet Service Provider) and Web sites, or financial activities, for example, depend upon the continuous availability of essential data.
Downtime is extremely costly. Customers, vendors, employees, and prospects can no longer conduct essential business or critical operations. There is a "lost opportunity" cost to storage failures as well in terms of business lost to competitors. Well-documented studies place the cost of downtime in the tens of thousands (or even millions) of dollars per hour.
The need for large amounts of reliable online storage is fueling demand for fault-tolerant technology. According to International Data Corporation, the market for disk storage systems last year grew by 12 percent, topping $27 billion. More telling than that figure, however, is the growth in capacity being shipped, which grew 103 percent in 1998. Much of this explosive growth can be attributed to the space-eating demands of endeavors such as year 2000 testing, installation of data-heavy enterprise resource planning applications and the deployment of widespread Internet access.
Disk drive manufacturers publish Mean Time Between Failure (MTBF) figures as high as 800,000 hours (91 years). However, the claims are mostly unrealistic when examined. The actual practical life of a disk drive is 5 to 7 years of continuous use. Many Information Technology managers are aware that disk drives fail with great frequency. This is the most likely reason why companies place emphasis on periodic storage backup, and why there is such a large market for tape systems.
The industry answer to help satisfy these needs has been the use of conventional RAID ("Redundant Arrays of Inexpensive Disks") storage. In general, RAID storage reduces the risk of data loss by either replicating critical information on separate disk drives, or spreading it over several drives with a means of reconstructing information if a single drive is lost.
There are basically four elements of RAID: 1) mirroring data (i.e., creating an exact copy every time information is written to storage), 2) performing checksum calculations (parity data), 3) striping information in equal-sized pieces across multiple drives, and 4) having a standby hot spare should one drive fail. Some methods use a combination of these approaches. RAID storage systems are usually designed with redundant power supplies and the ability to swap out failed drives, power supplies and fans while the system continues to operate. Sophisticated RAID systems even contain redundant controllers to share the workload and provide automatic fail-over capabilities should one malfunction.
Conventional RAID storage configurations have proven to be the best hedge against the possibility of a single drive failure within an array. If more than one drive in a RAID array fails, however, or a service person accidentally removes the wrong drive when attempting to replace a failed drive, the entire RAID storage system becomes inoperable. And the likelihood of multiple drive failures in large disk arrays is significant. The resultant cost of inaccessibility to mission-critical information can be devastating in terms of lost opportunity, lost productivity and lost customers.
Accidents can contribute to multiple drive failures in RAID storage. Service personnel have been known to remove the wrong drive during a replacement operation, crashing an entire RAID storage system. In poorly engineered RAID systems, replacing a failed drive can sometimes create a power glitch, damaging other drives. General data center administrative and service operations also present opportunities for personnel to inadvertently disable a drive.
It is well-known that the likelihood of a drive failure increases as more drives are added to a disk RAID storage system. The larger the RAID storage system (i.e., the more disk drives it has) the greater the chance that two or more drives could become inoperable at one time. Here, the term "time" means the duration from the instant when a drive fails until it is replaced and data parity information is recovered. In remote locations, during holidays, or even during graveyard shifts, the "time" to drive recovery could be several hours. Thus, multiple drive failures do not have to occur at exactly the same instant in order to have a devastating effect on mission-critical storage.
Given the plausible assumptions that drives fail independently at random times with a certain MTBF, and that they stay down a certain time after failing, the following conclusions may be drawn for large arrays of disks: (1) the frequency of single drive failure increases linearly as the number of disks n; (2) the frequency of two drives failing together (a second failing before the first is reconstructed) increases as n(n−1), or almost as the square of the number of disks; (3) the frequency of three drives failing together increases as n(n−1)(n−2), or almost as the cube; and so forth.
The multiple failures, though still less frequent than single disk failure, become rapidly more important as the number of disks in a RAID becomes large. The following table illustrates the behavior of one, two and three drive failure MTBFs given that single drive MTBF divided by downtime is very much greater than the number of drives:
Here a << b << c are mean time constants for a failure of one disk, a coincidental failure of two disks, and a coincidental failure of three disks, respectively. If one-disk MTBF is five 360-day years and downtime is one day, then a=5 years, b=4,500 years, and c=5,400,000 years. If MTBF is reduced to 1 year and downtime increased to two days, then a=1 year, b=90 years, and c=10,800 years.
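The figures quoted for a, b and c can be reproduced with a short calculation. The sketch below is illustrative only. It assumes that the rate at which k disks out of n are down together is C(n,k)·k·d^(k−1)/M^k, where M is the single-drive MTBF and d is the downtime; this formula is inferred here from the numbers stated above and from the independence assumptions, and is not taken from the specification. The function name and parameters are hypothetical.

```python
from math import comb

DAYS_PER_YEAR = 360  # the text above uses 360-day years


def coincident_failure_mtbf_years(k, mtbf_years, downtime_days, n=None):
    """Mean time in years between events where k of n disks are down together.

    Assumed model (not from the specification): drives fail independently,
    each stays down for downtime_days, and downtime is much shorter than MTBF.
    With n omitted, n = k, which yields the constants called a, b and c.
    """
    if n is None:
        n = k
    mtbf_days = mtbf_years * DAYS_PER_YEAR
    # Any of C(n, k) disk sets may be involved; any of its k disks may fail
    # last, while the other k-1 are each down with probability d/M.
    rate_per_day = comb(n, k) * k * downtime_days ** (k - 1) / mtbf_days ** k
    return 1 / rate_per_day / DAYS_PER_YEAR


if __name__ == "__main__":
    for mtbf_years, downtime_days in ((5, 1), (1, 2)):
        a, b, c = (
            coincident_failure_mtbf_years(k, mtbf_years, downtime_days)
            for k in (1, 2, 3)
        )
        print(f"MTBF={mtbf_years}y, downtime={downtime_days}d: "
              f"a={a:,.0f}  b={b:,.0f}  c={c:,.0f} years")
```

Under this assumed model, passing a larger n shortens the two-disk and three-disk figures in proportion to n(n−1) and n(n−1)(n−2), which matches the scaling described above.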
The consequences of a multiple-drive failure can be devastating.
Typically, if more than one drive fails, or a service person accidentally removes the wrong drive when attempting to replace a failed drive, the entire RAID storage system is out of commission. Access to critical information is not possible until the RAID system is re-configured, tested and a backup copy restored. Transactions and information written since the last backup may be lost forever.
Thus, the possibility of a multiple-drive failure is very high for mission-critical applications that run 24-hours daily on a continuous basis. Moreover, the larger a RAID storage system, the greater the potential of suffering multiple-drive failures. And the chances increase significantly for remote locations where the response time to replace a failed drive can extend to several hours or even days.
Conventional RAID levels have their advantages and disadvantages.
While RAID-0 delivers high performance, it cannot sustain even a single drive failure because there is no parity information or data redundancy. Although the most costly, mirroring data on separate drives (RAID-1) means that if one drive fails, critical information can still be accessed from the mirrored drive. Typically, RAID-1 involves replicating all data on two separate "stacks" of disk drives on separate SCSI channels, incurring the cost of twice as many disk drives. There is a performance impact as well, since data must be written twice, consuming both RAID system and possibly server resources. RAID-3 and RAID-5 allow continued (albeit degraded) operation by reconstructing lost information "on the fly" through parity checksum calculations. Adding a global hot spare provides the ability to perform a background rebuild of lost data.
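The parity reconstruction used by conventional single-parity RAID can be illustrated as follows. This is a minimal sketch of ordinary XOR parity over equal-sized blocks, with hypothetical names and toy data; it ignores striping layout and parity rotation, and it is not the method of the present invention.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


# Three data blocks, one per data drive (toy contents).
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity drive holds the XOR of all data blocks.
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails. XOR-ing the surviving data
# blocks with the parity block yields the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Because there is only one parity block per stripe, a second simultaneous drive loss leaves two unknowns and one equation, so the stripe cannot be rebuilt.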
With the exception of costly RAID-1 (or combinations of RAID-1 with RAID-0 or RAID-5) configurations, there have been few solutions for recovering from a multiple drive failure within a RAID storage system. Even the exceptions sustain multiple drive failures only under very limited circumstances. For example, a RAID-1 configuration can lose multiple (or all) drives in one mirrored stack as long as not more than one drive fails in its mirrored partner. Combining striping and parity within mirrored stacks buys some additional capabilities, but is still subject to these drive-failure limitations.
Some variations of RAID are based merely on combinations of RAID levels, described below in terms of basic structure and performance (0+1 array, 5+1 array, and 5+5 array). All of the packs described in the following configurations are assumed to have pattern designs that maximize read and write speed for large files and parallel data flows to and from disks. The "ideal" speeds will be based on raw data movement only, ignoring buffering and computational burdens.
In a "0+1" array, two striped arrays of five disks each mirror the other. A striped array (RAID-0) is lost if even one of its disks is lost, so the safe loss count=1 and maximum loss count=5 (depending on whether the disks lost are on the same side of the mirror). Data capacity=5, read speed=10 (using an operating system capable of alternating mirror reads to achieve full parallelism; the usual max is 5) and write speed=5 (here reading assumes a strategy of alternating between sides of the mirror to increase the parallelism).
In a "5+1" array, two RAID-5 arrays of five disks each mirror each other. Safe loss count is 3 (when one side has lost no more than one disk, the other perhaps more, we can still recover), max loss count is 6 (one entire side, and one disk from the other side). Data capacity is 4 (equals that of one RAID-5 array), read speed=10 but usual max is 5 (see above discussion of "0+1"), and write speed=4 (using full parallelism, but with parity and mirror burdens). Similar results arise from a 1+5 array (a RAID-5 made of mirrored pairs).
In a "5+5" array, three RAID-5 arrays of three disks each form a RAID-5 with respect to each other. Thus one entire array of three can be lost, plus one disk of each of the other two. This implies safe loss count=3 (it cannot tolerate a 0-2-2 loss pattern) and max loss count=5. Data capacity is 4 (of 9), read speed is 9 (using nested striping) and write speed is 4.
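The safe and maximum loss counts quoted for these three configurations can be checked by enumeration. The following sketch assumes a simplified model in which an array is a set of sub-arrays, a sub-array is dead once it loses more disks than its inner tolerance, and the whole array survives while the number of dead sub-arrays does not exceed the outer tolerance. The function and parameter names are hypothetical.

```python
from itertools import product


def loss_counts(group_sizes, inner_tolerance, outer_tolerance):
    """Return (safe loss count, maximum loss count) for a nested array.

    safe    = largest k such that every way of losing up to k disks survives
    maximum = largest k such that at least one way of losing k disks survives
    """
    total = sum(group_sizes)
    outcomes = {k: [] for k in range(total + 1)}
    # Survival depends only on how many disks each sub-array has lost.
    for pattern in product(*(range(size + 1) for size in group_sizes)):
        dead_groups = sum(1 for lost in pattern if lost > inner_tolerance)
        outcomes[sum(pattern)].append(dead_groups <= outer_tolerance)
    safe = max(
        k for k in range(total + 1)
        if all(all(outcomes[j]) for j in range(k + 1))
    )
    maximum = max(k for k, results in outcomes.items() if any(results))
    return safe, maximum


if __name__ == "__main__":
    print("0+1:", loss_counts([5, 5], inner_tolerance=0, outer_tolerance=1))
    print("5+1:", loss_counts([5, 5], inner_tolerance=1, outer_tolerance=1))
    print("5+5:", loss_counts([3, 3, 3], inner_tolerance=1, outer_tolerance=1))
```

Under this model the enumeration agrees with the counts given above: (1, 5) for 0+1, (3, 6) for 5+1, and (3, 5) for 5+5.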
Other RAID-like variations exist, but with their downsides. A highly complex encryption-type multiple redundancy algorithm exists, referred to as the Mariani algorithm (downloaded file "raidzz" and related files). The form of RAID described by Mariani can either be applied to dedicated parity disks or have rotation superimposed (as with the two patents referred to below), and additionally requires encryption, which does not treat every bit in a chunk identically. In addition, the subject matter in U.S. Pat. No. 5,271,012 ("Method and Means for Encoding and Rebuilding Data Contents of up to Two Unavailable DASDs in an Array of DASDs") and in U.S. Pat. No. 5,333,143 ("Method and Means for B-Adjacent Coding and Rebuilding Data from up to Two Unavailable DASDs in a DASD Array") addresses multiple failures, but is limited. The form of RAID described in these patents generates two parity stripes as a function of n−2 data stripes. The two parity stripes (on two of the disks) are all parity; the n−2 data stripes (on n−2 of the disks) are all data. This leads to read inefficiency unless a rotation structure is superimposed on the formula, in which case it leads to algorithmic inefficiency.
Accordingly, what is needed are methods and apparatus that overcome these and other deficiencies of the prior art.
A data storage apparatus has a plurality of n disks and data comprising a plurality of n data groupings stored across the plurality of n disks. Each one of the n data groupings comprises a data portion and a redundancy portion. Advantageously, the n data portions are recoverable from any and all combinations of n−m data grouping(s) on n−m disk(s) when the other m data grouping(s) are unavailable, where 1 ≤ m < n.
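The recoverability property stated above can be illustrated for the simplest case, m=1. The sketch below is a toy construction chosen here for illustration only; it is an assumption and is not the code of the invention or of Appendices A, B and C. Each of the n disks stores its own data portion together with a redundancy portion equal to the XOR of the data portions held on all the other disks, and the check confirms that all n data portions are recoverable whichever single disk is unavailable. All names are hypothetical.

```python
from functools import reduce


def xor_all(blocks):
    """Byte-wise XOR of equal-length byte blocks."""
    return bytes(reduce(lambda x, y: x ^ y, col) for col in zip(*blocks))


def build_groupings(data_portions):
    """Return one (data, redundancy) grouping per disk.

    Toy scheme (assumption): redundancy on disk i is the XOR of the data
    portions on every other disk. Requires at least two disks.
    """
    n = len(data_portions)
    return [
        (data_portions[i],
         xor_all([data_portions[j] for j in range(n) if j != i]))
        for i in range(n)
    ]


def recover(groupings, missing):
    """Rebuild the data portion of the one missing disk from the survivors."""
    survivors = [i for i in range(len(groupings)) if i != missing]
    helper = survivors[0]
    # The helper's redundancy covers the missing data plus the data of the
    # remaining survivors, so XOR-ing those out leaves the missing data.
    known = [groupings[i][0] for i in survivors if i != helper]
    return xor_all([groupings[helper][1]] + known)


data = [b"disk0", b"disk1", b"disk2", b"disk3"]
groupings = build_groupings(data)
for lost in range(len(data)):
    assert recover(groupings, lost) == data[lost]
```

This toy scheme tolerates only one unavailable disk. Tolerating m greater than 1 requires redundancy codes with stronger properties than plain XOR.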