Some computer systems are dedicated storage systems. These storage systems typically include one or more arrays of rotating magnetic disks for secondary, non-volatile storage of data. A storage system typically includes an enclosure, a power supply, cooling fans, and one or more disk array controllers.
These disk arrays are sometimes colloquially called a "Just a Bunch Of Disks" or, alternatively, a "Just a Box Of Disks" (JBOD). A JBOD is simply an array of disk drives housed together; the storage system around it is specially designed to improve the control, performance, and fault tolerance of such a disk array.
FIG. 1 shows an example of a conventional computer network 10 having a central computer 20 for controlling the system and for central processing. Of course, such a central computer (i.e., server) may be composed of many interconnected computers. In addition to other functions, the central computer 20 controls and monitors multiple storage systems, such as storage system 30, storage system 32, and storage system 34.
FIG. 2 shows a block diagram of the basic components of a typical storage system, in particular, storage system 30. It includes an input/output (I/O) unit 52 for sending/receiving data and control information to/from the central computer 20, other storage systems, and other network devices. A disk array controller 54 is coupled to the I/O unit 52 and to a disk array 60. One or more data and control lines connect the disk array controller 54 to the disk array 60. Of course, a storage system may include multiple controllers and multiple disk arrays.
In a conventional storage system (like the one illustrated in FIG. 2), all disk drives in an array are powered continuously by a common power supply. Typically, the supply of power to an individual drive is not controllable or selectable.
Redundant Array of Independent Disks (RAID)
A common high-availability storage solution is a Redundant Array of Independent (or Inexpensive) Disks (RAID). RAID is a high-availability storage system that employs two or more drives in combination.
RAID was originally designed as a means of improving storage subsystem capacity by combining many drives. This implementation, however, had a problem: the resulting "mean time before failure" (MTBF) of the array was actually reduced, because the probability that any one drive in the array would fail increased with the number of drives. As a result of this finding, the RAID developers proposed multiple levels of RAID to provide a balance of performance and data protection.
Conventionally, RAID schemes are classified into five basic levels (although other levels may exist):
    a first level in which the same data are stored on two disks ("mirrored" disks);
    a second level in which data are bit-interleaved across a group of disks, including check disks on which redundant bits are stored using a Hamming code;
    a third level in which each group has only a single check disk (sometimes called a "parity" disk), on which parity bits are stored;
    a fourth level that uses block interleaving and a single check disk per group; and
    a fifth level that uses block interleaving and distributes the parity information evenly over all disks in a group, so that the writing of parity information is not concentrated on a single check disk.
For all RAID levels, fault-tolerant arrays often include an additional disk in the array, a spare, which acts as the replacement disk when one disk in the array fails. The data on a failed disk are conventionally reconstructed from the remaining disks; the reconstructed data are then written onto the replacement disk. This places the replacement disk in exactly the same state as the failed disk.
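The parity scheme of the third through fifth levels, and the reconstruction just described, can be sketched with bytewise XOR. This is an illustrative sketch only; the four-block stripe, block contents, and helper names below are hypothetical, not from the source:

```python
def compute_parity(blocks):
    """XOR all data blocks in a stripe together to form the parity block."""
    parity = bytes(len(blocks[0]))
    for block in blocks:
        parity = bytes(a ^ b for a, b in zip(parity, block))
    return parity

def reconstruct(surviving_blocks, parity):
    """Rebuild a lost block: XOR the parity block with every surviving data block."""
    lost = parity
    for block in surviving_blocks:
        lost = bytes(a ^ b for a, b in zip(lost, block))
    return lost

# Hypothetical stripe of four data blocks.
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = compute_parity(data)

# Suppose the third drive fails; rebuild its block from the survivors plus parity.
rebuilt = reconstruct([data[0], data[1], data[3]], parity)
assert rebuilt == data[2]
```

Because XOR is its own inverse, the same routine that computes parity also recovers any single missing block, which is why one check disk per group suffices in these levels.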
MTBF
MTBF is short for “mean time between failures” or “mean time before failure.” Typically, MTBF ratings are measured in hours and indicate the sturdiness of hard disk drives, printers, and virtually any other component.
Typical inexpensive disk drives for personal computers have MTBF ratings of about 300,000 hours. This means that, across all the drives tested, one failure occurred on average for every 300,000 cumulative hours of operation. However, this measure is only a statistical model based upon test drives and the estimated operating time of failed drives returned to the factory.
The theoretical MTBF of a disk drive represents the steady state failure rate of a large population of drives in volume manufacture. This is the expected time after the initial burn-in phase that it will take a hardware component to fail due to normal wear and tear.
Calculating Theoretical MTBFs. Most discussions of a computer's MTBF focus on its disk drives' MTBFs for several reasons. Primarily, components with moving parts (such as disk drive actuators and motors) typically have significantly lower MTBFs than non-moving components (such as memory chips or main CPU boards). Because a computer's theoretical MTBF is most influenced by the MTBF of the least reliable component as well as the sheer number of components, disk drive MTBFs typically dominate the overall computer system's theoretical MTBF.
The theoretical MTBF of a computer decreases as the number of components that make up the computer increases. Therefore, larger configurations containing many disk drives have, by definition, a lower overall MTBF.
A system's overall theoretical MTBF is calculated from the theoretical MTBFs of the components that make up the system:

    MTBF = 1 / (1/N1 + 1/N2 + 1/N3 + … + 1/Nx)

where
    N1, N2, … Nx = the theoretical MTBF of each individual component
    x = the number of components in the configuration
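The formula above says that component failure rates (the reciprocals of the MTBFs) add, and the system MTBF is the reciprocal of that sum. A minimal sketch in Python, using the two-drive figures from this section (the function name is illustrative):

```python
def system_mtbf(component_mtbfs):
    """Overall MTBF = 1 / (1/N1 + 1/N2 + ... + 1/Nx).

    Each entry in component_mtbfs is the theoretical MTBF, in hours,
    of one component; failure rates (reciprocals) simply add.
    """
    return 1.0 / sum(1.0 / n for n in component_mtbfs)

# Two identical 300,000-hour drives halve the overall MTBF.
print(system_mtbf([300_000, 300_000]))  # 150000.0
```

Note that the least reliable component dominates the sum, which is why disk drive MTBFs typically dominate a whole computer's theoretical MTBF.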
The overall MTBF of a disk drive subsystem is in inverse proportion to the number of disks in the array. For example, the MTBF of a disk drive subsystem consisting of two disk drives with identical 300,000-hour MTBFs is:

    Disk drive subsystem MTBF = 1 / (1/300,000 + 1/300,000) = 150,000 hours

which is exactly half the MTBF of each individual drive.
Similarly, the MTBF of a 10-drive configuration is one-tenth that of a single drive, or 30,000 hours, and that of a 100-drive configuration falls to 3,000 hours. Some large systems include storage configurations of 1,000 or more drives, which are likely to require that a failed drive be replaced every one to two weeks (on average).
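The scaling just described can be checked directly: with identical drives, the subsystem MTBF is simply the single-drive MTBF divided by the drive count. A brief sketch, assuming the 300,000-hour rating used throughout this section:

```python
DRIVE_MTBF_HOURS = 300_000  # rating assumed above for an inexpensive drive
HOURS_PER_DAY = 24

for n in (10, 100, 1000):
    mtbf = DRIVE_MTBF_HOURS / n           # identical drives: MTBF scales as 1/n
    days = mtbf / HOURS_PER_DAY
    print(f"{n:>5} drives: MTBF {mtbf:>9,.0f} hours (about {days:,.1f} days between failures)")
```

The 1,000-drive case works out to 300 hours, or roughly 12.5 days, consistent with the one-to-two-week replacement interval stated above.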
Actual Time Before Failure Compared to MTBF
The MTBF is intended to give a statistical model of the failure times of a large population of drives. The MTBF is not a good measure of the actual time before a given drive fails. One reason is that the collection of disk drives used by an installation is not always representative of the large random sample assumed by the statistical model. Another reason is the lack of empirical data about drives in actual service.
Non-representative Samples. Installations typically purchase drives in bulk. They are likely to receive drives with sequential serial numbers, because the drives were shipped together directly from the factory assembly line to the installation.
Often, problems that cause a drive to fail are inadvertently introduced during the manufacturing process. The introduction of dust particles and other particulate matter is a common cause of ultimate drive failures. It is typical for such problems to be introduced into a collection of sequentially manufactured drives.
Therefore, an array of drives in an installation may have a higher risk for drive failure than is represented by a theoretical MTBF calculation. Furthermore, these sequential drives are more likely to fail at approximately the same time because their inherent flaws are similar.
Lack of Empirical Data. The basis of the MTBF calculation for a disk drive is the measured failure rate in a testing facility. Often the calculation is based upon failure data of similar components in previous models. This facility may be part of the manufacturing process, such as a "burn-in" step, or it may be independent of the manufacturing process, such as a dedicated testing laboratory. In either case, testing only approximates how the drive is actually used in the field.
Drive manufacturers typically estimate the actual operating life of a drive model by knowing how many units were sold, knowing how many were returned after they failed, and comparing the manufacture date to the return date. This is only an estimate; it does not accurately measure the actual life and use of a drive.
Conventionally, there is no technique to accurately measure how drives are used in the field; therefore, it is difficult, at best, to determine the accuracy of the artificial failure rates estimated by development engineering.