1. Field of the Invention
This invention relates generally to data storage, and more particularly provides a system and method for reducing power consumption, increasing reliability and/or reducing administrative overhead of data storage systems.
2. Description of the Background Art
Electronic data is stored on storage media, such as compact disks, optical disks, ATA disks and magnetic tapes. Different types of recording media differ in access speed and reliability. Higher quality comes at a price: faster and more reliable recording media are more expensive, while slower and less reliable recording media are less expensive.
For example, SCSI drives are faster and more reliable but expensive. The Ultra320 SCSI interface transfers data at up to 320 MBytes per second. Ultra320 SCSI also includes paced data transfer, a free running clock, a training pattern at the beginning of a transfer series, skew compensation, driver pre-compensation and/or an optional receiver adjustable active filter (AAF). See http://www.scsita.org/aboutscsi/ and http://www.scsita.org/aboutscsi/ultra320/UItra32O_WhitePaper.pdf.
Although inexpensive, ATA drives are slower and less reliable. For example, serial ATA is a disk-interface technology developed by a group of the industry's leading vendors, known as the Serial ATA Working Group, to replace parallel ATA. The Serial ATA 1.0 specification, released in August 2001, indicates that serial ATA technology will deliver 150 MBytes per second of performance. See http://www.t13.org/ and http://www.serialata.com/.
To increase the reliability of inexpensive systems, system designers have developed what is currently termed “Redundant Arrays of Independent Disks” (RAID), e.g., RAID1. RAID originally stood for “Redundant Arrays of Inexpensive Disks.” RAID1 is a form of storage array in which two or more identical data copies are maintained on separate media, typically on inexpensive magnetic disk drives. The first data storage medium acts as the primary database, responding to all user access requests. At the same time, the second data storage medium backs up the first, so that it can take over all operations should the first data storage medium fail. RAID1 is also known as RAID Level 1, disk shadowing, real-time copy, and t1 copy. See http://www-2.cs.cmu.edu/~garth/RAIDpaper/Patterson88.pdf. Lower quality data storage media, however, are less reliable and not fit for continuous operation; their mean time between failures (MTBF) is short. Accordingly, it is not uncommon for drives in a RAID system to fail, and system administrators must watch over such systems constantly to ensure that the redundant drives are in proper working order.
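The mirroring behavior described above, in which a primary medium serves requests while a secondary medium holds an identical copy and takes over on failure, might be sketched as follows. This is an illustrative sketch only: the `Medium` and `Raid1Mirror` classes, their method names, and the in-memory block dictionary are hypothetical stand-ins, not part of any RAID implementation.

```python
class Medium:
    """Hypothetical in-memory stand-in for one data storage medium."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}
        self.failed = False

    def write(self, block_id, data):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        self.blocks[block_id] = data

    def read(self, block_id):
        if self.failed:
            raise IOError(f"{self.name} has failed")
        return self.blocks[block_id]


class Raid1Mirror:
    """Keeps identical copies on two media; the primary serves reads."""

    def __init__(self, primary, secondary):
        self.primary = primary
        self.secondary = secondary

    def write(self, block_id, data):
        # Every write goes to each working medium so the copies stay identical.
        written = 0
        for medium in (self.primary, self.secondary):
            if not medium.failed:
                medium.write(block_id, data)
                written += 1
        if written == 0:
            raise IOError("both media have failed")

    def read(self, block_id):
        # The primary answers requests; the secondary takes over if it fails.
        if not self.primary.failed:
            return self.primary.read(block_id)
        return self.secondary.read(block_id)
```

In this sketch, marking the primary as failed after a write still allows the data to be read back from the secondary, which is the takeover behavior the text describes. Rebuilding a replaced medium is deliberately left out.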
As is well known, storage media have data capacity limits. Accordingly, vast amounts of data typically must be stored on multiple disks or tapes, especially when lower quality, less expensive magnetic disks are used, as in RAID systems. Because many disks and tapes must be used, power consumption is typically high.
To reduce administrative overhead and improve reliability, techniques have been developed to predict failure of disk drive systems. One such technique is termed “S.M.A.R.T.” (Self-Monitoring, Analysis and Reporting Technology). In this technique, software on each disk drive monitors the disk drive for failure or potential failure. If a failure or potential failure is detected, the software on the disk drive raises a “red flag.” A host polls the disk drives (sends a “report status” command to the disk drives) on a regular basis to check the flags. If a flag indicates failure or imminent failure, the host sends an alarm to the end-user or system administrator. This allows the system administrator to schedule downtime for backup of data and/or replacement of the failing drive. See http://www.seagate.com/docs/pdf/whitepaper/enhanced smart.pdf.
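The host-side polling loop described above can be illustrated with a minimal sketch. The `Drive` class, its `failure_flag` field, the `report_status` method, and the `alarm` callback are all hypothetical names chosen for illustration; they do not correspond to an actual S.M.A.R.T. command set or driver API.

```python
from dataclasses import dataclass


@dataclass
class Drive:
    """Hypothetical drive whose on-board monitoring software sets a flag."""
    name: str
    failure_flag: bool = False  # the "red flag" raised by the drive itself

    def report_status(self):
        # Stand-in for the drive's answer to the host's "report status" command.
        return self.failure_flag


def poll_drives(drives, alarm):
    """Poll each drive once and raise an alarm for every flagged drive."""
    flagged = []
    for drive in drives:
        if drive.report_status():
            alarm(f"{drive.name}: failure or imminent failure reported")
            flagged.append(drive.name)
    return flagged
```

A host would presumably call `poll_drives` on a timer, with `alarm` bound to whatever notifies the end-user or system administrator; passing `print` or a list's `append` method is enough to try the sketch.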
Current solutions to storage medium failure include automatic swap and hot standby. Automatic swap is the substitution of a replacement unit for a defective one, where the substitution is performed automatically by the system while it continues to perform normal functions (possibly at a reduced rate of performance). Automatic swaps are functional rather than physical substitutions, and thus do not require human intervention. Ultimately, however, defective components must be replaced by the system administrator (by a cold, warm or hot swap).
A hot standby is a redundant component in a failure-tolerant storage subsystem that is powered and ready to operate, but which does not operate as long as its companion component is functioning. Hot standby components increase storage subsystem availability by allowing systems to continue to function when a component (such as a controller) fails. When the term hot standby is used to denote a disk drive, it specifically means a disk that is spinning and ready to be written to, for example, as the target of a rebuilding operation.
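As a rough illustration of how an automatic swap might draw on a hot standby, the following sketch functionally substitutes a standby drive for a failed one and records the defective drive for later physical replacement. The `Array` class and its attribute and method names are hypothetical, drives are represented by plain strings, and the rebuilding operation itself is not modeled.

```python
class Array:
    """Hypothetical storage array with active drives and hot standbys."""

    def __init__(self, active, standbys):
        self.active = list(active)      # drives currently serving data
        self.standbys = list(standbys)  # powered and spinning, but idle
        self.defective = []             # awaiting physical replacement

    def automatic_swap(self, failed_drive):
        """Functionally substitute a hot standby for a failed drive."""
        slot = self.active.index(failed_drive)  # ValueError if not active
        if not self.standbys:
            raise RuntimeError("no hot standby available")
        replacement = self.standbys.pop(0)
        self.active[slot] = replacement
        # The swap is functional only; the administrator must still
        # physically replace the defective drive later.
        self.defective.append(failed_drive)
        return replacement
```

The substitution needs no human intervention, but the `defective` list grows until an administrator replaces those drives, which mirrors the limitation noted above.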
It will be appreciated that an archiving system which consumes less power is desirable. Systems with reliable storage media and longer MTBFs are also desirable. Further, storage systems that use cheaper components while maintaining the reliability of their more expensive counterparts are also desirable. Storage systems which reduce administrative overhead are also desirable.