A digital data archival storage system stores large amounts of digital data on some type of media. Hard disk drive arrays are presently used for faster data ingest, easier management, and quick retrieval. Hard disk drives consume power when running and are subject to mechanical wear in the motors used for spinning the disks and for driving the read and write heads. In addition, the mechanical activity of the motors and the drives produces heat, and additional energy is used to cool the drives. Some disk drives can be operated at different spin speeds. Higher speeds provide faster read and write operations, while lower speeds use less power and produce less heat. Some disk drives also have a standby state in which the disk stops spinning but the hard disk drive controller is still powered to receive read and write requests. Other types of mass digital storage devices have different reliability and latency characteristics but are typically also affected by power cycling and low-power modes.
Power consumption can be minimized by simply shutting down the hard disk drives whenever they are not in use. However, this has two drawbacks. First, it delays access to the disks because the disks must spin up to full speed before being accessed, which can take as much as a minute. Second, turning the disk drives on and off increases the wear on each drive's mechanical system and makes a drive failure more likely. Reliability of the drives can be compromised if devices are put into a low-power or power-down state too often or for long periods of time.
A common power-management operation in storage systems is placing the storage devices into a low-power mode. Low-power mode can mean reducing spin speed and supply voltages, turning off specific components in the drive, or turning off an entire hard disk drive. Powering off components within a storage device or powering off the entire storage device can reduce the operational life of the device. In addition, leaving a storage device powered off for an extended period of time can also reduce the operational life and reliability of the device. As a consequence, storage device power management and reliability are at odds with one another.
One power management approach is to power down devices and to ensure reliability by periodically scrubbing each device individually. Spun-down disks are periodically activated and a small amount of data is read off each drive to check drive health. If the drive is deemed unhealthy, it can be proactively failed or it can be designated as not healthy enough for power management. Any read errors occurring during the scrub can be automatically fixed using intra-disk parity. This approach focuses on the health of individual drives; cross-disk data integrity checks are typically done offline.
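The per-drive scrub described above can be illustrated with a minimal Python sketch. It is a simulation only and not an implementation of any particular system: the `Drive` class, the `scrub` function, the `Health` states, and the error thresholds `warn_errors` and `fail_errors` are all hypothetical names and values chosen for illustration, and repair from intra-disk parity is represented by a stand-in method. Powering down a drive that has been failed is likewise an assumption of the sketch.

```python
from dataclasses import dataclass, field
from enum import Enum


class Health(Enum):
    HEALTHY = "healthy"                    # may continue to be spun down
    NO_POWER_MGMT = "no_power_management"  # kept powered on from now on
    FAILED = "failed"                      # proactively failed


@dataclass
class Drive:
    """Simulated drive; a real system would issue device commands instead."""
    name: str
    bad_blocks: set = field(default_factory=set)  # simulated unreadable blocks
    spun_up: bool = False
    health: Health = Health.HEALTHY

    def read_block(self, block: int) -> bool:
        """Return True if the block reads cleanly (simulated)."""
        return block not in self.bad_blocks

    def repair_block(self, block: int) -> None:
        """Stand-in for rebuilding a block from intra-disk parity."""
        self.bad_blocks.discard(block)


def scrub(drive: Drive, sample_blocks, warn_errors: int = 1, fail_errors: int = 3) -> Health:
    """Spin the drive up, read a small sample, repair errors, classify health.

    The thresholds are illustrative assumptions, not values from the text.
    """
    drive.spun_up = True
    errors = 0
    for block in sample_blocks:
        if not drive.read_block(block):
            errors += 1
            drive.repair_block(block)  # fix the read error in place

    if errors >= fail_errors:
        drive.health = Health.FAILED
    elif errors >= warn_errors:
        drive.health = Health.NO_POWER_MGMT
    else:
        drive.health = Health.HEALTHY

    # Healthy drives are spun back down; drives excluded from power
    # management stay on; failed drives are assumed to be powered down.
    drive.spun_up = drive.health is Health.NO_POWER_MGMT
    return drive.health


if __name__ == "__main__":
    for d in (Drive("d1"), Drive("d2", bad_blocks={2}), Drive("d3", bad_blocks={0, 1, 2})):
        print(d.name, scrub(d, range(8)).value)
```

Note that the sketch examines each drive in isolation, which mirrors the limitation stated above: nothing in it checks data integrity across drives.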