Many computer-related systems now include redundant components for high reliability and availability. For example, in a RAID storage system, an enclosure includes an array of hard disk drives (HDDs) which are each coupled through independent ports to both of a pair of redundant disk array switches. One of a pair of redundant sub-processors may be coupled to one of the switches while the other of the pair of sub-processors may be coupled to the other switch. Alternatively, a single sub-processor may be coupled to both switches and logically partitioned into two images, each logically coupled to one of the switches. The sub-processors handle power notification and other enclosure management functions. Each switch may also be coupled through a fabric or network to both of a pair of redundant RAID adapters external to the enclosure; the adapters communicate over the fabric with the sub-processors through ports in the switches. The system may include additional enclosures, each coupled in daisy-chain fashion in the network to the disk array switches of the previous enclosure.
Conventionally, numerous disk-related tasks are performed by the RAID adapter(s). In order to access a drive, the RAID adapter initiates an “open” to the drive. Whatever switches are located between the adapter and the drive are configured to dedicate the ports necessary to connect the drive to the adapter. If the selected drive is at the “bottom” of a series of disk enclosures, the switches in all of the enclosures above it must participate in establishing the path from the adapter to the drive. Moreover, while access is being made over the path, the participating ports cannot be used to transmit data to or from other drives.
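The port reservation described above may be illustrated with a minimal sketch. The names `Switch`, `open_path` and `close_path`, and the port labels "up", "down" and "driveN", are hypothetical and are used only for illustration; the sketch assumes a single daisy chain of enclosures numbered from 0 at the adapter end, with one switch modeled per enclosure.

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Hypothetical model of one disk array switch in one enclosure."""
    enclosure: int
    busy_ports: set = field(default_factory=set)


def open_path(switches, enclosure, drive):
    """Reserve every port needed to reach `drive` in `enclosure`.

    Assumes switches[i].enclosure == i. Each switch above the target
    reserves its upstream and downstream ports; the target switch reserves
    its upstream port and the port of the selected drive.
    """
    needed = []
    for sw in switches[: enclosure + 1]:
        needed.append((sw.enclosure, "up"))
        if sw.enclosure < enclosure:
            needed.append((sw.enclosure, "down"))
        else:
            needed.append((sw.enclosure, f"drive{drive}"))
    # Refuse the open if any required port already belongs to another path.
    for enc, port in needed:
        if port in switches[enc].busy_ports:
            raise RuntimeError(f"port {port} in enclosure {enc} is busy")
    for enc, port in needed:
        switches[enc].busy_ports.add(port)
    return needed


def close_path(switches, reserved):
    """Release the ports reserved by a previous open_path call."""
    for enc, port in reserved:
        switches[enc].busy_ports.discard(port)
```

In this model, an open to a drive in the third enclosure reserves six ports across three switches, and a second open to a drive in the second enclosure is refused until the first path is closed.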
For example, when disk drive firmware is to be updated, the RAID adapter opens a path to a drive, takes the drive offline, downloads the firmware to update the drive, then brings the drive back online. The procedure is performed for each drive and requires substantial adapter and fabric resources. Furthermore, because none of the ports that were configured to establish the path from the adapter to the device being updated can be used to access other devices, the procedure is disruptive to the system as a whole.
By way of another example, “data scrubbing” is performed to examine a disk for latent media errors. If data has been written to a defective area of the disk and is not read very often, it may be a long time before the error is detected, possibly resulting in data loss. By systematically accessing the sectors of each disk, such defects may be identified and corrective action taken before an actual drive failure. However, this procedure also requires substantial adapter and fabric resources and tends to be disruptive to the system as a whole.
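The systematic access described above may be sketched as a simple loop. The function `scrub` and the simulated drive `make_simulated_drive` are hypothetical and stand in for whatever read interface a real drive exposes; the sketch assumes that a read of a defective sector fails with an `OSError`.

```python
def scrub(read_sector, sector_count):
    """Read every sector once and return the numbers of those that fail."""
    bad = []
    for lba in range(sector_count):
        try:
            read_sector(lba)
        except OSError:
            # A latent media error surfaces here rather than at some
            # later time when the data is actually needed.
            bad.append(lba)
    return bad


def make_simulated_drive(sector_count, defective):
    """Return a read function for a simulated drive (illustration only)."""
    defective = set(defective)

    def read_sector(lba):
        if not 0 <= lba < sector_count:
            raise ValueError("sector number out of range")
        if lba in defective:
            raise OSError(f"media error at sector {lba}")
        return bytes(512)

    return read_sector
```

Each sector reported by such a loop is a candidate for corrective action, for example rebuilding its contents from the redundancy in the array.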
In a third example, the accidental release of sensitive data is a growing concern in many industries. Such data may include classified documents, industrial secrets and financial records, to name a few. When hard drives on which such data has been stored are taken out of service, the data may still be stored on the disk, even if it has been “erased”. Moreover, performing a low level reformat is not sufficient to ensure that the data has been irrevocably destroyed, since it may still be possible to recover the data from residual magnetism on the disk. Consequently, there are a number of protocols which may be implemented to ensure the destruction of data. In addition to physically destroying the disk by means of grinding or degaussing, algorithms have been developed prescribing multiple overwrites of specified data patterns in defined sequences to every location on the disk. Some methods require verification that the data stored in each location of the disk is the pattern last written. Strict compliance with the procedures and validation of the method are required for the data to be considered securely erased. Typically, utilities are written for operating systems to perform the task. When such a procedure is implemented in a RAID array, it requires the use of substantial adapter and system resources.
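A multiple-overwrite procedure with final verification may be sketched as follows. The function `secure_erase` is hypothetical, the disk is simulated by an in-memory `bytearray`, and the pattern sequence is an arbitrary illustration that does not correspond to any particular published erasure protocol.

```python
def secure_erase(disk, sector_size, patterns):
    """Overwrite every sector with each pattern in turn, then verify.

    `disk` is a bytearray standing in for the drive, and `patterns` is a
    non-empty sequence of byte values (0-255) written in the given order.
    Returns True only if every sector holds the last pattern written.
    """
    if not patterns:
        raise ValueError("at least one pattern is required")
    if sector_size <= 0 or len(disk) % sector_size:
        raise ValueError("disk size must be a multiple of the sector size")

    offsets = range(0, len(disk), sector_size)
    for pattern in patterns:
        fill = bytes([pattern]) * sector_size
        for offset in offsets:
            disk[offset:offset + sector_size] = fill

    # Verification pass: confirm the final pattern is present everywhere.
    last = bytes([patterns[-1]]) * sector_size
    return all(disk[o:o + sector_size] == last for o in offsets)
```

Even in this simplified form, every pass touches every location on the disk, which indicates why the procedure is costly when each write must travel from the adapter through the fabric to the drive.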
Consequently, a need remains to be able to perform various disk-related maintenance operations without burdening the RAID adapters and without disrupting access to the rest of the disk array or to the network.