The need to store digital files, documents, pictures, images and other data continues to increase rapidly. In connection with the electronic storage of data, systems incorporating more than one storage device have been devised. In general, using a number of storage devices in a coordinated fashion in order to store data can increase the total storage volume of the system. In addition, data can be distributed across the multiple storage devices such that data will not be irretrievably lost if one of the storage devices (or in some cases more than one storage device) fails. An additional advantage that can be achieved by coordinating the operation of a number of individual storage devices is improved data access and/or storage times. Examples of systems that can provide such advantages can be found in the various RAID (redundant array of independent disks) levels that have been developed.
RAID systems have become the predominant form of mass storage in computer systems used for applications that require high performance, large amounts of storage, and/or high data availability, such as transaction processing, banking, medical applications, database servers, internet servers, mail servers, scientific computing, and a host of other applications. A RAID controller controls a group of multiple physical storage devices in such a manner as to present a single logical storage device (or multiple logical storage devices) to a computer operating system. RAID controllers employ the techniques of data striping and data redundancy to increase performance and data availability.
Not all RAID levels provide data redundancy, however. For example, a RAID 0 array uses a striping technique to store data stripe-wise across multiple storage devices, but does not provide a copy of the data stored elsewhere on storage devices of the array that can be used to reconstruct data if a storage device fails. RAID levels that provide redundancy are divided into two categories: those that are parity-based, and those that are mirror-based. Parity-based RAID levels calculate parity from data that is written to the RAID array, and store it on a different storage device than the storage devices used to store the data itself. Parity-based RAID levels include RAID levels 3, 4, 5, 6, 30, 40, and 50. Mirror-based RAID levels store a copy of data written to the RAID array to a different storage device from the device used to store the data itself. Mirror-based RAID levels include RAID levels 1 and 10.
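By way of illustration only, the following minimal sketch shows how redundancy might be generated for a parity-based array. It assumes simple byte-wise XOR parity, as commonly used for RAID levels such as RAID 5; the helper name `xor_blocks` and the sample block contents are hypothetical and are not taken from any particular controller.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks of one stripe, each destined for a different storage device.
data_blocks = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]

# Parity-based redundancy: the parity block is stored on a further device.
parity = xor_blocks(data_blocks)

# Mirror-based redundancy: an identical copy is stored on a different device.
mirror_copy = bytes(data_blocks[0])

# XOR over the data blocks plus parity is all zeros for a consistent stripe.
stripe_check = xor_blocks(data_blocks + [parity])
```

In this sketch the parity block costs one device's worth of capacity per stripe, whereas the mirror copy doubles the capacity used for the mirrored block.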
According to RAID level 1, data stored in a primary storage device is mirrored to a secondary storage device. Therefore, RAID level 1 requires at least two storage devices to implement. Furthermore, if more than two storage devices are desired, additional storage devices are added in pairs. That is, RAID level 1 requires an even number of storage devices. During normal operation, write operations result in a primary copy of data being written to the primary storage device and a mirrored copy being written to the secondary storage device, and read operations are made with respect to the copy of data on either the primary or secondary storage device. If one storage device within a RAID level 1 array fails, data stored on that storage device can be rebuilt onto a replacement storage device by copying the data stored on the failed storage device's companion storage device to the replacement storage device. Another example of a mirror-based RAID level is RAID level 10. RAID level 10 mirrors a striped set of storage devices, and requires a minimum of four storage devices to implement. Data is striped across multiple storage devices, which improves I/O performance for RAID 10 compared with RAID 1.
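The RAID level 1 behavior described above can be sketched as follows. This is a simplified model under stated assumptions: each storage device is represented by an in-memory dictionary mapping block numbers to data, and the class name `MirrorPair` and its methods are hypothetical.

```python
class MirrorPair:
    """Hypothetical model of a two-device RAID level 1 array."""

    def __init__(self):
        # Index 0 is the primary device, index 1 the secondary device.
        self.devices = [{}, {}]

    def write(self, block_number, data):
        # A write places a primary copy and a mirrored copy on the two devices.
        for device in self.devices:
            device[block_number] = data

    def read(self, block_number, device_index=0):
        # A read may be served from either the primary or the secondary copy.
        return self.devices[device_index][block_number]

    def rebuild(self, failed_index):
        # Rebuild a replacement device by copying from the companion device.
        companion = self.devices[1 - failed_index]
        self.devices[failed_index] = dict(companion)


array = MirrorPair()
array.write(0, b"hello")
array.write(1, b"world")

array.devices[1] = {}          # secondary fails and is swapped for an empty device
array.rebuild(failed_index=1)  # contents are copied back from the primary
```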
Other RAID levels combine data storage with parity, which is either stored on a dedicated parity storage device or distributed among the data storage devices. Examples of such arrangements include RAID levels 3, 4, 5, 6, 30, 40, and 50. Although such arrangements provide for fault tolerance, and can provide somewhat improved I/O performance, they all require at least three storage devices to implement, and require fairly complex controller and parity generation circuitry or software. All of the parity-based RAID levels can tolerate a single storage device failure, and RAID 6 can tolerate up to two simultaneous storage device failures.
RAID subsystems commonly employ spare storage devices. Spare storage devices are able to replace storage devices identified by the RAID controller, software, or system administrator as failed or failing storage devices. Rebuild of data from a failed or failing storage device to an available spare storage device may occur as directed by a system administrator, or as a result of an automated rebuild process within the RAID controller or software.
In computer terminology, a check condition occurs when a SCSI device needs to report an error. SCSI communication takes place between an initiator and a target. The initiator sends a command to the target which then responds. SCSI commands are sent in a Command Descriptor Block (CDB). At the end of the command the target returns a status code byte which is usually 00h for success, 02h for a check condition (error), or 08h for busy. When the target returns a check condition in response to a command, the initiator usually then issues a SCSI request sense command in order to obtain more information. During the time between the reporting of a check condition and the issuing of a request sense command, the target is in a special state called contingent allegiance.
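The exchange described above can be sketched as follows. The status values 00h, 02h, and 08h are those given in the text; the `FakeTarget` class, the `initiator_read` function, the 512-byte block size, and the dictionary used for sense data are hypothetical simplifications and do not reflect the actual wire format of a CDB or of sense data. The sense key value 3h is the standard SCSI code for a medium error.

```python
from enum import IntEnum

class ScsiStatus(IntEnum):
    GOOD = 0x00
    CHECK_CONDITION = 0x02
    BUSY = 0x08

class FakeTarget:
    """Hypothetical in-memory target that fails reads of selected blocks."""

    def __init__(self, bad_blocks):
        self.bad_blocks = set(bad_blocks)
        self.pending_sense = None  # held while in contingent allegiance

    def read(self, block_number):
        if block_number in self.bad_blocks:
            # Sense key 0x3 denotes MEDIUM ERROR.
            self.pending_sense = {"sense_key": 0x3, "block": block_number}
            return ScsiStatus.CHECK_CONDITION, None
        return ScsiStatus.GOOD, b"\x00" * 512

    def request_sense(self):
        # Returning the sense data ends the contingent allegiance state.
        sense, self.pending_sense = self.pending_sense, None
        return sense

def initiator_read(target, block_number):
    """Issue a read and follow a check condition with a request sense."""
    status, data = target.read(block_number)
    if status == ScsiStatus.CHECK_CONDITION:
        return status, target.request_sense()
    return status, data
```

For example, `initiator_read(FakeTarget([7]), 7)` would return a check condition status together with sense data indicating a medium error at block 7.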
In most cases, a storage device will detect and correct internal media errors via Error Correction Codes (ECC) and various retry mechanisms. When the storage device is unable to correct the data, it will post a check condition in final status. The controller will then issue a request sense command to the storage device and process the sense data. If the sense data indicates a media error, the controller can correct the bad data using RAID parity data for a parity-based array and RAID mirror data for a mirror-based array. Data is read from the good storage devices (the storage devices not reporting the media error), data is generated corresponding to the data on the storage device reporting the media error, and data is written to an available spare storage device, which then replaces the storage device with the media error in the redundant array.
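The regeneration step described above can be sketched as follows, assuming simple byte-wise XOR parity for the parity-based case. The function names and sample data are hypothetical; an actual controller would operate on full stripes read from the surviving storage devices.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def regenerate_from_parity(stripe, failed_index):
    """Regenerate one block of a stripe from the remaining data and parity."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)

def regenerate_from_mirror(mirror_copy):
    """For a mirror-based array, the companion copy is the regenerated data."""
    return bytes(mirror_copy)

# Stripe of three data blocks plus one parity block, one block per device.
data = [b"AB", b"CD", b"EF"]
stripe = data + [xor_blocks(data)]

# Device 1 reports a media error; its block is regenerated from the others
# and would then be written to the spare storage device.
rebuilt = regenerate_from_parity(stripe, failed_index=1)
```

Note that this approach depends on the failing storage device reporting the error, since the controller must know which block to exclude from the calculation.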
Although redundant RAID arrays protect against single storage device failures quite well, there are other classes of problems where storage devices do not detect and report an error, but instead return data that is different from the data that was previously written to the storage device at the location now being read. Occasionally, a storage device will fail in such a manner that it is unable to detect that it is returning corrupted data to the controller. This may be the result of a storage device not writing properly to media within the storage device, or of a storage device storing the data properly but changing the data in some fashion between reading the data from the media and transmitting the data to the controller. For corrupted reads, the observed failure mode has been dropped bits. The failure is transient; that is, given multiple reads of the same block(s), there may be some good reads, and even subsequent bad reads may have dropped different bits from previous bad reads. Typically, just one bit is dropped in a stream of data, whether that stream is a single block or multiple blocks. Generally, there are no other indicators that provide possible identification of the bad storage device.
Without an error indication from the storage device, the controller in turn passes this bad data to the requesting host computer. This may result in a host computer software crash, bad data being used by a host computer, or bad data being passed to client computers. It is therefore advantageous to find and replace any storage devices that exhibit this type of failure at the storage controller level, before the corrupted data reaches a host computer.
In the context of a RAID array employing multiple storage devices per logical storage device, the challenge is in identifying which of a group of storage devices is the storage device that is corrupting data. It is presumed that a single storage device may be replaced in a parity-based or mirror-based RAID array without data loss, or up to two storage devices in a RAID 6 array. Therefore, what is needed is a method to detect unreported data corruption, and automatically identify the storage device(s) causing such unreported corruption.
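The identification challenge can be illustrated with the following sketch, which is not a solution to the problem but a demonstration of why a simple consistency check is insufficient. It assumes byte-wise XOR parity and models a dropped bit as a set bit that reads back as cleared; all names and data values are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks (assumed parity function)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def drop_bit(block, byte_index, bit):
    """Return a copy of the block with one bit cleared (a dropped bit)."""
    corrupted = bytearray(block)
    corrupted[byte_index] &= ~(1 << bit) & 0xFF
    return bytes(corrupted)

def stripe_is_consistent(stripe):
    """A stripe is consistent when data XOR parity is all zeros."""
    return not any(xor_blocks(stripe))

data = [b"\xff\x00", b"\xff\x0f", b"\xff\xf0"]
stripe = data + [xor_blocks(data)]

# The same bit is silently dropped by device 0 in one read, device 2 in another.
read_a = [drop_bit(stripe[0], 0, 3)] + stripe[1:]
read_b = stripe[:2] + [drop_bit(stripe[2], 0, 3)] + stripe[3:]

residue_a = xor_blocks(read_a)
residue_b = xor_blocks(read_b)
```

Both reads fail the consistency check, yet `residue_a` and `residue_b` are identical, so the check reveals that corruption occurred and at which bit position, but not which storage device returned the corrupted data.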