1. Field of the Invention
The invention relates generally to storage systems and more specifically relates to methods and structure in a storage device for assuring that stale data on a storage device that is write-protected cannot be read by a controller in a RAID storage system.
2. Discussion of Related Art
In RAID (Redundant Array of Independent Drives) storage systems, logical volumes are defined and managed. A logical volume comprises portions of two or more disk drives and stores redundancy information along with user data provided by attached host systems. The redundancy information is generated and stored in conjunction with user data such that failure of a single disk drive does not preclude ongoing access to the data on the logical volume. Rather, the logical volume may continue operation until such time as the failed drive is replaced and the full redundancy features are restored (i.e., a “rebuild” operation onto a “hot swap” disk drive).
Various “levels” of RAID storage management are standardized in the storage industry. In RAID level 1 management, user data on one storage device is mirrored on a second storage device. In RAID level 5, user data and redundancy information are distributed (“striped”) over a plurality of storage devices (at least 3 devices). In RAID level 5, the redundancy information is the exclusive-OR (“XOR”) sum of the user data. Thus, a “stripe” of data in RAID level 5 management includes a plurality of blocks of user data and a corresponding parity block that is the XOR sum of the user data in the related blocks of the stripe. In like manner, RAID level 6 defines a stripe of data as a plurality of blocks of user data and two blocks of redundancy information: typically a first parity block that is the XOR sum of the other blocks and a second block that may also be XOR parity, Galois field accumulation (also referred to as Galois field multiplication or “GFM”), or other redundancy computations. RAID level 6 can therefore keep a logical volume operating even in the event of two disk drive failures of the logical volume.
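By way of illustration only, the XOR parity relationship of a RAID level 5 stripe may be sketched as follows. The helper name `xor_blocks` and the three-byte block contents are hypothetical and chosen solely for this example; actual stripes use much larger blocks.

```python
from functools import reduce

def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical user data blocks of one stripe.
data_blocks = [b"\x0f\xf0\xaa", b"\x33\x55\x01", b"\xc3\x00\xff"]

# The parity block is the XOR sum of the user data blocks.
parity = xor_blocks(data_blocks)

# If the device holding data_blocks[1] fails, XOR of the surviving
# data blocks and the parity block regenerates the missing block.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, the same routine serves both to generate parity and to regenerate any single missing block of the stripe.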
Modern storage devices (e.g., rotating disk drives) sense present failures of the device and may include logic to sense a potential impending failure before the device actually fails. As used herein, “failing” device and “failed” device refer to such a storage device that has either sensed a current failure or has sensed a possible impending failure. In such a state, in some embodiments, the storage device may remain able to read significant portions of stored data for a period of time (if not all stored data) before the device ultimately fails. For example, a common failure mode of multi-head, rotating disk drives is the total failure of one of the multiple read/write heads. If, for example, a drive has 4 heads and a single head has failed, then 75% of the data is still accessible through the remaining three good heads. In this scenario, a failing device may be programmed/configured into a special recovery mode in which any data that is accessible without error can be read but no data can be written (i.e., the storage device is, in effect, in a write-protected mode). Using the recovery mode, much if not all data on the failing device may be copied to a substitute device before the device fails further (or actually fails in devices that may sense impending failure). This copying to a replacement (hot swap) device may dramatically reduce the time required to restore a volume to normal operation as compared to a full rebuild process because only those portions that could not be read from the failing device need to be rebuilt using the redundancy information.
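The recovery-mode copy described above may be sketched as follows, under stated assumptions. The class `FailingDrive`, the mapping of a block to a head by `lba % heads`, and the function `copy_readable` are all hypothetical and serve only to illustrate the idea; real devices map blocks to heads differently and report read errors through their own interfaces.

```python
class ReadError(Exception):
    """Raised when a block cannot be read from the failing device."""

class FailingDrive:
    """Toy model of a multi-head drive in a read-only recovery mode."""

    def __init__(self, blocks, heads=4, failed_head=2):
        self._blocks = list(blocks)
        self.heads = heads
        self.failed_head = failed_head

    def read(self, lba):
        # Assumption: block `lba` is served by head `lba % heads`.
        if lba % self.heads == self.failed_head:
            raise ReadError(lba)
        return self._blocks[lba]

    def write(self, lba, data):
        # In recovery mode the device is, in effect, write-protected.
        raise PermissionError("device is write-protected in recovery mode")

def copy_readable(drive, num_blocks):
    """Copy every readable block; return the copy and the LBAs left to rebuild."""
    spare = {}
    needs_rebuild = []
    for lba in range(num_blocks):
        try:
            spare[lba] = drive.read(lba)
        except ReadError:
            needs_rebuild.append(lba)
    return spare, needs_rebuild

drive = FailingDrive([bytes([i]) for i in range(8)])
spare, needs_rebuild = copy_readable(drive, 8)
```

In this toy model, six of the eight blocks (75%) are copied directly, and only the blocks listed in `needs_rebuild` would have to be regenerated from the redundancy information of the logical volume.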
Some storage device manufacturers have proposed and implemented related features in storage devices in what is sometimes referred to as a “rebuild assist mode”. In this mode, the failing device is operable to reconfigure itself to write-protect its data and to quickly detect a likely failed block on read requests (i.e., without extensive retries of the potential read failure). The rapid read failure detection helps reduce the latency overhead incurred when, in normal read operation, the storage device internally retries read failures of a block. Using the rapid read failure detection, a copy process may more rapidly copy all readable data from the failing device onto a hot swap spare storage device. In this rebuild assist mode, the failing device processes read requests from attached systems essentially normally (but with minimal retry logic employed) for all readable data blocks to be copied to the hot swap replacement storage device. The storage device is further operated in this rebuild assist mode such that the data is write protected. The logical volume that comprises the failing device may continue to operate essentially normally, processing read requests and copying data to the hot swap device in the background, until a write request is attempted. A write request directed to the logical volume will fail with respect to the data to be written to the failing device (since it is write protected in the rebuild assist mode) but, in some embodiments, may succeed with respect to other blocks of the stripe written to other storage devices of the logical volume.
When the write operation to the failing device in recovery mode fails, the data presently stored at the logical block address that failed on the failing device may still be readable but the data is not up to date with respect to other data of the related RAID stripe (or with respect to the mirrored data in the case of RAID level 1 management). Such data is typically referred to as “stale” data. In present devices that use such a recovery mode to read data from a failing device, a write failure may force a complete rebuild of the hot swap device to avoid the use of stale data rather than rely on the faster recovery mode to read all data from the device sensing failure. A full rebuild can be extremely time consuming in present-day, high-capacity storage devices. Further, it is impractical for a RAID controller coupled with a failed storage device to maintain a log of writes attempted to the failed device, especially during the extensive time required to rebuild a hot spare device to replace the failed device. A RAID storage controller may manage many such devices. The memory required on the RAID controller to log attempted writes to a failed storage device for an extended period of time could be very large even for a single failed device, let alone multiple failed devices of the tens or hundreds of devices potentially managed by a high-end RAID storage controller.
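The manner in which stale data arises may be illustrated with a minimal sketch of a RAID level 1 mirror. The `Drive` class, the block address 7, and the payloads are hypothetical and do not represent any actual device interface.

```python
class Drive:
    """Toy drive that rejects writes once it is write-protected."""

    def __init__(self):
        self.blocks = {}
        self.write_protected = False

    def write(self, lba, data):
        if self.write_protected:
            return False  # write fails; previously stored data is untouched
        self.blocks[lba] = data
        return True

    def read(self, lba):
        return self.blocks[lba]

healthy, failing = Drive(), Drive()

# Both mirrors initially hold the same data at block 7.
for d in (healthy, failing):
    d.write(7, b"old")

# The failing drive enters recovery mode and becomes write-protected.
failing.write_protected = True

# A later mirrored write succeeds on one device and fails on the other.
results = [d.write(7, b"new") for d in (healthy, failing)]

# Block 7 is still readable on the failing drive, but it is now stale.
stale = failing.read(7) != healthy.read(7)
```

The read from the failing drive completes without error, which is why a copy process that relies only on read success cannot by itself distinguish stale blocks from current ones.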
Thus it is an ongoing challenge to efficiently manage sensing of failure of the storage device and resultant processing to build an up to date spare storage device to replace the failing device.