1. Field of the Invention
This invention relates to restoring changed data onto a storage device and more particularly relates to restoring changed data onto a reactivated storage device in a redundant array of independent disks (“RAID”) system.
2. Description of the Related Art
In a contemporary computing environment, a storage system frequently writes data to and reads data from one or more storage devices through a storage controller. The storage devices are typically hard disk drives, optical disks, solid state disks, magnetic tape drives, DVD disks, CD ROM disks, or the like. Such storage devices are referred to hereinafter as disks.
One common storage system is a RAID system. In the RAID system, the disks coupled to the storage controller are configured to form a non-redundant or redundant RAID array. One common type of RAID configuration is a striped array. Striping is a method of concatenating multiple disks into one logical drive. Striping involves partitioning each array member disk's storage space into stripes. Each stripe is a number of consecutively addressed data blocks. The stripes are then interleaved across all member disks in the array in a regular rotating pattern, so that the combined space of the logical drive is composed of ordered groups of stripes. Each stripe group includes one stripe from each member disk at the same relative address. The stripes in a stripe group are associated with each other in a way that allows membership in the group to be determined uniquely and unambiguously by the storage controller.
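By way of illustration only, the interleaving described above can be sketched as a mapping from a logical stripe number to a member disk and a stripe group. The function name `locate_stripe` and the simple row-by-row rotation are assumptions made for this sketch; an actual storage controller may rotate the stripes differently.

```python
def locate_stripe(logical_stripe, num_disks):
    """Map a logical stripe number to (member disk index, stripe group).

    Assumes logical stripes are laid out across the member disks in
    order, one stripe group at a time (an illustrative layout only).
    """
    if num_disks < 1:
        raise ValueError("an array needs at least one member disk")
    if logical_stripe < 0:
        raise ValueError("logical stripe numbers start at 0")
    # The quotient selects the stripe group, the remainder the member disk.
    group, disk = divmod(logical_stripe, num_disks)
    return disk, group


def stripe_group_members(group, num_disks):
    """List the logical stripes that make up one stripe group."""
    return [group * num_disks + disk for disk in range(num_disks)]
```

Under these assumptions, with four member disks, `locate_stripe(0x13, 4)` returns `(3, 4)`: the last of twenty logical stripes falls on the fourth member disk (index 3) in stripe group 4.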
FIGS. 1a, 1b and 1c are schematic block diagrams illustrating one embodiment of RAID arrays 100. As depicted, each member disk 110 in the RAID array 100 comprises five stripes. In FIGS. 1a and 1b, the RAID arrays 100a and 100b include four member disks: member disk 1 110a, member disk 2 110b, member disk 3 110c, and member disk 4 110d. Each RAID array 100a, 100b comprises twenty (20) stripes arranged in five stripe groups consecutively numbered 0 through 4. Each such stripe group includes one stripe from each of the four member disks 110a, 110b, 110c, and 110d in corresponding locations.
FIG. 1a shows a configuration of a non-redundant RAID array 100a resulting in a logical drive 160a containing twenty (20) consecutively addressed data stripes configured as user data and numbered 0x, 1x, . . . , 12x, and 13x in a hexadecimal representation. In FIG. 1b the RAID array 100b is a redundant RAID array, known as a parity RAID array, which holds, in addition to user data, check data, commonly referred to as parity and numbered P0, P1, P2, P3, and P4. The check data is distributed throughout the array and occupies one parity stripe per stripe group. The remaining stripes in the array are data stripes. As shown, the configured logical drive 160b has fifteen (15) consecutively addressed data stripes numbered 0x, 1x, . . . , Dx, and Ex in a hexadecimal representation. Check data in each stripe group is used to regenerate user data for a failed member disk when requested by a host.
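The text above does not state how the check data is computed. Assuming the common byte-wise exclusive-OR (XOR) parity, a minimal sketch of generating the parity stripe for one stripe group and regenerating a missing data stripe could look as follows. The names `xor_blocks`, `make_parity`, and `regenerate` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_stripes):
    """Compute the parity stripe for the data stripes of one stripe group."""
    return xor_blocks(data_stripes)


def regenerate(surviving_stripes):
    """Regenerate the one missing stripe of a stripe group.

    `surviving_stripes` holds every other stripe of the group,
    including the parity stripe. XOR-ing them yields the missing one.
    """
    return xor_blocks(surviving_stripes)
```

Because XOR is its own inverse, the same routine serves both purposes: combining the data stripes produces the parity stripe, and combining the parity stripe with the surviving data stripes reproduces the missing data stripe.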
FIG. 1c shows another type of redundant RAID array, a mirrored RAID array 100c, comprising member disk 1 110a and member disk 2 110b. During a write operation, the storage controller writes the same user data simultaneously on both member disks 110a and 110b in the mirrored RAID array 100c. As illustrated, the configured logical drive 160c includes five consecutively addressed data stripes numbered 0, 1, 2, 3 and 4. For a read operation, data may be read from either member disk 110a, 110b although the storage controller generally designates one member disk 110 as the master and the other as the backup.
Normally, for a logical drive 160 read or write request, the storage controller maps the specified logical drive 160 data block address to a stripe of a particular RAID member disk 110, accesses the data block, and performs the required operation on the mapped disk. Some requests may involve multiple stripes on separate member disks 110 in the stripe group, and as such, the storage controller may operate the involved member disks 110 independently in parallel. During any such operation, a disk error condition may result in a failure of one member disk 110 in the RAID array 100 to respond to the storage controller's attempt to initiate a certain action, such as a disk selection, a command transfer, a control transfer, or a data transfer. The error condition may be persistent despite a pre-specified number of retries at various operation levels including a soft device reset by the storage controller.
A disk error condition may also manifest itself as a failure to continue or complete an operation that has been started. In any case, the storage controller will designate a persistently faulty member disk 110 as offline. Conventionally, such a “dead” disk is sent back to the manufacturer for repair. In some cases in which an operable member disk is removed for a certain service action, the storage controller may also mark the absent member disk 110 offline.
If the offline member disk 110 is a member of a non-redundant RAID array 100a, for example, the member disk 110b shown in FIG. 1a, the associated logical drive 160a will be designated as offline, making data inaccessible. In such a case, generally a user-initiated data restoration will have to occur before the associated logical drive 160a is brought back into operation. If the designated offline disk 110 is a member of a parity RAID array 100b such as the member disk 110b shown in FIG. 1b, data associated with the offline member disk 110b is still accessible. The storage controller can regenerate data for the offline member disk 110b based on the contents of all the surviving member disks 110a, 110c, and 110d when a request for such data occurs. With a mirrored RAID array 100c such as that shown in FIG. 1c, data is available from a surviving disk 110a if another member disk 110b is offline. On the other hand, with either type of the redundant RAID array 100b, 100c, any user data that is destined for the offline member disk 110 on a write request is not written there although associated check data, if any, is updated on a surviving member disk 110.
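The read and write behavior described above for a parity RAID array with one offline member disk can be sketched for a single stripe group. This is an illustrative model only: it assumes XOR parity, places the parity stripe on the last member for simplicity, and uses a hypothetical `StripeGroup` class that does not appear in the source.

```python
def xor(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


class StripeGroup:
    """One stripe group; the last member holds the parity (assumption)."""

    def __init__(self, data_blocks):
        self.disks = list(data_blocks) + [xor(data_blocks)]
        self.parity = len(self.disks) - 1
        self.offline = None  # index of the offline member, if any

    def read(self, disk):
        if disk == self.offline:
            # Regenerate from every surviving member, parity included.
            return xor([b for i, b in enumerate(self.disks) if i != disk])
        return self.disks[disk]

    def write(self, disk, data):
        if disk == self.parity:
            raise ValueError("hosts write data stripes, not parity")
        if disk == self.offline:
            # Data is not written to the offline disk; only parity changes.
            others = [b for i, b in enumerate(self.disks)
                      if i not in (disk, self.parity)]
            self.disks[self.parity] = xor(others + [data])
            return
        old = self.disks[disk]
        self.disks[disk] = data
        if self.parity != self.offline:
            # Read-modify-write: fold the change into the parity stripe.
            self.disks[self.parity] = xor([self.disks[self.parity], old, data])
```

In this model a write aimed at the offline member leaves that member's stale contents untouched, yet a later read of the same stripe returns the new data, because the updated parity stripe carries enough information to regenerate it.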
Although a redundant RAID array 100b, 100c can continue to operate with one member disk 110 marked offline, the array 100b, 100c actually enters a degraded mode of operation, and the formed logical drive 160b, 160c, such as that shown in FIG. 1b and FIG. 1c, respectively, is said to be in a “degraded state” until the underlying faulty member disk 110 is replaced and all data lost through the departure of the faulty member disk 110 from the array 100b, 100c is reconstructed on the new disk. The latter process is known as rebuilding. A RAID array 100b, 100c running in a degraded mode suffers degraded performance and cannot tolerate any subsequent disk failure.
If a redundant RAID array 100b, 100c is configured with a hot standby disk, when one member disk 110 is marked offline, typically a process known as full rebuilding for the offline member disk 110 is automatically started on the hot standby disk in the background. A full rebuilding for a mirrored RAID array 100c or a parity RAID array 100b involves regenerating and writing onto the replacement disk all of the data lost from the offline member disk 110, with the replacement data including any check data being derived from all the surviving member disk(s) 110. A full rebuilding is typically time consuming and can last up to several hours for a large RAID array.
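A full rebuilding, as described above, visits every stripe group regardless of whether its contents changed. The following sketch assumes XOR parity and models each disk as a list of byte strings, one per stripe group; the name `full_rebuild` is hypothetical. With a single surviving disk, as in a mirrored array, the XOR of one stripe is the stripe itself, so the same routine reduces to a plain copy.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def full_rebuild(survivors):
    """Regenerate every stripe of the lost member disk.

    `survivors` is a list of surviving disks, each a list holding one
    stripe (bytes) per stripe group. Every stripe group is processed.
    """
    stripe_groups = len(survivors[0])
    return [xor_blocks([disk[group] for disk in survivors])
            for group in range(stripe_groups)]
```

The work done is proportional to the total number of stripe groups, which is why a full rebuilding of a large array is slow even when only a few stripes were written while the member disk was offline.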
Unfortunately, many users do not purchase a spare disk 110 for each such RAID array 100 as a hot standby replacement, knowing that the spare is seldom used, that is, only during the period of a disk failure. If a redundant RAID array 100b, 100c is pre-configured with no hot standby disk, a hot swap disk, if available, inserted manually in place of the offline member disk 110 can be caused to undergo a similar full rebuilding automatically or manually.
Hard disk drive manufacturers, for example, receiving the aforementioned dead hard disk drives for repair often find them quite operable following a power cycle and/or a special hard reset cycle that clears the “fatal” error condition. With available advanced disk technology and array packaging technology, the storage controller can attempt to reactivate the offline member disk 110 so as to make the disk 110 online by means of special hard device reset protocols and/or an automated selective power cycle on the offline member disk 110 if the array enclosure is equipped with the latter capability. The success rate of thus bringing dead disks back to life is presently high enough to justify implementing such an extended error recovery procedure in the storage controller for dead disk reactivation.
In some cases, a faulty member disk 110 marked offline may be made online by manually removing the disk 110 and re-inserting the disk 110 into the array. In cases in which an operable member disk 110 is designated offline because of the removal of the disk 110, re-insertion of the disk 110 may make the disk 110 online again. FIG. 1d is a schematic block diagram illustrating one embodiment of a high-density RAID enclosure 150. As shown, the RAID enclosure 150 includes four canisters: canister 1 130a, canister 2 130b, canister 3 130c, and canister 4 130d. Each such canister 130 holds two member disks 111, 112, 113, or 114 having individual carriers and sharing common enclosure accessories such as cooling fan, temperature sensor, and lock mechanism (none shown). The top disk of each such canister 130, for example, member disk 2a 112a of canister 2 130b, is a member disk of RAID-1 120a array. The bottom disk of the same canister 130, for example, member disk 2b 112b of canister 2 130b, is a member disk of RAID-2 120b array. RAID-1 120a and RAID-2 120b are redundant RAID arrays 100b such as shown in FIG. 1b. The two RAID arrays 120a and 120b may be operated independently or combined by data striping. In either case, each RAID array 120 can tolerate one disk failure.
If, for example, member disk 2a 112a of RAID-1 120a becomes faulty, as depicted in FIG. 1d, canister 2 130b may be removed from the RAID enclosure 150, and an available hot swap disk (not shown) may replace the faulty member disk 112a. Afterwards, canister 2 130b is re-inserted. Subsequent to the service action, both RAID-1 120a and RAID-2 120b may start full rebuilding independently, with the originally operable member disk 2b 112b restoring the online state in the latter RAID array 120b. Unfortunately, a time-consuming full rebuilding is currently also required of the reactivated member disk 112b in the RAID-2 120b array.
From the foregoing discussion, it should be clear that a need exists for an apparatus, system, and method that track the stripes of the offline member disk 110 in a redundant RAID array 100b, 100c that were to be written on prior to making the disk 110 online by a reactivation and that execute a rebuilding only on those tracked stripes subsequent to the reactivation. Beneficially, such an apparatus, system, and method would shorten the duration of the array's degraded mode of operation and reduce the time required to complete rebuilding the reactivated member disk 110.
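The need stated above can be illustrated with a small sketch. It is not the apparatus, system, or method of the invention; it is only a hypothetical model, assuming XOR parity (or a single mirror copy) and one tracking flag per stripe group. The names `ChangedStripeTracker` and `rebuild_tracked` are invented for this example.

```python
def xor_blocks(blocks):
    """Byte-wise XOR of equally sized byte strings."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


class ChangedStripeTracker:
    """Remember which stripes of an offline member disk missed a write."""

    def __init__(self, stripe_groups):
        self.changed = [False] * stripe_groups

    def record_write(self, group):
        # Called when a write would have touched the offline disk's stripe.
        self.changed[group] = True

    def groups_to_rebuild(self):
        return [g for g, flag in enumerate(self.changed) if flag]


def rebuild_tracked(reactivated, survivors, tracker):
    """Rebuild only the tracked stripes on the reactivated disk.

    Disks are lists of stripes (bytes), one per stripe group.
    Returns the number of stripes that were rebuilt.
    """
    groups = tracker.groups_to_rebuild()
    for group in groups:
        reactivated[group] = xor_blocks([disk[group] for disk in survivors])
        tracker.changed[group] = False
    return len(groups)
```

In this model the rebuilding work scales with the number of stripes written during the offline period rather than with the size of the disk, which is the benefit the paragraph above calls for.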