1. Technical Field
The present invention relates generally to data processing systems including storage devices, and more particularly to a data processing system, method, and computer program product for converting a spare storage device to a defined storage device in a logical volume.
2. Description of the Related Art
Host computer systems often connect to one or more storage controllers that provide access to an array of storage devices. In a common storage controller, microprocessors communicate the data between the storage array and the host computer system. The host system addresses a “volume” of stored data through the storage controller using a logical identifier, such as a Logical Unit Number (LUN) used in SCSI (Small Computer System Interface) subsystems. The term “volume” is often used as a synonym for all or part of a particular storage disk, but it also describes a virtual disk that spans more than one disk. In the latter case, the virtual disk presents a single, contiguous logical volume to the host system, regardless of the physical location of the data in the array. For example, a single volume can represent logically contiguous data elements striped across multiple disks. A file structure can also be embedded on top of a volume to provide remote access thereto, such as the Network File System (NFS) designed by Sun Microsystems, Inc. and the Common Internet File System (CIFS) protocol built into Microsoft WINDOWS products and other popular operating systems.
There are many different types of storage controllers. Some storage controllers provide RAID (Redundant Array of Independent Disks) functionality for a combination of improved fault tolerance and performance. In RAID storage controllers on a SCSI bus, for example, the host system addresses a storage element by providing the single SCSI Target ID of the RAID storage controller and the LUN of the desired logical volume. A LUN is commonly a three-bit identifier used on a SCSI connection to distinguish between up to eight devices (logical units) having the same SCSI Target ID. Currently, SCSI also supports LUNs of up to 64 bits. The RAID storage controller corresponding to the provided SCSI Target ID translates the LUN into the physical address of the requested storage element within the attached storage array.
A volume ID is another form of logical identifier. Volume IDs are typically 64-bit or 128-bit globally unique persistent world wide names that correspond directly to LUNs or identifiers for other storage representations. By providing a mapping to LUNs, volume IDs can be remapped if there is a collision between LUNs in a storage system, so as to present a set of unique volume IDs to a host accessing the storage system.
The term “RAID” was introduced in a paper entitled “A Case for Redundant Arrays of Inexpensive Disks (RAID)”, Patterson et al., Proc. ACM SIGMOD, June 1988, in which five disk array architectures were described under the acronym “RAID”. A RAID 1 architecture provides “mirroring” functionality. In other words, the data for each volume of a primary storage unit is duplicated on a secondary (“mirrored”) storage unit, so as to provide access to the data on the secondary storage unit in case the primary storage unit becomes inoperable or is damaged.
A RAID 2 architecture provides error detection and correction (“EDC”) functionality. For example, in U.S. Pat. No. 4,722,085 to Flora et al., seven EDC bits are added to each 32-bit data word to provide error detection and error correction capabilities. Each bit in the resultant 39-bit word is written to an individual disk drive (requiring at least 39 separate disk drives to store a single 32-bit data word). If one of the individual drives fails, the remaining 38 valid bits can be used to construct each 32-bit data word, thereby achieving fault tolerance.
A RAID 3 architecture provides fault tolerance using parity-based error correction. A separate, redundant storage unit is used to store parity information generated from each data word stored across N data storage units. The N data storage units and the parity unit are referred to as an “N+1 redundancy group” or “drive group”. If one of the data storage units fails, the data on the redundant unit can be used in combination with the remaining data storage units to reconstruct the data on the failed data storage unit.
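By way of illustration only, the reconstruction described above can be sketched as follows. The sketch assumes byte-wise XOR parity, which is a common choice but is not specified in the text; the function names and sample byte values are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_parity(data_units):
    """Compute the content of the redundant (parity) unit from N data units."""
    return xor_blocks(data_units)


def reconstruct(surviving_units, parity):
    """Rebuild the one missing data unit from the survivors plus the parity unit."""
    return xor_blocks(list(surviving_units) + [parity])


# Three data units (N = 3) plus one parity unit form an N+1 redundancy group.
data_units = [b"\x0f\xa0", b"\x33\x5c", b"\xf0\x01"]
parity = make_parity(data_units)

# Suppose the second data unit fails; rebuild it from the other two and the parity.
rebuilt = reconstruct([data_units[0], data_units[2]], parity)
```

Because XOR is its own inverse, combining the parity with every surviving data unit cancels their contributions and leaves the content of the failed unit.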
A RAID 4 architecture provides parity-based error correction similar to a RAID 3 architecture but with improved performance resulting from “disk striping”. In disk striping, a redundancy group is divided into a plurality of equally sized address areas referred to as blocks. Blocks from each storage unit in a redundancy group having the same unit address ranges are referred to as “stripes”. Each stripe has N blocks of data on different storage devices plus one parity block on another, redundant storage device, which contains parity for the N data blocks of the stripe. A RAID 4 architecture, however, suffers from limited write (i.e., the operation of writing to disk) performance because the parity disk is burdened with all of the parity update activity.
A RAID 5 architecture provides the same parity-based error correction as RAID 4, but improves “write” performance by distributing the data and parity across all of the available disk drives. A first stripe is configured in the same manner as it would be in RAID 4. However, for a second stripe, the data blocks and the parity block are distributed differently than for the first stripe. For example, if N+1 equals 5 disks, the parity block for a first stripe may be on disk 5 whereas the parity block for a second stripe may be on disk 4. Likewise, for other stripes, the parity blocks are distributed over all disks in the array, rather than on a single dedicated disk. As such, no single storage unit is burdened with all of the parity update activity.
A RAID 6 architecture is similar to RAID 5, with increased fault tolerance provided by independently computed redundancy information in an N+2 redundancy group. A seventh RAID architecture, sometimes referred to as “RAID 0”, provides data striping without redundancy information. Of the various RAID levels specified, RAID levels 0, 1, 3, and 5 are the most commonly employed in commercial settings.
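By way of illustration only, one possible parity placement consistent with the five-disk example above can be sketched as follows. The simple rotation shown is an assumption; the text fixes only the positions for the first two stripes, and actual controllers may use other layouts. The function names are hypothetical.

```python
def parity_disk(stripe, num_disks):
    """Return the 1-based disk number holding parity for a 0-based stripe index.

    Assumed rotation: stripe 0 places parity on the last disk, and each
    following stripe moves parity one disk to the left, wrapping around.
    """
    return num_disks - (stripe % num_disks)


def stripe_layout(stripe, num_disks):
    """Return a list marking each disk of the stripe as data ("D") or parity ("P")."""
    p = parity_disk(stripe, num_disks)
    return ["P" if disk == p else "D" for disk in range(1, num_disks + 1)]


# With N+1 = 5 disks, print the layout of the first five stripes.
for stripe in range(5):
    print(stripe, stripe_layout(stripe, 5))
```

Under this assumed rotation, every disk holds the parity block for exactly one of any five consecutive stripes, so parity updates are spread evenly across the array.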
In the prior art, when a primary disk fails, the data that had been stored on the failed drive is incorporated on one of the drives that had been assigned the role of “spare” drive. The storage controller maintains a list of drives that are designated as spare drives. In order for a disk to be used as a “spare” drive in accordance with one of the RAID architectures, the drive must be designated as a “spare”. Unused, unavailable drives may not be used as spares.
Once the data is incorporated on the spare, the spare drive remains assigned the role of “spare” drive. The volume definition includes a reference to a failed drive. The volume configuration is non-optimal since it references a failed drive. Non-optimal processing may continue with the data stored on the spare drive. In order to return to optimal processing, a new primary drive must be used to replace the failed drive. Once the storage controller detects the removal of a failed drive and the insertion of a replacement drive, the storage controller updates the volume definition to reference the replacement drive, and starts copying data from the spare to the replacement drive. The spare drive then continues to be used as a spare drive. FIGS. 5A–5B and 6 depict this process in more detail.
FIGS. 5A and 5B illustrate block diagrams of a storage subsystem in accordance with the prior art. In the depicted example, storage subsystem 500 is a disk drive system including a controller 502. Controller 502 controls primary disk drives 504, 506, and 508. Disk 510 is a spare that is used in accordance with a RAID level 1, 2, 3, 4, 5, or 6. In the example depicted by FIG. 5A, disk 508 has failed. According to the prior art, when controller 502 detects that disk 508 has failed, controller 502 integrates spare 510 by constructing the data that had been stored on disk 508. The data stored on the remaining disks 504 and 506 is used to construct the data that had been stored on disk 508 in accordance with the RAID level implemented by the storage subsystem.
Once spare 510 is integrated, system 500 may continue to operate with disks 504, 506, and spare 510. However, the processing will be non-optimal because the logical volume definition includes a reference to a failed drive.
A logical volume definition typically includes a logical volume name or identifier, an identifier that identifies one or more physical drives that make up the logical volume identified by the logical volume name, and a logical unit identifier that is used by a host to communicate with the logical volume. For each logical volume, when the RAID standard is used, an indication of the RAID level for each logical volume is also included. Other information may also be included.
When a volume is first created, the user generally specifies a list of drives on which the volume is to be defined. Since a volume definition includes a list of drives, the act of assigning a drive to a volume adds a reference to that drive to the list of drives in the volume definition. Similarly, removing a drive, e.g., removing a failed drive from the volume definition, deletes the reference to that drive within the volume definition. When a drive is included in a volume definition, the drive is called an “assigned” drive.
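By way of illustration only, a logical volume definition of the kind described above can be sketched as a simple record. The class, field, and method names are hypothetical and do not reflect any particular controller; drive identifiers reuse the reference numerals of FIGS. 5A and 5B purely as labels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class VolumeDefinition:
    """Hypothetical record holding the items a volume definition typically includes."""
    name: str                 # logical volume name or identifier
    lun: int                  # logical unit identifier used by the host
    raid_level: int           # RAID level of the volume
    drives: List[str] = field(default_factory=list)  # references to assigned drives

    def assign_drive(self, drive_id):
        """Assigning a drive adds a reference to it to the list of drives."""
        if drive_id not in self.drives:
            self.drives.append(drive_id)

    def remove_drive(self, drive_id):
        """Removing a drive deletes its reference from the list of drives."""
        self.drives.remove(drive_id)

    def is_assigned(self, drive_id):
        """A drive is an "assigned" drive when the definition references it."""
        return drive_id in self.drives


# The user specifies the drives when the volume is first created.
volume = VolumeDefinition("vol0", lun=0, raid_level=5, drives=["504", "506", "508"])

# Drop the reference to failed drive 508 and reference replacement drive 512.
volume.remove_drive("508")
volume.assign_drive("512")
```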
In order to put the system back in an optimal state, a user must replace primary drive 508 with a new primary drive 512. When the storage controller detects the replacement of the failed drive with a new, replacement drive, the storage controller will copy the data from spare 510 to new primary drive 512. After the data is copied to new primary drive 512, spare 510 then continues to be used as a spare. During this process, spare 510 is assigned within the storage controller as a spare drive. When the controller detects the replacement of a failed drive with a replacement drive, the controller updates the volume definition to include a reference to the replacement drive and then copies data from the spare to the replacement drive. The spare is then returned to a stand-by mode. Even after the data is copied and processing continues, however, the processing is non-optimal because the volume definition still includes a reference to a failed drive.
FIG. 6 illustrates a high level flow chart which depicts copying data from a spare drive to a new drive that replaced a failed drive in accordance with the prior art. The process starts as depicted by block 600 and thereafter passes to block 602 which illustrates a determination of whether or not a drive in the array has failed. If a determination is made that none of the disks has failed, the process passes to block 604 which depicts continuing optimal storage subsystem processing.
Referring again to block 602, if a determination is made that one of the drives has failed, the process passes to block 606 which illustrates the storage controller alerting the user that a drive has failed. Next, block 608 depicts the storage controller integrating the spare drive. When a drive is integrated, the data that was stored on the failed drive is reconstructed using the remaining drives. The reconstructed data is then stored on the spare drive.
Thereafter, block 610 illustrates the storage controller continuing processing. This processing, however, is non-optimal because the logical volume definition has not been changed to remove the reference to the failed drive. Next, block 612 depicts a determination of whether or not the storage controller has detected the replacement of the failed drive with a new, replacement drive. If a determination is made that no replacement drive has been detected, the process passes back to block 610. Referring again to block 612, if a determination is made that a replacement drive has been detected, the process passes to block 614 which illustrates the storage controller copying the data from the spare to the replacement drive. Next, block 616 depicts the storage controller returning the spare drive to a stand-by mode. The process then passes to block 618 which illustrates a continuation of non-optimal processing. Next, block 620 depicts a determination of whether or not the volume definition has been updated to remove the reference to the failed drive. If a determination is made that the volume definition has not been updated, the process passes back to block 618 which illustrates the continuation of non-optimal processing. Referring again to block 620, if a determination is made that the volume definition has been updated, the process passes back to block 604 which depicts the continuation of optimal processing.
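By way of illustration only, the prior-art flow of FIG. 6 can be sketched as a table of block-to-block transitions. The block numbers come from the description above; the function and parameter names are hypothetical, and treating block 604 as the end of the walk is an assumption, since the text does not say where the process passes after block 604.

```python
# Blocks that always pass to the same next block.
UNCONDITIONAL = {600: 602, 606: 608, 608: 610, 610: 612,
                 614: 616, 616: 618, 618: 620}


def next_block(block, drive_failed=False, replacement_detected=False,
               definition_updated=False):
    """Return the block that follows `block`, or None after block 604."""
    if block in UNCONDITIONAL:
        return UNCONDITIONAL[block]
    if block == 602:   # has a drive in the array failed?
        return 606 if drive_failed else 604
    if block == 612:   # has a replacement drive been detected?
        return 614 if replacement_detected else 610
    if block == 620:   # has the volume definition been updated?
        return 604 if definition_updated else 618
    if block == 604:   # optimal processing; the sketch stops here (assumption)
        return None
    raise ValueError(f"unknown block {block}")


def walk(drive_failed, waits_for_replacement=0, waits_for_update=0):
    """List the blocks visited, answering "no" the given number of times at 612 and 620."""
    path, block = [], 600
    while block is not None:
        path.append(block)
        if block == 612:
            detected = waits_for_replacement == 0
            waits_for_replacement = max(0, waits_for_replacement - 1)
            block = next_block(block, replacement_detected=detected)
        elif block == 620:
            updated = waits_for_update == 0
            waits_for_update = max(0, waits_for_update - 1)
            block = next_block(block, definition_updated=updated)
        else:
            block = next_block(block, drive_failed=drive_failed)
    return path


print(walk(drive_failed=True, waits_for_replacement=1))
```

The walk shows that, after a failure, optimal processing at block 604 is reached again only by passing through both the replacement check at block 612 and the volume definition check at block 620.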
Therefore, a need exists for a system, method, and computer program product for utilizing a spare storage device as a defined, replacement storage device in a logical volume such that optimal processing may be continued once a storage device has failed.