The present invention relates generally to mass data storage systems. More specifically, the present invention includes a method for reconfiguration of disk arrays.
Disk arrays are data storage systems that combine two or more independent disks into a single logical storage device. Compared to traditional devices, disk arrays offer increased performance, capacity and, in some cases, fault tolerance. To understand modern disk arrays, it is helpful to understand the RAID concept. RAID is an acronym originally described in a paper written by Patterson, Gibson, and Katz, entitled “A Case for Redundant Arrays of Inexpensive Disks, or RAID,” published by the University of California, Berkeley in 1987. In this paper, the authors describe a taxonomy of disk array types, known as RAID levels one through five. Subsequently, the RAID taxonomy has been extended to include RAID level zero, as well as several other RAID levels.
For RAID level zero, data is split into blocks known as “strips” and spread in “stripes” over the system's disks. For example, in a four disk RAID level zero system having a one-kilobyte strip size, each four-kilobyte stripe of data would be written as four one-kilobyte strips, one on each of the four disks. Three kilobytes of data would be written as three one-kilobyte strips, divided between three of the four drives. RAID level zero offers increased performance and capacity, but fails to provide any degree of fault tolerance.
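The striping example above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: strips are assigned to disks in round-robin order, and the function names `split_into_strips` and `raid0_layout` are hypothetical, not part of any real controller.

```python
from typing import List


def split_into_strips(data: bytes, strip_size: int) -> List[bytes]:
    """Cut the data into consecutive strips of at most strip_size bytes."""
    return [data[i:i + strip_size] for i in range(0, len(data), strip_size)]


def raid0_layout(data: bytes, num_disks: int, strip_size: int) -> List[List[bytes]]:
    """Assign strips to disks in round-robin order (assumed placement rule)."""
    disks: List[List[bytes]] = [[] for _ in range(num_disks)]
    for index, strip in enumerate(split_into_strips(data, strip_size)):
        disks[index % num_disks].append(strip)
    return disks


# Four disks, one-kilobyte strips, as in the example in the text.
full_stripe = raid0_layout(bytes(4096), num_disks=4, strip_size=1024)
partial_stripe = raid0_layout(bytes(3072), num_disks=4, strip_size=1024)
```

With these inputs, `full_stripe` places one one-kilobyte strip on each of the four disks, while `partial_stripe` places strips on only three of them.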
In RAID level one systems, each disk is paired with a second disk. The operation of the disk pairs is coordinated so that each drive is a mirror image of its pair. If either disk fails, the remaining disk ensures that data is not lost. In this way, RAID level one offers a degree of fault tolerance not found in independent disks.
Like RAID level zero systems, RAID level three systems stripe data across two or more disks. RAID level three systems also include a separate parity disk. Each bit stored on the parity disk is the parity of the bits stored in the same location on the remaining data drives. Typically, the parity bits are formed by performing an exclusive-or operation. Thus, the nth bit stored on the parity disk is formed as the exclusive-or of the nth bits stored on the remaining disks. The use of a parity drive gives RAID level three systems a degree of fault tolerance. In these systems, if any single drive fails (including the parity drive), the remaining drives may be used to reconstruct the data stored on the failed drive.
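The exclusive-or relationship described above can be illustrated with a short Python sketch. The helper name `xor_blocks` and the sample byte values are made up for illustration; the sketch assumes all drives hold blocks of equal length.

```python
from functools import reduce
from typing import List


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Exclusive-or equal-length blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data drives holding two bytes each (arbitrary sample contents).
data_drives = [b"\x0f\xa0", b"\x33\x55", b"\xf0\x01"]

# The parity drive stores the exclusive-or of the data drives.
parity = xor_blocks(data_drives)

# Suppose drive 1 fails: the surviving data drives plus the parity drive
# are exclusive-or'ed together to recover its contents.
survivors = [data_drives[0], data_drives[2], parity]
rebuilt = xor_blocks(survivors)
```

Because exclusive-or is its own inverse, the same function serves both to compute the parity and to rebuild any single failed drive, including the parity drive itself.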
RAID level five systems stripe data using the strip-by-strip basis of RAID level zero. RAID level five systems also store parity bits to increase fault tolerance. Unlike RAID level three systems, however, RAID level five systems do not include a separate parity disk. Instead, for RAID level five, parity bits are stored in a rotating fashion across all of the disks included in the system. For example, in a three disk RAID level five system, the parity bits for the first stripe of data might be stored on the first drive. The parity bits for the next stripe of data would then be stored on the second drive. The parity bits for the third stripe of data would then be stored on the third drive, and so on. RAID level five systems offer the same degree of fault tolerance provided by RAID level three systems. In use, however, RAID level five systems often provide higher performance.
RAID level ten systems combine techniques borrowed from RAID level zero and RAID level one systems. These systems stripe data across a series of disks in the same way as RAID level zero systems. These disks are known as the primary disks. In addition, RAID level ten systems include a mirroring disk for each primary disk. RAID level ten systems provide the high-speed access of RAID level zero systems and the fault tolerance of RAID level one systems.
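The rotating placement in the three disk example can be sketched as follows. The rotation order shown here (first drive, second drive, third drive, then back to the first) follows the example in the text; actual controllers may rotate in a different order, and the function names are hypothetical.

```python
from typing import List


def parity_disk(stripe_index: int, num_disks: int) -> int:
    """Disk (0-based) assumed to hold the parity for the given stripe."""
    return stripe_index % num_disks


def data_disks(stripe_index: int, num_disks: int) -> List[int]:
    """Disks that hold data strips for the given stripe."""
    parity = parity_disk(stripe_index, num_disks)
    return [disk for disk in range(num_disks) if disk != parity]


# Parity location for the first four stripes of a three disk system.
rotation = [parity_disk(stripe, 3) for stripe in range(4)]
```

Here `rotation` cycles through disks 0, 1 and 2 and then wraps around to disk 0, so no single disk handles every parity update.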
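As a rough illustration of how striping and mirroring combine, the sketch below maps a strip to the pair of disks that would store it. The disk numbering (primary disks first, followed by their mirrors) and the function name `raid10_targets` are assumptions made for this example only.

```python
from typing import Tuple


def raid10_targets(strip_index: int, num_primary: int) -> Tuple[int, int]:
    """Return the (primary disk, mirror disk) pair for a strip.

    Assumed numbering: primary disks are 0..num_primary-1 and the mirror
    of primary disk p is disk num_primary + p.
    """
    primary = strip_index % num_primary   # striping, as in RAID level zero
    mirror = num_primary + primary        # mirroring, as in RAID level one
    return primary, mirror


# Two primary disks and two mirror disks.
targets = [raid10_targets(strip, 2) for strip in range(4)]
```

Each strip is written twice, once to its primary disk and once to the paired mirror, so the loss of either disk in a pair does not lose data.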
Reconfiguration is a common occurrence in the operation of RAID systems. For example, RAID systems may be reconfigured to increase or decrease storage capacity by adding or removing disks. Alternatively, RAID systems can be reconfigured to change fault tolerance, performance and storage capacity by changing RAID levels. RAID systems can also be reconfigured to tune performance by changing other parameters, such as the strip size used to perform striping.
Reconfiguring a RAID system generally requires that data be moved from the old system to the new system. In many cases, this data movement is performed by making a backup copy of the data using a tape drive or other backup device. The RAID system is then reconfigured by adding or subtracting drives or by changing other parameters. The data is then restored using the tape drive or other backup device.
The backup and restore method for moving data has several disadvantages. Chief among these disadvantages is the fact that the involved RAID system is unavailable for use until the operation has completed. In many cases, it may be impractical or impossible to idle the involved RAID system for the required time period. The backup and restore operation also requires that a human operator perform a number of manual steps. Each of these steps is error-prone and increases the probability that important data may be destroyed.
Data migration is an alternative to the backup and restore method of data movement. Fundamentally, data migration involves copying each data bit from its pre-reconfiguration location to its post-reconfiguration location. During data migration, access to the RAID system may be maintained. In this way, a serious limitation of the backup and restore method of data movement is avoided. Unfortunately, data migration requires a great number of individual disk operations and is, therefore, often quite time-consuming. During this time, access to the RAID system may be slowed dramatically. Thus, even data migration may be inconvenient or impractical. As a result, there is a need for systems that expedite the reconfiguration of RAID systems.
An embodiment of the present invention includes a method for reconfiguration of disk arrays. An exemplary environment for the present invention includes a reconfigurable RAID system. The RAID system includes a disk array composed of one or more data disks, zero or more mirror disks, and zero or more parity disks. The disk array operates under the control of an array controller. The array controller functions as an interface between the RAID system and any host computer to which the RAID system is connected. The array controller includes a processor and a memory system. The memory system includes an area of non-volatile RAM (NVRAM) that remains valid during intentional and unintentional system shutdowns.
Data storage in the RAID system is characterized by a set of reconfigurable parameters including the number of data and parity disks included in the disk array, RAID level, strip size and stripe size. The method of the present invention allows the RAID system to move from a source configuration to a destination configuration. The move from the source configuration to the destination configuration may include changes in RAID level, number of data and parity disks included in the disk array or changes in parameters such as strip size and stripe size. Reconfiguration can also be performed to move data from one or more disks to one or more replacement disks. Preferably, the reconfiguration method is implemented as a program stored in the memory system of the array controller and executed by the array controller processor.
For the method of the present invention, the array controller examines the original configuration of the disk array (the source configuration) and the desired configuration (the destination configuration). Based on this examination, the array controller determines if the reconfiguration process may be optimized. To optimize the reconfiguration process, the array controller determines if a combination of changes to the configuration parameters stored in a reserved area and possible rebuilding operations can replace the migration process. For example, reconfiguration of a RAID zero, three disk system to a RAID three, four disk system involves adding a single disk to act as a parity drive. This can be accomplished by updating the parameters stored in the reserved area to reflect the new configuration and performing a rebuild operation to initialize the added disk as a parity drive. If such a replacement is possible, the reconfiguration process is modified to eliminate data migration. In this way, the present invention provides a method that dramatically increases the speed of reconfiguration for some source and destination configurations.
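The decision described above might be sketched as follows. This is a minimal illustration, not the claimed method: the `ArrayConfig` fields, the `plan_reconfiguration` function and the step names are hypothetical, and the only shortcut recognized is the one example given in the text (adding a parity disk to a RAID zero array to form a RAID three array with the same data disks and strip size). Any other change falls back to data migration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ArrayConfig:
    """Hypothetical subset of the reconfigurable parameters."""
    raid_level: int
    data_disks: int
    parity_disks: int
    strip_size: int


def plan_reconfiguration(source: ArrayConfig, destination: ArrayConfig) -> List[str]:
    """Return the steps needed to move from source to destination."""
    # Assumption: data already on the data disks can stay in place only if
    # the number of data disks and the strip size are unchanged.
    same_layout = (source.data_disks == destination.data_disks
                   and source.strip_size == destination.strip_size)
    adds_parity_disk = (source.raid_level == 0 and source.parity_disks == 0
                        and destination.raid_level == 3
                        and destination.parity_disks == 1)
    if same_layout and adds_parity_disk:
        return ["update stored parameters", "rebuild added parity disk"]
    return ["migrate data"]


raid0 = ArrayConfig(raid_level=0, data_disks=3, parity_disks=0, strip_size=1024)
raid3 = ArrayConfig(raid_level=3, data_disks=3, parity_disks=1, strip_size=1024)
fast_plan = plan_reconfiguration(raid0, raid3)
```

For the RAID zero to RAID three example, `fast_plan` contains only a parameter update and a rebuild of the added disk, so no existing data needs to be copied.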
Advantages of the invention will be set forth, in part, in the description that follows and, in part, will be understood by those skilled in the art from the description herein. The advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the appended claims and equivalents.