A storage system typically comprises one or more storage devices into which data may be entered, and from which data may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD).
Storage of information on the disk array is preferably implemented as one or more storage “volumes”, defining an overall logical arrangement of disk space. The disks within a volume are typically organized as one or more groups, wherein each group is operated as a Redundant Array of Independent (or Inexpensive) Disks (RAID). Most RAID implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate storing of redundant information with respect to the striped data. The redundant information may thereafter be retrieved to enable recovery of data lost when a storage device fails.
In the operation of a disk array, it is anticipated that a disk can fail. A goal of a high performance system is to make the mean time to data loss as long as possible, preferably much longer than the expected service life of the system. Data can be lost when one or more disks fail, making it impossible to recover data from the device. Typical schemes to avoid loss of data include mirroring, backup and parity protection. Mirroring stores the same data on two or more disks so that if one disk fails, the “mirror” disk(s) can be used to serve (e.g., read) data. Backup periodically copies data on one disk to another disk. Parity schemes are common because they provide a redundant encoding of the data that allows for loss of one or more disks without the loss of data, while requiring a minimal number of disk drives in the storage system.
Parity protection is used in a computer system to protect against loss of data on a storage device, such as a disk. A parity value may be computed by summing (usually modulo 2) data of a particular word size (usually 1 bit) across a number of similar disks holding different data and then storing the results on the disk(s). That is, parity may be computed on 1-bit wide vectors, composed of bits in predetermined positions on each of the disks. Addition and subtraction on 1-bit vectors are equivalent to exclusive-OR (XOR) logical operations; these addition and subtraction operations can thus be replaced by XOR operations. The data is then protected against the loss of any one of the disks, or of any portion of the data on any one of the disks. If the disk storing the parity is lost, the parity can be regenerated from the data. If one of the data disks is lost, the data can be regenerated by adding the contents of the surviving data disks together and then subtracting the results from the stored parity.
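The parity computation and single-disk recovery described above can be illustrated with a minimal Python sketch. The function names (`xor_blocks`, `compute_parity`, `reconstruct`) and the two-byte sample blocks are hypothetical and chosen only for illustration; the sketch assumes every block has the same length and that at most one block of the stripe is lost.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields, for each byte position, the bytes found at that
    # position in every block; XOR-reducing them is addition modulo 2.
    return bytes(reduce(lambda x, y: x ^ y, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block: XOR of all data blocks (hypothetical helper)."""
    return xor_blocks(data_blocks)


def reconstruct(surviving_blocks, parity):
    """Rebuild the one missing data block from the survivors and the parity.

    With XOR, "adding the survivors and subtracting from the parity"
    collapses into a single XOR over all remaining blocks.
    """
    return xor_blocks(list(surviving_blocks) + [parity])


# Three data "disks", each contributing one two-byte block (illustrative data).
data = [b"\x0f\xf0", b"\x33\x55", b"\xaa\x01"]
parity = compute_parity(data)

# Simulate losing the second data disk and rebuilding its block.
recovered = reconstruct([data[0], data[2]], parity)
```

Because XOR is its own inverse, the same routine regenerates a lost parity block from the data blocks and a lost data block from the survivors plus the parity.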
Typically, the disks are divided into parity groups, each of which comprises one or more data disks and a parity disk. The disk space is divided into stripes, with each stripe containing one block from each disk. The blocks of a stripe are usually at equivalent locations on each disk in the parity group. Within a stripe, all but one block contain data (“data blocks”) with the one block containing parity (“parity block”) computed by the XOR of all the data. If the parity blocks are all stored on one disk, thereby providing a single disk that contains all (and only) parity information, a RAID-4 implementation is provided. If the parity blocks are contained within different disks in each stripe, usually in a rotating pattern, then the implementation is RAID-5. The term “RAID” and its various implementations are well-known and disclosed in A Case for Redundant Arrays of Inexpensive Disks (RAID), by D. A. Patterson, G. A. Gibson and R. H. Katz, Proceedings of the International Conference on Management of Data (SIGMOD), June 1988.
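The difference between the two parity placements can be sketched as follows. This is an illustrative model only: the function names are hypothetical, the dedicated RAID-4 parity disk is assumed to be the last disk of the group, and the RAID-5 rotation shown is one of several possible rotating patterns.

```python
def parity_disk(stripe, num_disks, level):
    """Return the index of the disk holding the parity block of a stripe."""
    if level == 4:
        # RAID-4: one dedicated parity disk (assumed here to be the last one).
        return num_disks - 1
    if level == 5:
        # RAID-5: parity rotates across the disks from stripe to stripe
        # (assumed rotation: start at the last disk and move left).
        return (num_disks - 1 - stripe) % num_disks
    raise ValueError("this sketch only models RAID levels 4 and 5")


def layout(num_stripes, num_disks, level):
    """Build a stripe-by-disk map: 'D' for a data block, 'P' for parity."""
    rows = []
    for stripe in range(num_stripes):
        p = parity_disk(stripe, num_disks, level)
        rows.append(["P" if disk == p else "D" for disk in range(num_disks)])
    return rows


raid4 = layout(3, 3, level=4)
raid5 = layout(3, 3, level=5)
```

In this model `raid4` places every parity block in the last column, whereas `raid5` moves the parity block one disk to the left on each successive stripe.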
The storage operating system of the storage system may implement a file system to logically organize the information stored on the disks of a volume as a hierarchical structure of directories, files and blocks. The storage operating system may also include a RAID subsystem that manages the storage and retrieval of the information to and from the disks in accordance with input/output (I/O) operations. There is typically a one-to-one mapping between the information stored on the disks in, e.g., a disk block number (DBN) space, and the information organized by the file system in, e.g., volume block number (VBN) space. The file system consists of a contiguous range of VBNs from zero to N-1, for a file system of size N blocks. The storage operating system may further include administrative interfaces, such as a user interface, that enable operators (system administrators) to access the system in order to implement, e.g., configuration management decisions.
Configuration management in the RAID subsystem generally involves a defined set of modifications to the topology or attributes associated with a storage array, such as a disk, a RAID group (parity group), a volume or set of volumes. Examples of these modifications include, but are not limited to, disk addition, disk failure handling, volume splitting, volume online/offline and changes to a RAID group size. The RAID group size is the maximum number of disks that may be contained within a RAID group. For example, if a RAID group size is “3”, then the number of disks in the group can be less than or equal to 3, but not more than 3. The RAID group size is typically a property of a volume, such that all RAID groups of the volume typically have the same RAID group size. When the RAID group reaches the maximum number, a new RAID group is created upon the addition of new disks.
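The effect of the RAID group size on disk addition can be modeled with a short sketch. The function `add_disks` and the disk labels are hypothetical; the sketch assumes new disks always go to the last RAID group of the volume and that a new group is opened once that group is full.

```python
def add_disks(raid_groups, new_disks, group_size):
    """Return the volume's RAID groups after adding new disks.

    Disks fill the last group until it holds `group_size` disks, after
    which a new group is started. The input list is not modified.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    groups = [list(g) for g in raid_groups]
    for disk in new_disks:
        # Open a new RAID group when there is none yet or the last is full.
        if not groups or len(groups[-1]) >= group_size:
            groups.append([])
        groups[-1].append(disk)
    return groups


# A volume with one partially filled group and a RAID group size of 3.
volume = [["d0", "d1"]]
grown = add_disks(volume, ["d2", "d3"], group_size=3)
```

Here the first new disk completes the existing group and the second starts a new one, matching the "3" example in the text.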
Volume capacity is typically linked to the linear growth of the RAID groups that, in turn, are organized linearly within the VBN space of a volume. Linear growth and organization of volume capacity is generally due to prior RAID subsystem support for only one contiguous VBN-to-DBN mapping range across all disks of a RAID group. Thus if the maximum RAID group size is increased, it is generally not possible to insert disks into the middle of the linear list of RAID groups. Moreover, a prior approach supports only one contiguous DBN-to-VBN mapping range on each disk of the RAID group, typically because the entire DBN space on the disk is mapped into the VBN space of the volume. If a disk in an existing RAID group (other than the “last” RAID group) is exchanged for a larger disk, it is generally not possible to make use of the additional space on the larger disk. For example, if a disk of a RAID group failed and was replaced with a larger disk, the additional space on the larger disk could not be used. The larger replacement disk could only use the VBN range that was previously allocated to the failed disk.
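The replacement-disk limitation can be made concrete with a deliberately simplified model. It is an assumption of this sketch, not a description of any actual RAID subsystem, that each data disk is given a single contiguous VBN range laid end to end in the volume and that parity disks are ignored; the function names are hypothetical.

```python
def assign_vbn_ranges(disk_sizes):
    """Give each disk one contiguous, half-open VBN range [start, end)."""
    ranges = []
    start = 0
    for size in disk_sizes:
        ranges.append((start, start + size))
        start += size
    return ranges


def unusable_blocks_after_replacement(ranges, index, new_size):
    """Blocks on a larger replacement disk that cannot be mapped.

    The replacement inherits the failed disk's single VBN range, so any
    capacity beyond the length of that range has no VBNs to map to.
    """
    start, end = ranges[index]
    mapped = end - start
    if new_size < mapped:
        raise ValueError("replacement disk is smaller than the mapped range")
    return new_size - mapped


# Three 100-block disks laid out linearly in the volume's VBN space.
ranges = assign_vbn_ranges([100, 100, 100])

# The middle disk fails and is replaced by a 250-block disk.
wasted = unusable_blocks_after_replacement(ranges, index=1, new_size=250)
```

Under these assumptions the range following the failed disk's range already belongs to the next disk, so the extra 150 blocks of the replacement have nowhere to go in the VBN space.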
These restrictions also make it difficult to dynamically expand the storage space of an existing file system when upgrading from smaller disks of a volume to larger disks and then utilizing the additional capacity for the volume. An example of a prior approach used to upgrade smaller disks of a storage system to larger disks involves creating a new volume with the larger disks and copying the data from the smaller disks to the larger disks in accordance with, e.g., a volume copy operation. Thereafter, the data stored on the smaller disks are deleted and those disks are removed from the storage system. This approach is a time-consuming procedure that involves a period of time during which the data is not accessible by a client.
Often it is desirable to migrate smaller disks of a volume to larger disks in accordance with a synchronous capacity upgrade for a storage system that supports synchronous RAID mirroring. However, a prior approach used to perform such a synchronous capacity upgrade results in system downtime or client reconfiguration. For example, assume that it is desired to “mirror” an existing volume with disks of capacity size X to a new volume with disks of capacity 2X. A prior approach involves creation of an entirely new volume and copying of the data from the old volume to the new volume in accordance with the volume copy operation.
Since a new volume is created, that volume has a name that is different from the original volume. Moreover, the original volume is brought offline for a discrete period of time in order to ensure consistency of information that is written to the new volume during the copy operation. As a result, the file system is aware of the configuration change, as not only is the file service temporarily disrupted but a renaming operation occurs that renames the newly created volume to that of the original volume. In addition, the file system identifier must be changed and other “housekeeping” duties must be performed to ensure that the operators/clients are not aware that their data has moved.