A storage system typically comprises one or more storage devices into which data may be entered, and from which data may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage environment, a storage area network and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with a hard disk drive (HDD), a direct access storage device (DASD) or a logical unit number (lun) in a storage device.
Storage of information on the disk array is preferably implemented as one or more storage “volumes”, defining an overall logical arrangement of disk space. The disks within a volume are typically organized as one or more groups, wherein each group is operated as a Redundant Array of Independent (or Inexpensive) Disks (RAID). Most RAID implementations enhance the reliability/integrity of data storage through the redundant writing of data “stripes” across a given number of physical disks in the RAID group, and the appropriate storing of redundant information with respect to the striped data. The redundant information may thereafter be retrieved to enable recovery of data lost when a storage device fails.
In the operation of a disk array, it is anticipated that a disk can fail. A goal of a high performance storage system is to make the mean time to data loss as long as possible, preferably much longer than the expected service life of the system. Data can be lost when one or more disks fail, making it impossible to recover data from the device. Typical schemes to avoid loss of data include mirroring, backup and parity protection. Mirroring stores the same data on two or more disks so that if one disk fails, the “mirror” disk(s) can be used to serve (e.g., read) data. Backup periodically copies data on one disk to another disk. Parity schemes are common because they provide a redundant encoding of the data that allows for loss of one or more disks without the loss of data, while requiring a minimal number of disk drives in the storage system.
Parity protection is often used in computer systems to protect against loss of data on a storage device, such as a disk. A parity value may be computed by summing (usually modulo 2) data of a particular word size (usually one bit) across a number of similar disks holding different data and then storing the results on the disk(s). That is, parity may be computed on 1-bit wide vectors, composed of bits in predetermined positions on each of the disks. Addition and subtraction on 1-bit vectors are equivalent to exclusive-OR (XOR) logical operations; these addition and subtraction operations can thus be replaced by XOR operations. The data is then protected against the loss of any one of the disks, or of any portion of the data on any one of the disks. If the disk storing the parity is lost, the parity can be regenerated from the data. If one of the data disks is lost, the data can be regenerated by adding the contents of the surviving data disks together and then subtracting the result from the stored parity.
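The parity computation and single-disk recovery described above may be illustrated by the following minimal sketch. The helper name xor_blocks and the three-byte sample blocks are assumptions chosen for illustration only; parity is computed byte-wise here, which is equivalent to applying the 1-bit XOR at every bit position.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks (hypothetical helper)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Illustrative contents of three data disks (made-up sample values).
data_disks = [b"\x0f\xf0\xaa", b"\x33\x55\x00", b"\xff\x01\x10"]

# The parity block is the XOR of all data blocks.
parity = xor_blocks(data_disks)

# Simulate the loss of data disk 1, then regenerate it by XOR-ing the
# surviving data disks with the stored parity.
survivors = [block for i, block in enumerate(data_disks) if i != 1]
rebuilt = xor_blocks(survivors + [parity])
```

Because XOR is its own inverse, the same helper serves both to compute the parity and to regenerate a lost block.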
Typically, the disks are divided into parity groups, a common arrangement of which comprises one or more data disks and a parity disk. The disk space is divided into stripes, with each stripe containing one block from each disk. The blocks of a stripe are usually at equivalent locations on each disk in the parity group. Within a stripe, all but one block contain data (“data blocks”) with the one block containing parity (“parity block”) computed by the XOR of all the data. If the parity blocks are all stored on one disk, thereby providing a single disk that contains all (and only) parity information, a RAID-4 level implementation is provided. If the parity blocks are contained within different disks in each stripe, usually in a rotating pattern, then the implementation is RAID-5. The term “RAID” and its various implementations are well-known and disclosed in A Case for Redundant Arrays of Inexpensive Disks (RAID), by D. A. Patterson, G. A. Gibson and R. H. Katz, Proceedings of the International Conference on Management of Data (SIGMOD), June 1988.
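The difference between the two parity placements may be sketched as follows. The function name parity_disk, the choice of the last disk as the dedicated RAID-4 parity disk, and the particular rotation used for RAID-5 are assumptions made for illustration; actual implementations may rotate parity differently.

```python
def parity_disk(stripe, num_disks, level):
    """Return the index of the disk holding parity for a given stripe.

    Hypothetical helper: the layout below is one possible arrangement,
    not the layout of any particular product.
    """
    if level == 4:
        # RAID-4: all parity blocks reside on a single dedicated disk
        # (assumed here to be the last disk in the group).
        return num_disks - 1
    if level == 5:
        # RAID-5: the parity block moves to a different disk in each
        # stripe, here rotating from the last disk toward the first.
        return (num_disks - 1 - stripe) % num_disks
    raise ValueError("only RAID levels 4 and 5 are sketched here")


raid4_layout = [parity_disk(s, 4, 4) for s in range(5)]
raid5_layout = [parity_disk(s, 4, 5) for s in range(5)]
```

For a four-disk group, the RAID-4 layout keeps parity on one disk for every stripe, whereas the RAID-5 layout returns to the starting disk after four stripes.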
Often other types of parity groupings are supported by a storage system. For example, a RAID-0 level implementation has a minimum of one data disk per parity group. However, a RAID-0 group provides no parity protection against disk failures, so loss of a single disk translates into loss of data in that group. A row-diagonal parity implementation has two parity disks per group for a minimum of three disks per group, i.e., one data and two parity disks. An example of a row-diagonal (RD) parity implementation is described in U.S. Pat. No. 6,993,701, issued on Jan. 31, 2006, titled Row-Diagonal Parity Technique for Enabling Efficient Recovery from Double Failures in a Storage Array and filed Dec. 28, 2001. An RD parity group can survive the loss of up to two disks in the RAID group.
The storage operating system of the storage system typically includes a RAID subsystem that manages the storage and retrieval of information to and from the disks in accordance with input/output (I/O) operations. In addition, the storage operating system includes administrative interfaces, such as a user interface, that enable operators (system administrators) to access the system in order to implement, e.g., configuration management decisions. Configuration management in the RAID subsystem generally involves a defined set of modifications to the topology or attributes associated with a storage array, such as a disk, a RAID group, a volume or set of volumes. Examples of these modifications include, but are not limited to, disk failure handling, volume splitting, volume online/offline, changes to (default) RAID group size or checksum mechanism and disk addition.
Typically, the configuration decisions are rendered through a user interface oriented towards operators that are knowledgeable about the underlying physical aspects of the system. That is, the interface is often adapted towards physical disk structures and management that the operators may manipulate in order to present a view of the storage system on behalf of a client. For example, in the case of adding disks to a volume, an operator may be prompted to specify (i) exactly which disks are to be added to a specified volume, or (ii) a count of the number of disks to add, leaving the responsibility for selecting disks up to the storage operating system.
A prior approach to selection of disks involves interrogation of all disks coupled to the storage system using the storage operating system. Broadly stated, the operating system issues a broadcast message to which each disk responds with its name, its location and its attributes, such as the size of the disk and supported checksum style and sector size. An ordered list of disks is then created based on the sequence in which the disks respond. Disks are thereafter allocated for disk selection in the order defined by the list, e.g., from top to bottom of a disk shelf. Moreover, selection of a disk is based only on size, checksum style and format block size considerations, without regard to physical locality of the disk for, e.g., fault isolation.
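The prior approach may be approximated by the following sketch, in which disks are taken strictly in response order and filtered only on size, checksum style and sector size. The Disk record, its field names, the checksum label "block" and the sample values are all hypothetical and serve only to show the consequence of ignoring physical locality.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    """Hypothetical record of what a disk reports about itself."""
    name: str
    shelf: int          # physical location; ignored by the selection below
    size_gb: int
    checksum: str
    sector_size: int


def select_disks(responses, count, min_size_gb, checksum, sector_size):
    """Pick the first `count` matching disks in the order they responded."""
    chosen = []
    for disk in responses:
        if (disk.size_gb >= min_size_gb
                and disk.checksum == checksum
                and disk.sector_size == sector_size):
            chosen.append(disk)
            if len(chosen) == count:
                break
    return chosen


# Disks listed in the order in which they answered the broadcast.
responses = [
    Disk("d0", 0, 72, "block", 520),
    Disk("d1", 0, 72, "block", 520),
    Disk("d2", 0, 36, "block", 520),
    Disk("d3", 1, 72, "block", 520),
]
picked = select_disks(responses, 2, 72, "block", 520)
```

In this example both selected disks reside on the same shelf even though a suitable disk exists on another shelf, so a single shelf failure would affect every disk chosen.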
However, it may be desirable for the storage operating system to factor other issues into the selection of disks, based upon the disk attributes of sector size, selected checksum algorithm and disk size. For example, a mirrored volume requires the balanced addition of disks to each of the N-plexes of the mirror. The same number of disks, with the same sector size, selected checksum algorithm and disk size, must be added to each mirror plex simultaneously.
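The balanced-addition constraint for a mirrored volume may be sketched as follows. The function name plan_mirror_addition, the tuple layout (name, sector size, checksum, size) and the sample disks are assumptions for illustration; the sketch simply refuses to add any disks unless every plex can receive the same number of disks with identical attributes.

```python
from collections import defaultdict


def plan_mirror_addition(disks, num_plexes, per_plex):
    """Return one list of disk names per plex, or None if no balanced plan exists.

    Hypothetical helper: `disks` is an iterable of
    (name, sector_size, checksum, size_gb) tuples.
    """
    # Group candidate disks by the attributes that must match across plexes.
    groups = defaultdict(list)
    for name, sector_size, checksum, size_gb in disks:
        groups[(sector_size, checksum, size_gb)].append(name)

    needed = num_plexes * per_plex
    for names in groups.values():
        if len(names) >= needed:
            picked = names[:needed]
            # Give each plex the same number of identical disks.
            return [picked[i * per_plex:(i + 1) * per_plex]
                    for i in range(num_plexes)]
    return None  # adding disks now would leave the plexes unbalanced


disks = [
    ("d0", 512, "block", 72),
    ("d1", 520, "block", 72),
    ("d2", 520, "block", 72),
    ("d3", 520, "block", 72),
    ("d4", 520, "block", 72),
    ("d5", 520, "block", 36),
]
plan = plan_mirror_addition(disks, num_plexes=2, per_plex=2)
too_many = plan_mirror_addition(disks, num_plexes=2, per_plex=3)
```

Returning no plan at all, rather than a partial one, reflects the requirement that the disks be added to each plex simultaneously.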