Redundant Arrays of Inexpensive Disks (RAID) have become effective tools for maintaining data within current computer system architectures. A RAID system utilizes an array of small, inexpensive hard disks capable of replicating or sharing data among the various drives. A detailed description of the different RAID levels is disclosed by Patterson, et al. in “A Case for Redundant Arrays of Inexpensive Disks (RAID),” ACM SIGMOD Conference, June 1988. This article is incorporated by reference herein.
Several different levels of RAID implementation exist. The simplest array, RAID level 1, comprises one or more disks for data storage and an equal number of additional “mirror” disks for storing a copy of all the information contained on the data disks. The remaining RAID levels, 2, 3, 4, 5 and 6, all divide contiguous data into pieces for storage across the various disks.
RAID level 2, 3, 4, 5 or 6 systems distribute this data across the various disks in blocks. A block is composed of many consecutive sectors, where a sector is a physical section of a disk drive containing a collection of bytes. A sector is the disk drive's minimal unit of data transfer. When a data block is written to a disk, it is assigned a Disk Block Number (DBN). All RAID disks maintain the same DBN system, so one block on each disk will have a given DBN. The collection of blocks on the various disks having the same DBN is known as a stripe.
Additionally, many of today's operating systems manage the allocation of space on mass storage devices by partitioning this space into volumes. The term volume refers to a logical grouping of physical storage space elements which are spread across multiple disks and associated disk drives, as in a RAID system. Volumes are part of an abstraction which permits a logical view of storage as opposed to a physical view of storage. As such, most operating systems see volumes as if they were independent disk drives. Volumes are created and maintained by Volume Management Software. A volume group is a collection of distinct volumes that share a common set of drives.
One of the major advantages of a RAID system is its ability to reconstruct data from a failed component disk using information contained on the remaining operational disks. In RAID levels 3, 4, 5 and 6, redundancy is achieved by the use of parity blocks. The data contained in a parity block of a given stripe is the result of a calculation carried out each time a write occurs to a data block in that stripe. The following equation is commonly used to calculate the next state of a given parity block:
new parity block = (old data block xor new data block) xor old parity block
The storage location of this parity block varies between RAID levels. RAID levels 3 and 4 utilize a specific disk dedicated solely to the storage of parity blocks. RAID levels 5 and 6 interleave the parity blocks across all of the various disks. RAID 6 distinguishes itself as having two parity blocks per stripe, thus tolerating the simultaneous disconnection of two disks. If a given disk in the array is disconnected, the data blocks and the associated parity block for a given stripe from the remaining disks can be combined to reconstruct the missing data.
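The parity update and reconstruction described above can be illustrated with a minimal sketch. It assumes a single-parity stripe (as in RAID levels 3, 4 and 5) in which the parity block is the byte-wise XOR of the data blocks; the function names and the sample block contents are hypothetical and chosen only for illustration. The second parity block of RAID 6 is computed differently and is not covered here.

```python
from functools import reduce


def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must be the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def update_parity(old_data: bytes, new_data: bytes, old_parity: bytes) -> bytes:
    """new parity = (old data xor new data) xor old parity."""
    return xor_blocks(xor_blocks(old_data, new_data), old_parity)


def reconstruct_missing(surviving_blocks: list[bytes]) -> bytes:
    """XOR the surviving data blocks and the parity block of one stripe
    to recover the block that was on the disconnected disk."""
    return reduce(xor_blocks, surviving_blocks)


# Hypothetical stripe: three data blocks plus one parity block.
d0, d1, d2 = b"\x01\x02", b"\x10\x20", b"\xff\x00"
parity = reduce(xor_blocks, [d0, d1, d2])

# A write to d1 updates the parity without reading d0 or d2.
new_d1 = b"\xaa\x55"
parity = update_parity(d1, new_d1, parity)
d1 = new_d1

# If the disk holding d1 is disconnected, its block is recoverable.
recovered = reconstruct_missing([d0, d2, parity])
```

In this sketch the parity update touches only the block being written and the parity block, which is why the equation uses the old data block rather than recomputing parity from the whole stripe.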
One mechanism for dealing with the disconnection of a single disk in a RAID system is the integration of a global hot spare disk. A global hot spare disk is a disk or group of disks used to replace a disconnected primary disk in a RAID configuration. The equipment is powered on, or considered “hot,” but is not actively functioning in the system. When a disk in a RAID system is disconnected, the global hot spare disk takes the place of the disconnected disk and reconstructs all the volume pieces of the missing disk using the data blocks and parity blocks from the remaining operational disks. Once this data is reconstructed, the global hot spare disk functions as a component disk of the RAID system until reconnection of the disconnected RAID disk. When the disconnected primary disk is reconnected, a copyback of the reconstructed data from the global hot spare to the reconnected primary disk may occur.
Currently, when a component disk is disconnected in a non-RAID 0 system, the global hot spare disk takes the place of the disconnected disk and reconstructs all volume pieces from the disconnected disk. This approach needlessly reconstructs and copies back volume pieces belonging to volumes which were not accessed or modified (i.e., those which did not receive an I/O request) in the time between the disconnection of the RAID component disk and its reconnection.
Therefore, it would be desirable to provide a system and a method for reconstruction and copyback of only those volume pieces on a disconnected disk which were part of the volumes receiving an I/O request in the time between the disconnection and reconnection of a RAID disk.
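The difference between the current approach and the desired one can be sketched as a selection over volume pieces. This is an illustration only, not a description of any particular implementation: the `VolumePiece` type, the `pieces_to_reconstruct` function and the volume names are all hypothetical, and the sketch assumes that the set of volumes receiving I/O requests during the disconnection has already been recorded by some other means.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VolumePiece:
    """Hypothetical record: the portion of a volume residing on one disk."""
    volume: str
    disk: int


def pieces_to_reconstruct(
    pieces: list[VolumePiece],
    failed_disk: int,
    volumes_with_io: Optional[set[str]] = None,
) -> list[VolumePiece]:
    """Select the volume pieces on the disconnected disk to rebuild.

    With volumes_with_io=None, every piece on the disk is selected
    (the current approach). Otherwise only pieces of volumes that
    received an I/O request during the disconnection are selected.
    """
    on_failed = [p for p in pieces if p.disk == failed_disk]
    if volumes_with_io is None:
        return on_failed
    return [p for p in on_failed if p.volume in volumes_with_io]


# Hypothetical layout: three volumes, each with a piece on disks 0 and 1.
layout = [VolumePiece(v, d) for v in ("vol_a", "vol_b", "vol_c") for d in (0, 1)]

# Disk 1 is disconnected; only vol_b receives I/O before reconnection.
rebuild_all = pieces_to_reconstruct(layout, failed_disk=1)
rebuild_selected = pieces_to_reconstruct(layout, failed_disk=1, volumes_with_io={"vol_b"})
```

Under these assumptions, the selective call returns one piece to reconstruct and copy back where the current approach returns three.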