As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
To provide the data storage demanded by many modern organizations, information technology managers and network administrators often turn to one or more forms of RAID (redundant array of inexpensive/independent disks). Typically, the disk drive arrays of a RAID are governed by a RAID controller and associated software. In one aspect, a RAID may provide enhanced input/output (I/O) performance and reliability through the distribution and/or repetition of data across a logical grouping of disk drives.
RAID may be implemented at various levels, with each level employing a different redundancy/data-storage scheme. RAID 1 implements disk mirroring, in which a first disk holds stored data, and a second disk holds an exact copy of the data stored on the first disk. If either disk fails, no data is lost, because the data on the remaining disk is still available.
In RAID 3, data is striped across multiple disks. In a four-disk RAID 3 system, for example, three drives are used to store data and one drive is used to store parity bits that can be used to reconstruct any one of the three data drives. In such systems, a first chunk of data is stored on the first data drive, a second chunk of data is stored on the second data drive, and a third chunk of data is stored on the third data drive. An Exclusive OR (XOR) operation is performed on the data stored on the three data drives, and the result of the XOR is stored on the parity drive. If any one of the data drives, or the parity drive itself, fails, the information stored on the remaining drives can be used to recover the data on the failed drive.
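The parity scheme described above can be illustrated with a minimal sketch. It assumes byte-wise XOR over equally sized chunks; the function name `xor_chunks` and the sample chunk contents are hypothetical and chosen only for illustration, not taken from any particular RAID controller.

```python
from functools import reduce

def xor_chunks(chunks):
    """Return the byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

# Hypothetical chunks, one per data drive in a four-disk RAID 3 layout.
data = [b"AAAA", b"BBBB", b"CCCC"]

# The parity drive stores the XOR of the three data chunks.
parity = xor_chunks(data)

# Simulate losing the second data drive: XOR the surviving data chunks
# with the parity chunk to recover the missing chunk.
rebuilt = xor_chunks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same operation serves both purposes because XOR is its own inverse: combining any three of the four chunks yields the fourth, whether the missing one holds data or parity.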
Regardless of the RAID level employed, the RAID controller presents all of the disks under its control to the information handling system as a single logical unit. In some implementations, a RAID controller may use one or more hot-spare disk drives to replace a failed disk drive. In such an instance, the data of the failed drive may be reconstructed on the hot-spare disk drive using data from the other drives that are part of the logical unit. The process of reconstructing the data of a failed or replaced drive onto a substitute drive is often referred to as rebuilding the drive. By rebuilding the failed drive, the logical unit may be returned to its redundant state, with the hot-spare disk drive becoming part of the logical unit. In addition, if revertible hot-spare disk drives are supported, when the failed drive is replaced with an operational drive, the contents of the hot-spare disk drive may be copied to the replacement drive, and the hot-spare disk drive returned to “standby” status.
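The hot-spare rebuild and revert sequence can be sketched as a toy model. It assumes a single-parity logical unit in which each member drive holds one chunk; the class `LogicalUnit`, its methods, and the drive names are hypothetical and do not represent any actual controller interface.

```python
from functools import reduce

def xor_chunks(chunks):
    """Return the byte-wise XOR of a list of equally sized byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

class LogicalUnit:
    """Toy logical unit: one chunk per member drive, last member holds parity."""

    def __init__(self, data_chunks, hot_spare="spare0"):
        chunks = list(data_chunks) + [xor_chunks(data_chunks)]
        self.members = {f"disk{i}": c for i, c in enumerate(chunks)}
        self.hot_spare = hot_spare  # name of the standby drive, or None if in use

    def fail_and_rebuild(self, failed):
        """Drop a failed member and rebuild its contents onto the hot spare."""
        if self.hot_spare is None:
            raise RuntimeError("no hot spare available")
        del self.members[failed]
        # XOR of all surviving members reproduces the failed member's chunk.
        rebuilt = xor_chunks(list(self.members.values()))
        spare, self.hot_spare = self.hot_spare, None
        self.members[spare] = rebuilt
        return spare

    def revert(self, spare, replacement):
        """Copy the spare's contents to a replacement drive; spare returns to standby."""
        self.members[replacement] = self.members.pop(spare)
        self.hot_spare = spare

# Usage: fail one drive, rebuild onto the spare, then revert to a replacement.
lun = LogicalUnit([b"AAAA", b"BBBB", b"CCCC"])
original = lun.members["disk1"]
spare = lun.fail_and_rebuild("disk1")
assert lun.members[spare] == original
lun.revert(spare, "disk1_new")
assert lun.hot_spare == "spare0"
```

Note that the sketch performs two full copies of the drive's contents, one for the rebuild and one for the revert, which is why the sequence becomes time-consuming as drive capacities grow.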
Along with the increase in data storage requirements of enterprises comes a corresponding increase in the size of disk drives and of the logical units created from them. As a result, the process of rebuilding a RAID logical unit to a hot-spare disk drive and then returning the hot-spare disk drive to its hot-spare status can take a significant amount of time, especially when there is concurrent I/O to the logical unit from one or more host systems. Throughout a long rebuild, the system operates in a degraded mode, during which it is exposed to data loss if a second drive in the logical unit fails, or if a media error occurs on one of the peer drives in the logical unit. In addition, the operations required to rebuild a replacement drive consume RAID controller resources and can reduce overall performance.