As electronically stored data becomes increasingly central to the effective operation of business, government, and academia, systems that keep data safe and instantly available are in demand. The primary medium for storing user-accessible data is the hard disk drive. Because hard disk drives are not one hundred percent reliable, systems and methods have been developed to protect against failure. One such approach, the redundant array of inexpensive (or independent) disks (RAID), has been used for years to provide protection and high availability of data. A RAID configuration includes a number of independent hard disk drives and a specialized RAID controller. RAID systems can be configured to provide both data redundancy and performance enhancement, using techniques such as striping and mirroring. Striping interleaves data across multiple storage elements for better performance. Mirroring duplicates data across two or more storage elements for redundancy.
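The two techniques named above can be illustrated with a minimal sketch. It assumes a simple round-robin striping layout and models each drive as an in-memory dictionary; the names `stripe_location`, `mirror_write`, and `mirror_read`, and the parameter `stripe_unit_blocks`, are hypothetical and are not taken from any particular RAID controller.

```python
def stripe_location(lba, num_disks, stripe_unit_blocks):
    """Map a logical block address to (disk index, block offset on that disk).

    Assumes round-robin striping: consecutive stripe units are placed on
    consecutive disks, wrapping around after the last disk.
    """
    stripe_unit = lba // stripe_unit_blocks      # which stripe unit holds the block
    disk = stripe_unit % num_disks               # disk that stores this stripe unit
    row = stripe_unit // num_disks               # how many full rows precede it
    offset = row * stripe_unit_blocks + lba % stripe_unit_blocks
    return disk, offset


def mirror_write(disks, lba, data):
    """Write the same block to every disk in the mirror set."""
    for disk in disks:
        disk[lba] = data


def mirror_read(disks, lba):
    """Return the block from the first disk that is present and holds it.

    A failed disk is modeled as None (an assumption of this sketch).
    """
    for disk in disks:
        if disk is not None and lba in disk:
            return disk[lba]
    raise IOError("no surviving copy of block %d" % lba)


if __name__ == "__main__":
    # Four disks, two blocks per stripe unit: LBAs 0-1 go to disk 0, 2-3 to disk 1, ...
    for lba in range(10):
        print(lba, stripe_location(lba, num_disks=4, stripe_unit_blocks=2))
```

In this sketch, striping spreads consecutive addresses over all drives so that large transfers can proceed in parallel, while mirroring trades capacity for the ability to read any block from a surviving copy.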
When a hard disk drive in a RAID system fails, the hard disk drive is no longer available for data transfers. Specific RAID architectures, including those that use mirroring, mirroring with striping, and striping with parity, provide data redundancy so that no data is lost. However, the performance and the level of data redundancy of the RAID system are decreased until the failed hard disk drive can be replaced and rebuilt with all of the data. It is desirable that the RAID system remain online during this rebuild process, and current RAID arrays are capable of rebuilding a replacement hard disk drive without taking the RAID system offline. During the rebuild process, all read commands directed to logical block addresses (LBAs) on the hard disk drive being rebuilt must be serviced using the redundant data striped or mirrored on the other hard disk drives in the RAID array.
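As an illustration of how a read directed at a failed drive can be serviced from the remaining drives, the following minimal sketch assumes a striping-with-parity layout in which the parity block is the bytewise XOR of the data blocks in the same stripe. The helper names `xor_blocks` and `degraded_read` are hypothetical, and the sketch is not intended to represent the method of any cited patent.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def degraded_read(stripe, failed_index):
    """Reconstruct the block of the failed drive from the surviving drives.

    `stripe` holds one block per drive (data blocks plus one parity block).
    Because the XOR of all blocks in a stripe is zero, the missing block
    equals the XOR of every other block in the stripe.
    """
    survivors = [blk for i, blk in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
    parity = xor_blocks(data)
    stripe = data + [parity]
    # Drive 1 has failed; its contents are rebuilt from drives 0, 2, and parity.
    assert degraded_read(stripe, failed_index=1) == data[1]
```

Every degraded read in this scheme touches all surviving drives in the stripe, which is one reason performance drops until the rebuild completes. For a two-way mirror the same function reduces to returning the surviving copy.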
Hard disk drive rebuilds are executed in different ways with different performance impacts and levels of transparency. The most efficient and reliable methods of hard disk drive rebuilding are designed to be transparent to the user and to have minimal effect on system performance. Such a system is described in U.S. Pat. No. 5,101,492, entitled, “Data Redundancy and Recovery Protection,” the disclosure of which is hereby incorporated by reference. However, this and other conventional methods of hard disk drive rebuild take several hours to complete and, even when transparency is achieved, degrade system performance. Time is a major consideration when rebuilding a hard disk drive in a RAID system because of the inherently reduced level of data protection that exists until the drive rebuild is complete. Should another hard disk drive in the RAID system fail during the rebuild, permanent loss of data may occur.
Another major consideration for rebuild activity is the availability of data to system requests. In essence, drive rebuild activity must compete with system access activity for system resources. For high-use systems, one solution is to perform hard disk drive rebuilds during non-peak times. Such a solution is described in U.S. Pat. No. 5,822,584, entitled, “User Selectable Priority for Disk Array Background Operations,” the disclosure of which is hereby incorporated by reference. However, deferring the rebuild in this way means that, until the disk is rebuilt, the underlying data in the host system remains vulnerable during the high-traffic times that rely most on data reliability and consistency.