1. Field of the Invention
Various embodiments of the present invention relate to the field of data storage systems. More particularly, various embodiments of the present invention relate to the initialization of a redundant array of storage devices with concurrent access to the redundant array of storage devices.
2. Related Art
Secondary data storage is an integral part of most data processing systems. In the past, a typical data storage system utilized a single, expensive magnetic disk for storing large amounts of data. This single disk is generally accessed by the Central Processing Unit (CPU) through a separate Direct Memory Access (DMA) controller. The DMA controller then translates and executes the Input/Output (I/O) requests of the CPU. For single disk memory storage systems, the speed of data transfer to and from the single, large disk is much slower than the processing speed of the CPU and acts as a data processing bottleneck.
In response, redundant arrays of independent disks (RAIDs) have evolved from the single disk storage systems in order to match the speed of secondary storage access with the increasingly faster processing speeds of the CPU. To increase system throughput, the RAID architecture of secondary storage allows for the concurrent access of data from multiple disk drives.
The concept for the RAID architecture was first formalized in an article written by some members of the Department of Electrical Engineering and Computer Sciences at the University of California at Berkeley, entitled: “A Case for Redundant Arrays of Inexpensive Disks (RAID),” by D. A. Patterson, G. Gibson, and R. H. Katz, ACM SIGMOD Conference, Chicago, Ill., June 1988, hereinafter referred to as “Patterson et al.” and incorporated herein as background.
Typically, RAID architectures consist of one or more host interface controllers connected to several peripheral interface controllers via a high speed data bus. Each peripheral interface controller is, in turn, connected to several individual disk drives which provide the secondary storage for the connected hosts. Peripheral interface controllers, also referred to as array controllers herein, can be connected to the disk drives via common communication interfaces (e.g., SCSI). Generally, the speed of the data bus is greater than the speed of the interface between the disk drives and the peripheral interface controllers.
In order to reconstruct data in a redundancy group that is lost due to a failed disk, the system must define a reversible mapping from the data to its redundancy data in the group containing the lost data. Patterson et al. describe several such mappings in their paper. One such mapping is the RAID level 1 mapping that defines mirrored pairs of data. Essentially, identical copies of data exist on both physical drives of a mirrored pair. Another such mapping is the RAID level 4 (RAID-4) mapping that defines a group as an arbitrary number of disk drives containing data and a single redundancy disk. The redundancy disk is a separate disk apart from the data disks.
Still another mapping is the RAID level 5 (RAID-5) mapping. Like the RAID-4 mapping, the RAID-5 mapping distributes the data across the disks in the redundancy group, but it also distributes the redundancy data across all of those disks. As such, there is no single or separately dedicated parity disk. Distributing the redundancy in this way removes the dedicated redundancy drive(s) as the bottleneck for overlapping write operations to the array.
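The reversible mapping used by parity-based levels such as RAID-4 and RAID-5 can be illustrated with a minimal sketch. It assumes single parity computed as the bytewise XOR of the data units in a stripe; the function names `xor_parity` and `reconstruct` are hypothetical and chosen only for this illustration.

```python
from functools import reduce

def xor_parity(units):
    """Return the bytewise XOR of a list of equal-length byte strings."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))

def reconstruct(surviving_units, parity):
    """Rebuild the single missing data unit from the survivors and the parity unit.

    Assumption: exactly one unit of the stripe is lost (single redundancy).
    """
    return xor_parity(surviving_units + [parity])

# Three data units of one stripe (illustrative values only).
data = [b"\x01\x02", b"\x0f\x00", b"\xa0\x55"]
parity = xor_parity(data)

# Pretend the device holding data[1] failed and rebuild its contents.
rebuilt = reconstruct([data[0], data[2]], parity)
assert rebuilt == data[1]
```

Because XOR is its own inverse, the same routine that computes the parity also recovers any one missing unit, which is what makes the mapping reversible.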
A RAID logical array is created by consecutively mapping stripe units from a set of physical devices. For example, Prior Art FIG. 1 illustrates a storage array 100 comprising n+1 storage devices (device 0, 110; device 1, 120; on up to device n-1, 130; and device n, 140) in a RAID-5 configuration. A stripe (e.g., stripe-0 150 and stripe-1 160) includes its set of consecutive stripe units across the physical devices and its corresponding redundancies. For example, stripe-0 150 consists of stripe units 0 through n-1. Stripe-0 150 also includes redundancy (e.g., parity) in device-n 140. Correspondingly, stripe-1 160 consists of stripe units n through 2n-1. The redundancy (e.g., parity) for stripe-1 160 is located in device n-1 130.
Each stripe can have one or more redundancies. A logical device with r redundancies keeps all of its data available even if up to r physical devices fail simultaneously. As such, any failed physical device can be replaced with a functional physical device and the missing data from the failed physical device can be reconstructed.
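The layout described for FIG. 1 can be sketched as a small mapping function. The sketch assumes that the parity unit moves one device to the left on each successive stripe (device n for stripe-0, device n-1 for stripe-1, and so on, wrapping around), and that the data units of a stripe fill the remaining devices in ascending order. The text states only the parity locations for the first two stripes, so the rest of the rotation and the function name `locate_stripe_unit` are assumptions made for illustration.

```python
def locate_stripe_unit(unit, n):
    """Map a logical stripe unit to (stripe, device, parity_device).

    The array has n + 1 devices (device 0 through device n), so each stripe
    holds n data units plus one parity unit.
    """
    stripe = unit // n
    # Assumed rotation: parity starts on device n and moves left each stripe.
    parity_device = n - (stripe % (n + 1))
    position = unit % n
    # Data units skip over the device that holds this stripe's parity.
    device = position if position < parity_device else position + 1
    return stripe, device, parity_device

# With n = 4 (five devices): stripe unit 7 is the last data unit of stripe-1,
# whose parity sits on device 3, so the unit lands on device 4.
assert locate_stripe_unit(7, 4) == (1, 4, 3)
```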
Normally, the set of consecutive stripe units, starting from the first physical device 110, is initialized before any read or write accesses are performed on the array 100. For example, in normal operation, RAID volumes typically require that the data stored on the storage devices in the array be in a known state. This is to ensure that each redundancy unit on the RAID array 100 is initially consistent with the data contained in the corresponding stripe units. To ensure this, RAID arrays start a lengthy initialization operation right after the creation of the RAID array. Typically, an array of physical devices is initialized by writing known data to the entire volume. This initialization writes data patterns (normally 0s for even parity) on the stripes and their corresponding redundancies.
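The reason an all-zero pattern leaves the array in a consistent state can be shown with a minimal in-memory sketch. It assumes even (XOR) parity, under which the XOR of all units in a stripe, data and parity together, must be zero. The array is modeled as a list of devices, each a list of byte-string units; the names `initialize_array` and `stripe_is_consistent` are hypothetical.

```python
def initialize_array(num_devices, num_stripes, unit_size):
    """Return array[device][stripe] with every unit zero-filled."""
    return [[bytes(unit_size) for _ in range(num_stripes)]
            for _ in range(num_devices)]

def stripe_is_consistent(array, stripe):
    """True when the XOR of all units in the stripe (data plus parity) is zero."""
    unit_size = len(array[0][stripe])
    acc = bytearray(unit_size)
    for device in array:
        for i, byte in enumerate(device[stripe]):
            acc[i] ^= byte
    return not any(acc)

array = initialize_array(num_devices=5, num_stripes=3, unit_size=8)
assert all(stripe_is_consistent(array, s) for s in range(3))

# Overwriting a data unit without updating the parity breaks consistency.
array[0][1] = b"\x01" + bytes(7)
assert not stripe_is_consistent(array, 1)
```

The check does not depend on which device holds the parity for a given stripe, so it applies equally to a dedicated or a rotating parity layout.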
A specific drawback to the initialization process in the prior art is the significant downtime of the RAID array when undergoing initialization. Essentially, the RAID array cannot be accessed or utilized until this initialization process has been completed. The downtime can be significant and can amount to multiple hours. For example, a RAID array comprising 160 gigabyte drives, each with 20 megabytes/sec sustained write bandwidth, will require at least 8,000 seconds (approximately 2.2 hours) to complete its initialization, during which the volume is unavailable for access. The issue of downtime associated with the initialization process can worsen when the disk controller becomes the bottleneck and the write rate to each drive falls below 20 megabytes/sec.
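The downtime figure follows from dividing the drive capacity by the sustained write bandwidth. The short calculation below assumes decimal units (1 gigabyte = 10^9 bytes, 1 megabyte = 10^6 bytes) and that all drives are written in parallel, so the time is set by a single drive; the function name `initialization_seconds` is hypothetical.

```python
def initialization_seconds(drive_bytes, write_bytes_per_sec):
    """Lower bound on the time to write every byte of one drive once."""
    return drive_bytes / write_bytes_per_sec

seconds = initialization_seconds(160 * 10**9, 20 * 10**6)
hours = seconds / 3600

assert seconds == 8000.0
print(f"{seconds:.0f} seconds, about {hours:.1f} hours")
```

This gives 8,000 seconds, a little over two hours. Any drop in the per-drive write rate, for instance when the controller is the bottleneck, lengthens the time proportionally.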