A disk array is a collection of hard disk drives (HDDs) combined with array management software which controls the operation of the HDDs and presents them as one or more virtual HDDs to the host operating environment. The array management software masks the internal complexity of the disk array from the host operating environment by mapping virtual disk block addresses to member disk block addresses, so that the I/O operations are properly targeted to the physical storage. Arrays of HDDs are used to better match the I/O needs of a host computer with the performance limitations of disk drives. Using multiple storage devices to communicate data to the host system improves I/O performance for the host computer.
A Redundant Array of Independent Disks (RAID) system refers to a storage array where redundancy is provided. That is, part of the storage capacity is used to store redundant information about the user data stored on the remainder of the storage capacity. The redundant information enables regeneration of user data in the event that one of the array's member disks, or the access path to it, fails. Six levels of RAID systems are currently defined in the RAID book, "A Source for Disk Array Technology", Fourth Edition, The RAID Advisory Board.
RAID level 1 refers to data mirroring. In RAID level 2, a block of data is broken up and striped across a set of disk drives at the bit level, and ECC codes for reconstructing each data block are stored on a separate set of disk drives. In order to access a block of data, all of the drives are accessed together. In RAID level 3, a block of data is also broken up and striped across a set of drives, and parity data for reconstructing each block is stored on a separate disk drive. In RAID level 4, a set of data blocks is striped across a set of drives, with parity data for the set of blocks (used for reconstructing one of the blocks of the set) stored on a separate disk drive. In a RAID level 4 system, each block can be accessed from a single drive. A RAID level 5 system is similar to a RAID level 4 system except that no one disk drive stores all of the parity. In a RAID level 6 system, two blocks of parity are used for each set of data blocks, so that two blocks of data can be reconstructed per set. RAID level 0 refers to a disk array where data is striped but redundancy is not used.
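The reconstruction of a lost block from parity, as in RAID levels 3 through 5, can be illustrated with a short sketch. It assumes the parity block is the byte-wise XOR of the data blocks in a set, which is the conventional scheme but is not spelled out in the text above. The helper name `xor_blocks` and the sample block contents are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks of one set (illustrative contents only).
data = [b"\x01\x02\x03\x04", b"\x10\x20\x30\x40", b"\xaa\xbb\xcc\xdd"]

# Parity block, stored on a separate drive (assumed to be XOR parity).
parity = xor_blocks(data)

# Suppose the drive holding data[1] fails: XOR the surviving data blocks
# with the parity block to regenerate the missing block.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])
```

Because XOR is its own inverse, any single missing block of the set can be regenerated from the remaining blocks plus parity. Two simultaneous losses, as handled by RAID level 6, require a second, independent redundancy block.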
The host system executes application programs and device driver software which communicates with the storage devices. The application programs issue requests to access data stored on the storage devices, and these requests are routed through an I/O bus adaptor. The host system contains memory from which data is read and to which data is written. An I/O bus adaptor provides an interface between the I/O bus and the host computer memory. The adaptor accepts commands from an I/O driver and relays the commands to the addressed storage devices. The I/O bus itself is the medium over which host commands, disk responses, and data are moved between bus adaptors and the storage devices. The storage devices provide block addressable random read/write access to data storage. An example of an I/O bus is the Small Computer System Interface (SCSI).
A disk controller connects a host computer's I/O bus to the storage device I/O bus using an I/O bus adaptor, such as a channel. The controller connects the host to additional storage devices and provides for greater I/O capacity. Each of the controller channels is an independent path for data that extends the subsystem's data transfer capacity.
The stripe depth of an array refers to the number of contiguously mapped virtual disk blocks that are placed on one member disk before the mapping moves on to the next member. The appropriate stripe depth depends on the type of application requests that are made.
For application requests which specify a large amount of data, a significant portion of the I/O request execution time consists of the data transfer. If such requests are made to a virtual disk whose data is striped across the member disks, most application I/O requests to the virtual disk will result in split I/O requests. For example, if there are four member disks in an array and the requested data is mapped evenly across the three member data disks, each data disk can independently transfer its portion of the requested data. In that way, the I/O load is split among the disk drives.
Alternatively, there can be request intensive application programs in which a large number of small I/O requests are made. For such applications it is advantageous not to split the I/O request but to store all of the requested data on a single drive. For data transfer intensive applications, such as image processing, the stripe depth should ideally be set so that the average I/O request is split across all members of the array on which the data is stored. For I/O request intensive applications, such as transaction processing, the stripe depth should be set so that the average I/O request has a small probability of being split across multiple array members.
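The effect of stripe depth on request splitting can be made concrete with a minimal sketch. It assumes a simple round-robin mapping of virtual blocks onto the data disks and ignores parity placement. The function names `map_block` and `disks_touched` and the block counts used are hypothetical, chosen only for illustration.

```python
def map_block(virtual_block, stripe_depth, data_disks):
    """Map a virtual block address to (member disk index, member block address).

    Assumes round-robin striping: stripe_depth consecutive virtual blocks
    go to one member disk, then the mapping advances to the next disk.
    """
    strip = virtual_block // stripe_depth      # which strip the block falls in
    disk = strip % data_disks                  # member disk holding that strip
    row = strip // data_disks                  # how many full rows precede it
    member_block = row * stripe_depth + virtual_block % stripe_depth
    return disk, member_block


def disks_touched(start_block, length, stripe_depth, data_disks):
    """Return the set of member disks that a request must access."""
    return {
        map_block(block, stripe_depth, data_disks)[0]
        for block in range(start_block, start_block + length)
    }


# Large transfer with a small stripe depth: the request is split
# across all three data disks, which can transfer in parallel.
print(disks_touched(start_block=0, length=12, stripe_depth=4, data_disks=3))

# Small request with a large stripe depth: a single disk serves it,
# leaving the other disks free for other requests.
print(disks_touched(start_block=0, length=4, stripe_depth=64, data_disks=3))
```

Under these assumptions, a transfer intensive workload favors a stripe depth small enough that a typical request spans every data disk, while a request intensive workload favors a stripe depth several times larger than the typical request.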
An example of an I/O intensive application benefiting from striping data across a set of storage devices is a video server system. Computers are used to compress and store video material in digital format. This enables video on demand over telephone lines and pay-per-view movies in hotels. Compression technology provides the compression ratios that make it practical to store videos on disk. Disk technology allows for random access to digital data and, in an array subsystem, provides high bandwidth. One such subsystem is described in commonly owned patent application Ser. No. 08/302,625, Belnapp et al., filed Sep. 8, 1994, hereby incorporated by reference.
The video server described in the application Ser. No. 08/302,065 provides a technique for serving many simultaneous streams from a single copy of data using data striping techniques like those used in RAID systems. Data striping involves the concept of a logical file whose data is partitioned to reside in multiple file components called stripes. Each stripe is stored on a different disk volume, thereby allowing the logical file to span multiple physical disks. The disks may be either local or remote. When data is written to a logical file, it is separated into logical lengths called segments that are placed sequentially into the stripes.
For example, a logical file for a video can be segmented into M segments or blocks, each of a specific size, e.g. 256 kilobytes. The last segment may only be partially filled with data. A segment of data is placed in the first stripe, followed by the next segment that is placed in the second stripe, etc. When a segment has been written to each of the stripes, the next segment is again written to the first stripe. Thus, if a file of M segments is being striped into N stripes, then stripe 1 will contain the segments 1, N+1, 2×N+1, etc., stripe 2 will contain the segments 2, N+2, 2×N+2, etc., until all M segments are stored in one of the N striped files.
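The round-robin placement just described can be sketched in a few lines. Segments and stripes are numbered from 1 to match the text. The function name `stripe_segments` and the values M = 10, N = 4 are hypothetical.

```python
def stripe_segments(num_segments, num_stripes):
    """Distribute segments 1..M over stripes 1..N in round-robin order.

    Returns a list of N lists; entry i-1 holds the segment numbers
    stored in stripe i.
    """
    stripes = [[] for _ in range(num_stripes)]
    for segment in range(1, num_segments + 1):
        stripes[(segment - 1) % num_stripes].append(segment)
    return stripes


# A file of M = 10 segments striped into N = 4 stripes.
for number, contents in enumerate(stripe_segments(10, 4), start=1):
    print(f"stripe {number}: {contents}")
# stripe 1: [1, 5, 9]
# stripe 2: [2, 6, 10]
# stripe 3: [3, 7]
# stripe 4: [4, 8]
```

Stripe 1 holds segments 1, N+1 and 2×N+1, matching the pattern given above. When M is not a multiple of N, the later stripes hold one segment fewer.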
In RAID systems the purpose of striping is to assure data integrity in case a disk is lost. A RAID system dedicates at least one of N disks to the storage of parity data that is used when data recovery is required. In the video server system the disk storage nodes are organized as a RAID-like structure, but parity data is not always required since a copy of the video data can be available from tape storage.
In a video server, striping is used for concurrency and bandwidth reasons. Each video presentation is separated into data blocks or segments that are spread across available disk drives to enable each video presentation to be accessed simultaneously from multiple disks without requiring multiple copies.
In a stream optimized video server using data striping for increased system capacity, both the storage capacity and the number of streams are affected by adding disk drives. When disk drives are added, the data on the existing drives needs to be restriped across the existing and newly added drives to enable use of the additional capacity.
There is also a need to simultaneously support objects striped over a variable number of disk drives, so that existing objects can be accessed while new objects are striped over a larger number of drives, including the drives that the existing objects are using.
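To see why adding drives forces data movement, consider a minimal sketch that assumes the simple round-robin segment placement used in striped logical files. It only counts which segments would change stripes under that placement; it is not a restriping procedure, and the function name `segments_to_move` and the sizes used are hypothetical.

```python
def segments_to_move(num_segments, old_stripes, new_stripes):
    """List segments whose stripe changes when the stripe count changes.

    Assumes round-robin placement: segment s (numbered from 1) lives in
    stripe ((s - 1) mod N) + 1. Returns {segment: (old_stripe, new_stripe)}.
    """
    moves = {}
    for segment in range(1, num_segments + 1):
        old = (segment - 1) % old_stripes + 1
        new = (segment - 1) % new_stripes + 1
        if old != new:
            moves[segment] = (old, new)
    return moves


# Growing from 4 to 5 drives for a file of 10 segments.
moves = segments_to_move(10, old_stripes=4, new_stripes=5)
print(len(moves), "of 10 segments change stripes")
print(moves)
```

In this example only the first four segments stay where they are; every later segment maps to a different stripe. This illustrates why restriping touches most of the stored data and why it matters that the system keep serving requests while it happens.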
It is desirable that such a system continue to operate while data re-striping is taking place. This is true of systems other than video streamers using a storage array where additional storage capacity is being added.