1. Field of the Invention
This invention relates to data storage subsystems and, more particularly, to a disk striping method operable within a storage subsystem for improving both performance and storage capacity utilization.
2. Discussion of Related Art
Storage systems provide persistent storage of information for computer systems. These storage systems often integrate a number of data storage devices (e.g., tape drives, compact disks and hard disk drives) to store data and make it accessible to a computer system. Storage capacity requirements and performance requirements are ever increasing. For example, so-called multi-media computer applications require storage of vast quantities of audio and video data. Further, such applications often require high performance levels for the storage subsystems. For example, the capture of video data for storage on a disk storage subsystem often requires very high data transfer rates to store the captured data and requires that such high data transfer rates be sustained for extended periods of time. Where the volume of such data increases and reliance on computer systems to store and manipulate such data increases, disk performance and disk utilization are constant and continuing challenges.
A recording surface of a hard disk drive (disk) consists of a series of concentric tracks. The medium for recording the information is typically magnetic or optical. Modern disks typically have multiple such surfaces upon which data can be written or read. Each surface is served by an electronic read/write head and associated read/write channel. The read/write head and channel circuits encode data, write the encoded information by storing signals on the recording surface, and read previously stored information. Identically positioned tracks (radially speaking) on multiple surfaces accessed by multiple read/write (r/w) heads are often referred to as cylinders.
Each track is subdivided into a plurality of equal, generally fixed-size sectors. Each sector is capable of storing a block of data for subsequent retrieval. In a concentric track layout, the radially outermost track/cylinder has a larger circumference as compared to radially inner tracks. The outermost track is therefore potentially capable of storing more data than the inner tracks. In other words, the larger circumference of outer tracks typically accommodates more fixed-size sectors than that of inner tracks.
Computers accessing a storage subsystem usually access the storage in fixed-size units. A sector or block is generally the smallest such unit of access by a computer. However, for higher performance applications, it is common to access the storage devices in much larger, yet equal and generally fixed-size, groupings of blocks. High performance applications access larger quanta of storage space in each I/O read or write request so as to amortize the I/O processing overhead over a larger number of data units transferred. This helps achieve higher overall throughput by reducing the amount of overhead processing as compared to actual data transfer processing. For example, high performance applications might assure that each I/O request that accesses the storage subsystem is at least a full track or a full cylinder, or multiples of tracks or cylinders.
As noted above, inner portions of the disk may provide slower performance than outer portions due to physical geometry and storage density of the disks. Outer portions (tracks and cylinders) of disk drives store more sectors than inner portions due to their larger circumference. Disk drive manufacturers generally group tracks/cylinders having the same number of sectors into zones. A zone is therefore a grouping of like sized tracks/cylinders in a disk drive. Outer zones therefore have more sectors per track (SPT) as compared to inner zones. The number of tracks/cylinders in each zone is dependent upon a number of design factors considered by disk drive manufacturers. A disk manufacturer therefore provides a disk layout that identifies the specific zone configuration of a particular disk.
Outer zones generally provide better performance than inner zones for two reasons. First, since tracks/cylinders in the outer zones have more sectors per track as compared to those in inner zones, more data is transferred in a single rotation of the disk surface. Tracks in the outer zones therefore have an inherently higher transfer rate over a single rotation of the disk recording surface. Second, since each outer zone track/cylinder stores more data than inner zone tracks/cylinders, the read/write head associated with each surface of the disk need not be moved as frequently to sustain a particular data transfer rate. As a large data transfer proceeds from track to track (cylinder to cylinder) over the surface of a disk, the disk read/write head is moved to re-position over each subsequent track/cylinder of the disk. Large transfers to or from tracks in outer zones can therefore sustain a higher transfer rate than transfers to or from tracks in inner zones. A zone's performance is therefore a measure of its sustainable transfer rate, which is, in part, a function of its radial placement on the disk drive (i.e., inner zones provide lower sustained performance than outer zones).
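The relationship between sectors per track and sustained transfer rate can be illustrated with a minimal sketch. The model below is an assumption made for illustration only: each track takes one full rotation to transfer, followed by a fixed head re-positioning delay. The function name, the 512-byte sector size, the 7200 RPM spindle speed, the 2 ms re-positioning delay and the sector counts are all hypothetical values, not figures taken from any particular disk drive.

```python
def zone_transfer_rate(sectors_per_track, sector_bytes=512, rpm=7200,
                       track_switch_s=0.0):
    """Approximate sustained bytes/second for sequential transfers in one zone.

    Assumed model: each track takes one full rotation to transfer, plus an
    optional head re-positioning delay before the next track can be started.
    """
    bytes_per_track = sectors_per_track * sector_bytes
    seconds_per_track = 60.0 / rpm + track_switch_s
    return bytes_per_track / seconds_per_track


# Hypothetical two-zone drive: the outer zone holds twice the sectors per track.
outer = zone_transfer_rate(sectors_per_track=400, track_switch_s=0.002)
inner = zone_transfer_rate(sectors_per_track=200, track_switch_s=0.002)
print(f"outer zone: {outer / 1e6:.1f} MB/s, inner zone: {inner / 1e6:.1f} MB/s")
```

Under this simplified model the ratio of the two rates equals the ratio of sectors per track, since the per-track time is the same in both zones while the data moved per track differs.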
Disk striping is a technique used to enhance disk transfer performance. Striping distributes data in stripes across a plurality of disks. On each disk, data is mapped and stored in predefined blocks generally having a fixed size. A predefined number of blocks of data from each disk are mapped to define a stripe of data. Stripes are allocated across portions of each of the disk drives used for the striping. Each such portion is referred to herein as a segment. For example, a stripe will consist of N segments, where N is the number of disk drives used for striping. Striping improves overall performance of a storage subsystem by defining the stripe as parallel blocks of data across the disks. The total time required to process a large I/O transfer is then divided (approximately) by the number of drives used for the stripe. For example, a transfer of X blocks to a single disk drive may require T seconds. The same transfer of X blocks striped over N drives would therefore complete in (approximately) T/N seconds.
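The mapping of data blocks to segments and stripes can be sketched as follows. This is a minimal illustration assuming a simple round-robin layout in which consecutive segments are placed on consecutive disks; the function name, the parameter names and the example geometry (4 disks, 8 blocks per segment) are hypothetical and do not describe any particular subsystem.

```python
def locate_block(logical_block, num_disks, blocks_per_segment):
    """Map a logical block number to (disk, stripe, offset within segment).

    Assumed layout: segments are assigned round-robin, so segment 0 lands on
    disk 0, segment 1 on disk 1, and so on, wrapping to the next stripe after
    the last disk.
    """
    segment_index, offset = divmod(logical_block, blocks_per_segment)
    stripe, disk = divmod(segment_index, num_disks)
    return disk, stripe, offset


# Hypothetical array of 4 disks with 8 blocks per segment.
for block in (0, 8, 35):
    print(block, locate_block(block, num_disks=4, blocks_per_segment=8))
```

Because adjacent segments fall on different disks, a transfer spanning a full stripe touches every disk once, which is what allows the disks to work on the transfer in parallel.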
Disk striping is particularly effective to improve storage subsystem performance for applications depending on large I/O transfers. For example, applications such as real time image capture or other high speed data acquisition require high data transfer capacity. Striping improves performance by spreading large I/O operations over a plurality of disk drives. Rather than waiting for a single disk drive to process all data in a large I/O request, each of a plurality of disks processes a smaller portion of the data in the large I/O request in parallel with other disk drives.
Striping is a common technique in, for example, RAID storage subsystems. Data in many RAID (Redundant Array of Inexpensive Disks) systems is striped over a number of disk drives (the disk array). Although striping of multiple disks enhances performance, it diminishes reliability. Striping diminishes reliability because failure of any one of the disks is equivalent to the failure of the entire array. RAID systems are designed to enhance reliability, especially that which is lost by use of striping techniques. RAID systems employ many techniques to enhance reliability of stored information. One such technique is the use of mirroring to enhance the reliability of disk storage devices. Other RAID techniques use parity (i.e., Boolean exclusive-OR computed values) or other techniques to provide redundancy information capable of regenerating data lost due to a disk drive failure. In general, RAID systems use striping to increase performance and use redundancy (mirroring as well as other redundancy techniques) to assure reliable storage of data on the disk drives.
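The way exclusive-OR parity regenerates data lost to a disk drive failure can be shown with a short sketch. It assumes equal-length segments and a single failed drive; the function name and the sample byte values are hypothetical.

```python
from functools import reduce


def xor_parity(segments):
    """Return the byte-wise XOR of equal-length segments."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*segments))


# Hypothetical stripe of three data segments, one per data drive.
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]
parity = xor_parity(data)

# Simulate losing the second drive: XOR the survivors with the parity
# segment to regenerate the missing data.
survivors = [data[0], data[2], parity]
rebuilt = xor_parity(survivors)
print(rebuilt == data[1])
```

The same function serves both to compute parity and to rebuild a lost segment, because XOR-ing a value into a running result twice cancels it out.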
As noted above, sustained performance of an individual disk may vary in accordance with the location of the zones used for large transfers. In like manner, disk array performance (i.e., RAID subsystem performance) may vary depending upon the location of zones used for a particular transfer. Disk arrays are often rated for performance in accordance with their maximum performance (as experienced in outer zones) or their average performance which is an average sustained transfer rate over all zones. This average performance includes the minimum performance (as experienced in inner zones). However, many applications require that the array performance be maintained above a minimum level to sustain anticipated data transfers of the application. For example, real time image capture or high speed data acquisition must store captured data as quickly as it is generated or risk loss of data.
A typical solution in accordance with present techniques is to design or specify the storage subsystem for an application based upon the minimum performance of the subsystem rather than the average sustained rate. Therefore, many applications must use the minimum sustained performance of a disk array. Such applications may be wasting the higher performance capability of the array where outer zones happen to be in use but must do so to sustain a minimum performance level experienced within inner zones.
The performance of the inner zones, the minimum performance level, is therefore critical to many applications. Using a larger number of disks in the array, and thereby spreading the required bandwidth over still more disks operating in parallel, can increase the minimum performance level. However, such solutions may be wasteful of storage capacity where the size of individual disk drives cannot be effectively tailored to meet the requirements of the application. In other words, adding more disk drives merely to enhance performance may add more capacity than is required for the application.
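The trade-off between minimum performance and wasted capacity can be made concrete with a rough sizing sketch. It assumes, for illustration only, that array throughput scales linearly with the number of disks and that the array must be sized against the inner zone rate of each disk. The function name and all figures (rates in MB/s, capacities in GB) are hypothetical.

```python
import math


def disks_needed(required_rate, inner_zone_rate, required_capacity, disk_capacity):
    """Return (disk count, surplus capacity) for an array sized to its slowest zone.

    Assumes ideal linear scaling of throughput with the number of disks.
    """
    for_rate = math.ceil(required_rate / inner_zone_rate)
    for_capacity = math.ceil(required_capacity / disk_capacity)
    count = max(for_rate, for_capacity)
    surplus = count * disk_capacity - required_capacity
    return count, surplus


# Hypothetical application: 100 MB/s sustained and 200 GB of storage, using
# disks that sustain 12 MB/s in their inner zones and hold 100 GB each.
count, surplus = disks_needed(100, 12, 200, 100)
print(f"{count} disks, {surplus} GB of capacity beyond what is required")
```

In this example the performance requirement, not the capacity requirement, determines the number of disks, so most of the capacity purchased goes unused.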
It is evident from the foregoing description that a need exists for a disk striping method to improve both performance and storage capacity utilization of storage subsystems.