1. Field of the Invention
The present invention relates in general to improved data storage systems and in particular to an improved method and system for reading stored data from a data storage system. Still more particularly, the present invention relates to an improved method and system for fetching stored data in response to requested data and any previously fetched data in a cache.
2. Description of the Related Art
As the performance of microprocessor and semiconductor memory technology increases, there is a need for improved data storage systems with comparable performance enhancements. Additionally, in enhancing the performance of data storage systems, there is a need for improved reliability of stored data. In 1988, Patterson, Gibson, and Katz published "A Case for Redundant Arrays of Inexpensive Disks (RAID)," International Conference on Management of Data, pp. 109-116, June 1988. This paper laid the foundation for the use of redundant arrays of inexpensive disks that would not only improve the data transfer rate and data I/O rate over a comparable single disk access, but would also provide error correction at a lower cost in data storage systems.
RAID includes an array of disks which is typically viewed by a host, such as a computer system, as a single disk. A RAID controller may be a hardware and/or software tool for providing an interface between the host and the array of disks. Preferably, the RAID controller manages the array of disks for storage and retrieval and can view the disks of the RAID separately. The disks included in the array may be any type of data storage system which can be controlled by the RAID controller when grouped in the array.
The RAID controller is typically configured to access the array of disks as defined by a particular "RAID level." The RAID level specifies how the data is distributed across the disk drives and how error correction is accomplished. In the paper noted above, the authors describe five RAID levels (RAID level 1 through RAID level 5). Since the publication of the paper, additional RAID levels have been designated.
RAID levels are typically distinguished by the benefits included. Three key benefits which may be included in a RAID level are fault tolerance, data availability and high performance. Fault tolerance is typically achieved through an error correction method which ensures that information can be reconstructed in the event of a disk failure. Data availability allows the data array to continue to operate with a failed component. Typically, data availability is achieved through a method of redundancy. Finally, high performance is typically achieved by simultaneous access to multiple disk drives, which results in higher I/O request rates and data transfer rates.
Error correction is accomplished, in many RAID levels, by utilizing additional parity data stored with the original data. Parity data may be utilized to recover data lost due to disk failure. Parity data is typically stored on one or more disks dedicated to error correction only, or distributed over all of the disks within an array.
By the method of redundancy, data is stored on multiple disks of the array. Redundancy is a benefit in that redundant data allows the storage system to continue to operate with a failed component while data is being replaced through the error correction method. Additionally, redundant data is more beneficial than back-up data because back-up data is typically outdated when needed whereas redundant data is current when needed.
In many RAID levels, redundancy is incorporated through data interleaving which distributes the data over all of the data disks in the array. Data interleaving is usually in the form of data "striping" in which data to be stored is broken down into blocks called "stripe units" which are then distributed across the array of disks. Stripe units are typically predefined as a bit, byte, block or other unit. Stripe units are further broken into a plurality of sectors where all sectors are an equivalent size. A "stripe" is a group of corresponding stripe units, one stripe unit from each disk in the array. Thus, "stripe size" is equal to the size of a stripe unit times the number of data disks in the array.
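By way of illustration only, the striping arithmetic described above may be sketched as follows. The sketch assumes a simple round-robin placement of stripe units over the data disks with no parity; the names `Location`, `locate`, `stripe_size`, `sectors_per_unit` and `data_disks` are hypothetical and do not come from any particular RAID controller.

```python
from typing import NamedTuple


class Location(NamedTuple):
    stripe: int  # index of the stripe containing the sector
    disk: int    # index of the data disk holding the stripe unit
    sector: int  # sector offset inside that stripe unit


def locate(lba: int, sectors_per_unit: int, data_disks: int) -> Location:
    """Map a logical sector number to its place in a striped array.

    Assumption: stripe units are dealt out round-robin across the data disks.
    """
    unit = lba // sectors_per_unit  # which stripe unit holds the sector
    return Location(
        stripe=unit // data_disks,
        disk=unit % data_disks,
        sector=lba % sectors_per_unit,
    )


def stripe_size(sectors_per_unit: int, data_disks: int) -> int:
    """Stripe size = stripe unit size times the number of data disks."""
    return sectors_per_unit * data_disks


# With 16-sector stripe units on 4 data disks, logical sector 70 falls in
# stripe unit 4, which is the first unit of the second stripe.
print(locate(70, 16, 4))   # Location(stripe=1, disk=0, sector=6)
print(stripe_size(16, 4))  # 64
```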
In an example, RAID level 5 utilizes data interleaving by striping data across all disks and provides for error correction by distributing parity data across all disks. For each stripe, all stripe units are logically combined with each of the other stripe units to calculate parity for the stripe. Logical combination is typically accomplished by an exclusive OR (XOR) of the stripe units. For N physical drives, N-1 of the physical drives will receive a stripe unit for the stripe and the Nth physical drive will receive the parity for the stripe. For each stripe, the physical drive receiving the parity data rotates so that the parity data is not all contained on a single disk. I/O request rates for RAID level 5 are high because the distribution of parity data allows the system to perform multiple read and write functions at the same time. RAID level 5 offers high performance, data availability and fault tolerance for the data disks.
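As a minimal sketch of the parity calculation described above, the following example XORs the stripe units of one stripe to form parity and then rebuilds a lost stripe unit from the survivors. The function names are hypothetical, and the rotation rule in `parity_disk` is only one assumed placement scheme; actual controllers may rotate parity differently.

```python
from functools import reduce
from typing import Sequence


def xor_parity(units: Sequence[bytes]) -> bytes:
    """XOR equal-length stripe units together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*units))


def reconstruct(surviving: Sequence[bytes], parity: bytes) -> bytes:
    """Rebuild the single missing stripe unit from the survivors and parity."""
    return xor_parity(list(surviving) + [parity])


def parity_disk(stripe: int, n_disks: int) -> int:
    """Assumed rotation: parity moves back one drive on each successive stripe."""
    return (n_disks - 1 - stripe) % n_disks


units = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]  # N-1 = 3 stripe units
parity = xor_parity(units)                        # b"\x15\x2a"

# Simulate losing the drive that held units[1] and recover its contents.
recovered = reconstruct([units[0], units[2]], parity)
print(recovered == units[1])                      # True
print([parity_disk(s, 4) for s in range(5)])      # [3, 2, 1, 0, 3]
```

Because XOR is its own inverse, the same routine serves both to compute parity and to recover a missing stripe unit.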
Disk arrays are preferably configured to include logical drives which divide the physical drives in the disk array into logical components which may be viewed by the host as separate drives. Each logical drive includes a cross section of each of the physical drives and is assigned a RAID level. For example, a RAID system may include 10 physical drives in the array. The RAID system is accessible by a network of 4 users and it is desirable that each of the users have separate storage on the disk array. Therefore the physical drives will be divided into at least four logical drives where each user has access to one of the logical drives. Suppose logical drive 1 is configured to RAID level 5. In that case, for each stripe, data will be interleaved across the cross sections of nine of the physical drives utilized by logical drive 1 and parity data will be stored in the cross section of the remaining physical drive.
A host computer may request data from the data storage system. Typically, data requests are divided into read commands where each read command may request a fixed amount of data. Often, read commands request sequential data by a series of read requests from the host computer for sequential portions of the data. Under standard operation, upon receiving a read command, the RAID controller will check the cache for the requested data. If the requested data is available in the cache, a cache hit is issued and the data is supplied to the host computer from the cache. However, if the requested data is not available in the cache, there is a cache miss and a SCSI command is issued to the physical drive to retrieve the requested data into the cache.
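The standard read path described above may be illustrated, under simplifying assumptions, by the following sketch. The class name `ReadCache` is hypothetical, a dictionary stands in for the physical drive, and cache eviction and the actual SCSI transfer are not modeled.

```python
class ReadCache:
    """Toy model of a controller cache sitting in front of a physical drive."""

    def __init__(self, backing):
        self.backing = backing  # dict: sector number -> data (stand-in drive)
        self.sectors = {}       # sectors currently held in the cache
        self.hits = 0
        self.misses = 0

    def read(self, start, count):
        """Serve a read command for `count` sectors beginning at `start`."""
        wanted = range(start, start + count)
        if all(s in self.sectors for s in wanted):
            self.hits += 1      # cache hit: no drive access required
        else:
            self.misses += 1    # cache miss: fetch the request from the drive
            for s in wanted:
                self.sectors[s] = self.backing[s]
        return [self.sectors[s] for s in wanted]


drive = {s: "data%d" % s for s in range(32)}
cache = ReadCache(drive)
cache.read(0, 4)                 # first access misses and fills the cache
cache.read(0, 4)                 # repeated access is served from the cache
print(cache.hits, cache.misses)  # 1 1
```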
Fetching sequential data into the cache command by command is a slow means of reading data. A well-known method of minimizing fetches is prefetching data. By prefetching data, the data being fetched is brought into the cache along with n additional sectors of sequential data. Many fetching routines have been developed utilizing prefetching in an attempt to increase reading speed from a data storage system. For example, in one routine, the prefetch mode is either turned on or off and the variable n is set to the active stripe size. Another routine determines the n sectors to fetch based on a cache hit ratio. In all of these previous methods, the decisions of whether to prefetch and how much to prefetch are independent of what the current read command requests.
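The effect of prefetching n additional sectors on a run of sequential read commands may be sketched as follows. This is an illustrative model only: the function `count_fetches` and its parameters are hypothetical, n is fixed regardless of the request (as in the prior routines described above), and cache capacity and eviction are not modeled.

```python
def count_fetches(requests, n_prefetch, disk_sectors):
    """Count drive fetches needed to serve `requests` with fixed prefetch.

    requests:     iterable of (start_sector, sector_count) read commands
    n_prefetch:   extra sequential sectors brought in on every fetch
    disk_sectors: total sectors on the drive (prefetch stops at the end)
    """
    cached = set()
    fetches = 0
    for start, count in requests:
        wanted = range(start, start + count)
        if all(s in cached for s in wanted):
            continue  # cache hit, no fetch needed
        fetches += 1
        end = min(start + count + n_prefetch, disk_sectors)
        cached.update(range(start, end))  # requested data plus prefetch
    return fetches


# Eight sequential read commands of 8 sectors each (sectors 0 through 63).
sequential = [(i * 8, 8) for i in range(8)]
print(count_fetches(sequential, 0, 1024))   # 8: one fetch per command
print(count_fetches(sequential, 24, 1024))  # 2: each fetch covers 32 sectors
```

In this model a larger n reduces the number of fetches for sequential reads, but the same fixed n would also be applied to a non-sequential request, where the additional sectors may never be used.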
However, basing prefetch methods on past data and not on the requested data limits the reliability of the prefetched data. If too much data is prefetched, then other data may be prematurely pushed out of the cache. Conversely, if too little data is prefetched, then more fetches may be required to perform a sequence of reads. Additionally, if the current read request is not sequential, time may be spent unnecessarily prefetching data that will not be needed for the next request. It should therefore be apparent that an improved method and system are needed for reliably fetching data such that the number of fetches in a sequential fetching sequence is minimized and such that data is prefetched to the cache in response to the current read request. In addition, such a method should constrain the amount of data prefetched for non-sequential fetching.