Advances in semiconductor technology have delivered processors with more computing power than that of a mainframe. While processing speed has increased tremendously, the input/output (I/O) speed of secondary storage devices such as disk drives has not kept pace. Because the processing throughput of a system depends in part on its slowest component, the bottleneck associated with an unduly slow storage system may neutralize the speed advantages of a fast host processor. Additionally, the use of multiple applications may further accentuate the imbalance between the host computer and the peripheral I/O performance. Thus, a high performance disk drive system has become a requirement in a modern computer. To address the performance requirement, a redundant array of independent disks (RAID) is used to store data on several disks concurrently.
Typically, disk I/O performance is dominated by the time required for the mechanical parts of the disk to move to the location where the data is stored. After a disk controller receives an instruction from a consumer or an application, the disk controller causes a disk drive motor to actuate the disk heads to move to the appropriate location and retrieves the requested data. The time required to position the disk head over the recording surface of a disk is known as the seek time. Seek times for random disk accesses are, on average, orders of magnitude longer than the data transfer times of a semiconductor memory device. Additionally, because disk drives have spinning magnetic media platters, a rotational latency is also introduced while the platter spins to bring the data into place for reading. These rotational latencies are also orders of magnitude greater than the data transfer times of semiconductor memory devices. For example, when an enterprise level disk drive performs a track read, a ⅓ stroke seek, and a second track read, the time required to read the data on the first track and the second track is small relative to the time required to seek across ⅓ of the disk surface. The seek settle time is the amount of time required to move the head from an initial track, the first track, to a target track, such as the second track, and to stop the head from moving across the track. In the best performance 3.5″ disk drives available today, the seek settle time can be 3.5 ms, while a single track of data can be read in about 139 μs at 7200 RPM or 100 μs at 10,000 RPM. This demonstrates the dramatic reduction in data transfer performance whenever the head must be relocated.
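The figures quoted above can be put side by side with a few lines of arithmetic. The following is a minimal sketch that uses only the seek settle time and track read times stated in the text; the function names are illustrative, and the average rotational latency of half a revolution is a common rule of thumb that the text itself does not state.

```python
# Figures quoted in the text above.
SEEK_SETTLE_S = 3.5e-3                            # seek settle time, seconds
TRACK_READ_S = {7200: 139e-6, 10_000: 100e-6}     # track read time by RPM


def seek_to_read_ratio(rpm):
    """How many track reads fit into one seek settle time."""
    return SEEK_SETTLE_S / TRACK_READ_S[rpm]


def avg_rotational_latency_s(rpm):
    """Average rotational latency, assuming half a revolution on average."""
    seconds_per_revolution = 60.0 / rpm
    return 0.5 * seconds_per_revolution


if __name__ == "__main__":
    for rpm in TRACK_READ_S:
        print(
            f"{rpm} RPM: one seek costs {seek_to_read_ratio(rpm):.1f} track reads, "
            f"average rotational latency {avg_rotational_latency_s(rpm) * 1e3:.2f} ms"
        )
```

With the quoted figures, a single head relocation costs roughly 25 track reads at 7200 RPM and 35 track reads at 10,000 RPM, before any rotational latency is counted.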
To minimize the seek and rotational time delays, disk systems incorporate RAID controller based disk caches, which take advantage of the principle of locality of reference well known in the computer programming art. Typically, the data from the disk is buffered by a large semiconductor memory within the RAID controller that has a relatively fast access time. If the data requested by the application already resides in the cache memory, the RAID controller can transfer the data directly from the cache memory to the requesting application. Performance is increased because accessing data from the cache memory is substantially faster than accessing data from the disk drive.
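The lookup path described above can be sketched in a few lines. This is an illustrative model only, not the design of any particular RAID controller: the class name `CachedDisk`, the `read_block_from_disk` callback, and the unbounded dictionary standing in for the cache memory are all assumptions made for the example.

```python
class CachedDisk:
    """Toy read cache in front of a slow block device (hypothetical model)."""

    def __init__(self, read_block_from_disk):
        # Callback that performs the slow disk access (assumed interface).
        self._read_block_from_disk = read_block_from_disk
        # Stand-in for the controller's cache memory; unbounded for simplicity.
        self._cache = {}
        self.hits = 0
        self.misses = 0

    def read(self, block_no):
        # Cache hit: serve the data directly from memory, no disk access.
        if block_no in self._cache:
            self.hits += 1
            return self._cache[block_no]
        # Cache miss: go to the disk, then keep a copy for later requests.
        self.misses += 1
        data = self._read_block_from_disk(block_no)
        self._cache[block_no] = data
        return data
```

A second request for the same block is served from memory and never reaches the disk callback, which is where the performance gain comes from.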
Although often quite effective, such a cache can experience a performance degradation caused in part by the sensitivity of the disk cache to cache hit statistics. A disk cache system having a low hit rate may perform more poorly than an uncached disk due to caching overhead and queuing delays, among other factors.
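One simple way to see why a low hit rate can make a cached disk slower than an uncached one is to model the average access time. The sketch below assumes a fixed caching overhead charged on every request and ignores queuing delays; the model, the function names, and all timing values in the usage note are hypothetical and are not taken from the text.

```python
def effective_access_time(hit_rate, t_cache, t_disk, overhead):
    """Average time per request for a cached disk (simplified model).

    Every request pays `overhead`; hits then cost `t_cache` and
    misses cost `t_disk`. Queuing delays are not modeled.
    """
    return overhead + hit_rate * t_cache + (1.0 - hit_rate) * t_disk


def break_even_hit_rate(t_cache, t_disk, overhead):
    """Hit rate at which the cached disk matches an uncached disk.

    Derived by setting effective_access_time(...) equal to t_disk.
    """
    return overhead / (t_disk - t_cache)
```

For example, with an assumed 8 ms disk access, 0.1 ms cache access, and 0.5 ms of overhead per request, the break-even hit rate is a little over 6%; below that point the cached system is slower on average than the bare disk.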
One factor affecting the cache performance is the size of the disk cache. With a limited cache memory, a multitude of requests over a variety of data segments can easily exhaust the capability of the disk cache system to retain the desirable data in the cache memory. Often, data that may be reused in the near future is flushed prematurely to make room in the cache memory for handling new requests from the host computer, leading to an increase in the number of disk accesses needed to fill the cache. This increase in disk activity, also known as thrashing, sets up a self-defeating cycle in which refilling the cache with previously flushed data consumes a disproportionate share of the disk drive utilization.
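The thrashing effect can be reproduced with a small simulation. The text does not name a replacement policy, so the sketch below assumes least-recently-used (LRU) eviction; the function name and the access patterns are made up for illustration.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, capacity):
    """Fraction of accesses served from an LRU cache of `capacity` blocks.

    `accesses` must be a non-empty sequence of block numbers.
    """
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # flush the least recently used block
            cache[block] = True
    return hits / len(accesses)


if __name__ == "__main__":
    fits = [i % 8 for i in range(800)]     # working set of 8 blocks
    thrashes = [i % 9 for i in range(900)] # working set of 9 blocks
    print(lru_hit_rate(fits, capacity=8))      # 0.99
    print(lru_hit_rate(thrashes, capacity=8))  # 0.0
```

With a cache of 8 blocks, a cyclic working set of 8 blocks misses only on the first pass, while a working set of just 9 blocks never hits: each block is flushed immediately before it is needed again, so every request goes to the disk.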
A related factor affecting the hit rate is the cache memory block size allocation. An allocation of relatively large blocks of memory reduces the quantity of individually allocatable memory blocks. In systems having multiple concurrent tasks and processes that require access to a large number of data files, a reduction in the number of individually allocatable blocks increases the probability of cache block depletion, once more leading to disk thrashing, which decreases the overall disk system throughput.
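The trade-off between block size and block count is simple division. The sketch below uses hypothetical cache and block sizes chosen only to make the arithmetic concrete; none of the values come from the text.

```python
KiB = 1024
MiB = 1024 * KiB


def allocatable_blocks(cache_bytes, block_bytes):
    """Number of whole blocks that fit in a cache of the given size."""
    return cache_bytes // block_bytes


if __name__ == "__main__":
    cache = 64 * MiB  # assumed cache size
    for block in (64 * KiB, 1 * MiB):
        print(f"{block // KiB} KiB blocks: {allocatable_blocks(cache, block)}")
```

Under these assumed sizes, a 64 MiB cache holds 1024 blocks of 64 KiB but only 64 blocks of 1 MiB, so a workload touching a few hundred files concurrently would deplete the larger-block configuration long before the smaller-block one.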
Another factor affecting the performance of the disk cache is the read-ahead policy for prefetching data into the cache. Prefetching data into the cache enhances performance when the application, or consumer, issues sequential data requests. However, in the event that the data is accessed in a random manner, the prefetching policy may be ineffective as data brought into the cache is not likely to be used again soon.
Additionally, the prefetching policy may cause a bottleneck on the disk data path, as each attempt to prefetch data from the disk into the cache memory potentially creates contention for the data path between the disk drive and the application. Thus, an automatic prefetch of data in a system with a large percentage of random I/O operations may degrade the overall system performance. As a result, the prefetching of data into the cache memory must be used judiciously to minimize the data path contention and the overhead associated with loading data into the cache.
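The cost of unconditional read-ahead under random access can be illustrated with a toy simulation. The sketch below is not a policy described in the text: the function `simulate`, the unbounded cache, and the "prefetch only after two consecutive block numbers" heuristic selected by `adaptive=True` are all assumptions made for the example. Blocks transferred from the disk serve as a rough proxy for data path contention.

```python
import random


def simulate(accesses, read_ahead, adaptive=False):
    """Return (cache_hits, blocks_transferred_from_disk).

    read_ahead: number of following blocks to prefetch after an access.
    adaptive:   if True, prefetch only when the current block directly
                follows the previous one (hypothetical heuristic).
    """
    cache = set()  # unbounded cache, to isolate the prefetch effect
    hits = 0
    transferred = 0
    previous = None
    for block in accesses:
        if block in cache:
            hits += 1
        else:
            cache.add(block)
            transferred += 1
        sequential = previous is not None and block == previous + 1
        if read_ahead and (sequential or not adaptive):
            for nxt in range(block + 1, block + 1 + read_ahead):
                if nxt not in cache:
                    cache.add(nxt)
                    transferred += 1  # each prefetched block uses the data path
        previous = block
    return hits, transferred


if __name__ == "__main__":
    sequential = list(range(100))
    scattered = [10 * i for i in range(1000)]
    random.Random(0).shuffle(scattered)  # random order, no adjacent blocks
    for name, pattern in (("sequential", sequential), ("random", scattered)):
        print(name, simulate(pattern, 4), simulate(pattern, 4, adaptive=True))
```

In this model, sequential requests hit almost every time under either policy, while the scattered pattern gets no hits at all; unconditional read-ahead of 4 blocks then moves five times as much data over the disk path as the selective variant for no benefit.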
Thus, a RAID controller system is needed that mitigates the seek and rotational latencies and the low data transfer rates commonly associated with disk accesses. Further, it is desirable that the read-ahead disk cache minimize the loss of performance that occurs when random accesses are frequent.
In view of the ever-increasing demand for applications that require access to very large data files, such as video on demand, it is increasingly critical that answers be found to these problems. Solutions to these problems have been long sought but prior developments have not taught or suggested any solutions and, thus, solutions to these problems have long eluded those skilled in the art.