Modern large-scale information systems use dedicated storage systems or devices that may include thousands of magnetic or optical disks and other storage media. Usually, the storage system or device serves several clients or hosts over various types of communication links (LAN, WAN, wireless, direct connection, etc.). A host device is any type of electronic device that needs either to access data stored on the storage system or to store data in the storage system or device.
When a host device (also referred to as host computer) sends data to a storage device, the storage device may store the data on one or more storage media including hard disk drives, optical storage drives, magnetic tape drives, or semiconductor storage devices. In a typical implementation, the storage device is configured as a storage controller that controls a plurality of hard disk drives.
The host reads data from the storage device by requesting the data from the storage device. In response to the request, the storage device retrieves the data from the storage media and communicates the data to the host.
Unfortunately, there is typically a significant delay between when the storage device receives the request and when the storage device is able to communicate the data.
Data pre-fetching techniques are used to reduce the above-mentioned delay substantially. The basic scheme of these techniques is that the storage device retrieves data from the slow storage media in advance and writes the data to a cache. The cache can receive and communicate data much faster than electro-mechanical storage media such as magnetic disks, which reduces the time required to perform reads. Hence, when the storage device receives a request from a host for the data, the request is serviced from the high-speed cache instead of from the magnetic or optical disk, which takes longer. In this way, data is communicated to the host much faster.
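The pre-fetch scheme described above can be illustrated with a minimal sketch. It assumes a fixed-depth sequential read-ahead policy and least-recently-used eviction, neither of which is specified in the text; the names `ReadAheadCache`, `backing_store`, `capacity`, and `prefetch_depth` are hypothetical and chosen only for illustration.

```python
from collections import OrderedDict


class ReadAheadCache:
    """Toy storage-controller cache with fixed-depth sequential read-ahead.

    Assumptions (not from the text): blocks are addressed by integer,
    the next `prefetch_depth` blocks are fetched after every read, and
    the least recently used block is evicted when the cache is full.
    """

    def __init__(self, backing_store, capacity=8, prefetch_depth=2):
        self.backing_store = backing_store    # dict standing in for slow disks
        self.capacity = capacity              # maximum number of cached blocks
        self.prefetch_depth = prefetch_depth  # blocks fetched ahead of each read
        self.cache = OrderedDict()            # ordered oldest to newest
        self.disk_reads = 0                   # accesses to the slow medium
        self.cache_hits = 0                   # host reads served from cache

    def _load(self, block):
        """Copy one block from the slow medium into the cache."""
        if block in self.cache or block not in self.backing_store:
            return
        self.disk_reads += 1
        self.cache[block] = self.backing_store[block]
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)    # evict the oldest entry

    def read(self, block):
        """Serve a host read, then pre-fetch the following blocks."""
        if block in self.cache:
            self.cache_hits += 1
            self.cache.move_to_end(block)     # mark as recently used
        else:
            self._load(block)
        data = self.cache[block]              # KeyError if the block does not exist
        for ahead in range(block + 1, block + 1 + self.prefetch_depth):
            self._load(ahead)
        return data
```

With a host that reads blocks in ascending order, only the first read in this sketch waits on the slow medium; later reads find their block already in the cache. A host with a random access pattern would gain little from the same policy.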
In general, a main problem with any implementation of pre-fetch methods is that the storage device cannot decide with 100% certainty which of the pre-fetched data will actually be used by the host. It should be noted that the system pays a price for every pre-fetch, both in disk activity and in cache space, and hence the number of false pre-fetch decisions needs to be reduced.
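One way to quantify the cost of false pre-fetch decisions is to compare the blocks that were pre-fetched with the blocks the host eventually requested. The sketch below is only an illustration; the function name `prefetch_waste` and the idea of a "waste ratio" are assumptions introduced here, not terms from the text.

```python
def prefetch_waste(prefetched_blocks, requested_blocks):
    """Return (used, wasted, waste_ratio) for a collection of pre-fetched blocks.

    used        : pre-fetched blocks the host actually requested
    wasted      : pre-fetched blocks the host never requested (false pre-fetches)
    waste_ratio : fraction of pre-fetched blocks that were wasted
    """
    prefetched = set(prefetched_blocks)
    requested = set(requested_blocks)
    used = prefetched & requested
    wasted = prefetched - requested
    ratio = len(wasted) / len(prefetched) if prefetched else 0.0
    return used, wasted, ratio
```

For example, if blocks 1 to 4 are pre-fetched and the host requests only blocks 1, 2, and 7, then blocks 3 and 4 consumed disk activity and cache space without benefit, giving a waste ratio of 0.5.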
There is a need to develop pre-fetch methods that are based on intelligent criteria, that pre-fetch appropriate amounts of data, and that reduce the number of false pre-fetch operations.