Computer data storage devices, such as disk drives and Redundant Arrays of Independent Disks (RAID), typically use a cache memory in combination with mass storage media (e.g., magnetic tape or disk) to save and retrieve data in response to requests from a host device. Cache memory, often referred to simply as “cache”, offers improved performance over implementations without cache. Cache typically includes one or more integrated circuit memory devices, which provide a very high data rate in comparison to the data rate of the non-cache mass storage medium. Due to unit cost and space considerations, cache memory is usually limited to a relatively small fraction (e.g., 256 kilobytes in a single disk drive) of the mass storage medium capacity (e.g., 256 gigabytes). As a result, the limited cache memory should be used as efficiently and effectively as possible.
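To give a sense of scale, the example sizes above can be compared directly. The short calculation below is only illustrative; it assumes binary units (1 kilobyte = 1024 bytes, 1 gigabyte = 1024^3 bytes), which the text does not specify, and the variable names are hypothetical.

```python
# Assumption: binary units; the text does not say which convention applies.
KIB = 1024
GIB = 1024 ** 3

cache_bytes = 256 * KIB   # example cache size for a single disk drive
media_bytes = 256 * GIB   # example mass storage medium capacity

# Ratio of cache capacity to mass storage capacity.
cache_fraction = cache_bytes / media_bytes
media_to_cache_ratio = media_bytes // cache_bytes

print(f"cache holds 1/{media_to_cache_ratio} of the medium ({cache_fraction:.2e})")
```

Under these assumptions the cache can hold roughly one millionth of the stored data at any time, which is why the choice of what to keep in cache matters.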
Cache is typically used to temporarily store data prior to transferring the data to an ultimate destination. For example, a read cache is often used to temporarily store data that is destined for the host device. In addition, a write cache is typically used to temporarily store data from the host device that is destined for the mass storage medium. Thus, cache is typically divided into a read cache portion and a write cache portion. Data in cache is typically processed on a page basis. The size of a page is generally fixed in any particular implementation; a typical page size is 64 kilobytes.
Generally, storage device performance improves as read cache hit rate goes up. A read cache hit is an event in which requested data is available in the read cache to satisfy the request. Read cache hit rate is a measure of how often a request is satisfied from the read cache rather than from another type of memory, such as the mass media (e.g., a disk). As is generally understood, the mass media typically takes much longer to access than the read cache. Thus, by increasing the read cache hit rate, the data input/output (I/O) rate to the host can be increased. In order to take advantage of the relatively faster read cache, typical storage devices attempt to predict what data a host device will request in the near future and pre-fetch that data; that is, read the data from the mass media and store it in the read cache so that it is available in the read cache when the host actually requests it. Pre-fetching has been found to increase the likelihood of a read cache hit.
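The relationship between pre-fetching and read cache hit rate can be illustrated with a small simulation. The following is a minimal sketch rather than any particular device's implementation: the `ReadCache` class, its least-recently-used eviction policy, and the unconditional "read the next pages" pre-fetch rule are all assumptions made for illustration.

```python
from collections import OrderedDict


class ReadCache:
    """Toy read cache keyed by page number (hypothetical, LRU eviction assumed)."""

    def __init__(self, capacity_pages, prefetch_depth=0):
        self.capacity = capacity_pages
        self.prefetch_depth = prefetch_depth
        self.pages = OrderedDict()  # page number -> data, oldest first
        self.hits = 0
        self.requests = 0

    def _load(self, page):
        # Stand-in for a slow read from the mass media.
        self.pages[page] = f"data-{page}"
        self.pages.move_to_end(page)
        while len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page

    def read(self, page):
        self.requests += 1
        if page in self.pages:
            self.hits += 1                  # read cache hit
            self.pages.move_to_end(page)
        else:
            self._load(page)                # read cache miss
        data = self.pages[page]
        # Naive pre-fetch: assume the host will ask for the following pages.
        for nxt in range(page + 1, page + 1 + self.prefetch_depth):
            if nxt not in self.pages:
                self._load(nxt)
        return data

    def hit_rate(self):
        return self.hits / self.requests if self.requests else 0.0


def run_sequential(cache, n_pages):
    """Issue reads for pages 0..n_pages-1 and return the resulting hit rate."""
    for page in range(n_pages):
        cache.read(page)
    return cache.hit_rate()


no_prefetch = run_sequential(ReadCache(capacity_pages=4), 10)
with_prefetch = run_sequential(ReadCache(capacity_pages=4, prefetch_depth=2), 10)
print(no_prefetch, with_prefetch)
```

For a purely sequential stream, every request misses without pre-fetching, whereas with pre-fetching only the first request misses. A real device would pre-fetch only when it expects sequential access, since needless pre-fetches displace useful pages from the limited cache.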
One way to decide whether to pre-fetch data is by identifying “sequential workloads” during operation. A sequential workload is generally a host workload that includes one or more requests for data at logical addresses that are substantially sequential. After detecting a sequential workload, the storage device can pre-fetch data in the detected sequence and store that data in the read cache. In a traditional system, after data is pre-fetched, the system must continue to attempt to detect sequential workloads before performing subsequent pre-fetches. Thus, any benefits that might be gained from subsequent pre-fetches are typically contingent upon detecting subsequent sequential workloads.
Unfortunately, detecting sequential workloads can be time-consuming and resource-intensive. A typical process for detecting a sequential workload involves storing a number of host requests in memory, sorting the addresses associated with the stored host requests, often in numerical order, and then attempting to identify a sequential pattern in the sorted addresses. Identifying a sequential pattern often involves use of a resource-expensive sequential pattern recognition algorithm based on the sorted addresses. The memory required to separately store host requests, together with the processor time and memory required to sort the request addresses and identify patterns in them, can result in inefficient use of storage device resources. Any resource (e.g., memory or processor time) that is used to detect a sequential workload may therefore not be available for host I/O.
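The store, sort, and scan steps described above can be sketched as follows. This is an illustrative example only: the function name `detect_sequential`, the fixed `block_size` stride, and the `min_run` threshold are hypothetical, and the exact-stride test is a simplification of "substantially sequential".

```python
def detect_sequential(request_addresses, block_size=1, min_run=3):
    """Return (start_address, run_length) of the longest sequential run
    among the stored request addresses, or None if no run reaches min_run.

    Hypothetical sketch: addresses count as sequential only when they
    differ by exactly block_size.
    """
    # Step 1: the stored host requests are copied and sorted numerically.
    addrs = sorted(set(request_addresses))

    # Step 2: scan the sorted addresses for the longest run.
    best = None
    run_start, run_len = None, 0
    prev = None
    for addr in addrs:
        if prev is not None and addr - prev == block_size:
            run_len += 1                     # run continues
        else:
            run_start, run_len = addr, 1     # run restarts here
        if best is None or run_len > best[1]:
            best = (run_start, run_len)
        prev = addr

    if best is not None and best[1] >= min_run:
        return best
    return None


mixed = detect_sequential([104, 7, 101, 103, 50, 102])
scattered = detect_sequential([5, 90, 17, 42])
print(mixed, scattered)
```

Even in this simplified form, the cost is visible: the request history must be held in memory, and each invocation sorts it again (O(n log n) in the number of stored requests) before the scan can begin.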
Moreover, the sequential workload detection process is typically executed repeatedly in a storage device, as host read requests arrive, before any data is pre-fetched. As a result, the inefficiencies of the sequential detection process discussed above are often compounded when determining whether to pre-fetch data. Thus, traditional methods of determining when to pre-fetch data typically utilize storage device resources inefficiently.