As processing power has increased and storage costs have fallen, applications have emerged to leverage this capacity to rapidly manipulate large quantities of data. For example, video editing applications may store, read, write, and edit video files of several gigabytes or more. In fact, terabyte-sized datasets are not uncommon and may be used to manage organizations, to monitor logistics, to track and analyze customer behavior, and to perform scientific analysis ranging from physics simulations to oil and gas exploration. To support these applications and others, a storage architecture may incorporate Network Attached Storage (NAS) devices, Storage Area Network (SAN) devices, and other configurations of storage elements and controllers that interface with any number and manner of storage devices.
In many cases, it is the storage devices themselves that are the limiting factor in performance. Magnetic hard disk drives (HDDs) have high capacities and are affordable, but often have latencies that are an order of magnitude greater than the next-fastest technology. The nature of rotating platters and seeking heads means that random read/write performance is particularly slow. To improve storage performance in light of these limitations, the HDDs may be incorporated into a heterogeneous collection of storage devices arranged in a cache hierarchy. A cache hierarchy dynamically maps subsets of the address space to smaller pools of faster devices, so that data may be read from and written to the faster devices as frequently as possible.
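By way of illustration only, the mapping of a subset of the address space onto a smaller, faster pool may be sketched as follows. The sketch assumes a single fast tier with least-recently-used eviction and a write-through policy; the class name `TieredStore`, its methods, and the dictionary standing in for the HDD tier are hypothetical and are not drawn from any particular storage system.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-tier store: a small fast tier in front of a large slow tier."""

    def __init__(self, backing, fast_capacity):
        self.backing = backing              # slow tier: address -> data (stands in for HDDs)
        self.fast = OrderedDict()           # fast tier: addresses currently mapped to it
        self.fast_capacity = fast_capacity
        self.hits = 0
        self.misses = 0

    def _install(self, address, data):
        # Map the address into the fast tier; evict the least recently used entry if full.
        self.fast[address] = data
        self.fast.move_to_end(address)
        if len(self.fast) > self.fast_capacity:
            self.fast.popitem(last=False)

    def read(self, address):
        if address in self.fast:
            # Served from the fast tier.
            self.hits += 1
            self.fast.move_to_end(address)
            return self.fast[address]
        # Served from the slow tier, then mapped into the fast tier for later reads.
        self.misses += 1
        data = self.backing[address]
        self._install(address, data)
        return data

    def write(self, address, data):
        # Write-through (an assumption): update the slow tier and keep the fast copy current.
        self.backing[address] = data
        self._install(address, data)


backing = {addr: f"block-{addr}" for addr in range(100)}
store = TieredStore(backing, fast_capacity=4)
for addr in [1, 2, 1, 3, 1, 2]:
    store.read(addr)
print(store.hits, store.misses)  # prints "3 3"
```

In this sketch the first access to each address is served by the slow tier, and repeated accesses are served by the fast tier until the entry is evicted. A practical hierarchy may have more than two tiers and may use a different eviction or write policy.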
The caches may be loaded with data when transactions request it and may also be loaded by prefetch operations that attempt to predict what data will be used next. A number of techniques exist for determining which data to load into which particular cache. However, when a workload is variable or random, conventional caching algorithms may not make accurate prefetch predictions, which hurts cache efficiency and performance. Accordingly, while conventional prefetching techniques have been generally adequate, an efficient system and method for improved prefetching has the potential to dramatically improve cache hit rate and system performance.
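The sensitivity of prefetching to workload may be illustrated with a minimal sketch. The sketch assumes a simple sequential read-ahead policy, in which each miss also loads a fixed number of following blocks. This policy is chosen only as a familiar example of a conventional technique; the function `hit_rate`, its parameters, and the two synthetic access traces are hypothetical.

```python
import random
from collections import OrderedDict


def hit_rate(accesses, capacity=64, readahead=4):
    """Replay block accesses through an LRU cache with sequential read-ahead."""
    cache = OrderedDict()
    hits = 0

    def load(block):
        # Place a block in the cache, evicting the least recently used block if full.
        cache[block] = True
        cache.move_to_end(block)
        if len(cache) > capacity:
            cache.popitem(last=False)

    for block in accesses:
        if block in cache:
            hits += 1
            cache.move_to_end(block)
        else:
            load(block)  # demand load: the transaction requested this block
            for offset in range(1, readahead + 1):
                load(block + offset)  # prefetch: predict that the following blocks come next
    return hits / len(accesses)


sequential = list(range(1000))                               # strictly ascending block numbers
rng = random.Random(0)                                       # fixed seed for repeatability
scattered = [rng.randrange(1_000_000) for _ in range(1000)]  # random block numbers

print("sequential:", hit_rate(sequential))
print("random:    ", hit_rate(scattered))
```

On the ascending trace, four of every five accesses are satisfied by prefetched blocks. On the random trace, the prediction that the following blocks will be used next is rarely correct, so the hit rate is near zero and the prefetched blocks consume cache space without benefit.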