As an increasing amount of information processing is being performed electronically, and as the speed of that processing increases, there is a corresponding demand to improve the performance of the systems and services that manage the information being processed. A substantial amount of data is still stored on magnetic disk drives and similar devices, due at least in part to the relatively low cost of these devices, but the time needed to read data from, and write data to, these devices is a significant source of latency. One approach to reducing latency is to pre-fetch data from these devices and store the data in faster memory (e.g., solid state memory). Conventional computing systems utilizing locally attached disks typically must be conservative in issuing pre-read commands, however, as each read command utilizes bus bandwidth and there is a limited amount of solid state memory available to hold the pre-fetched data. Further, there are no existing approaches that accurately predict which data should be pre-fetched. As a result, either less data is pre-fetched than is needed, which yields smaller latency improvements, or a significant amount of data is pre-fetched that is not needed, which can result in a significant drain on available network resources.
Even where conventional systems look to recent read requests to determine the data to be pre-fetched, these systems are typically limited in the amount of data they can access for the determinations. For example, an operating system (OS) on a computer might look at certain processes, and/or a RAID controller might look at data at another level independent of the OS. A disk drive might also have some amount of cache and perform some level of pre-fetching as well. At each level, a decision can be made based on read requests received in the recent past, which can be limited in scope. For example, the disk drive will have the least amount of information and the narrowest view, but will not affect any other devices, as the drive is only occupying its own cache and spindle time. Pre-fetching at the RAID level provides a somewhat broader view but can reduce the available RAID cache, and moving up to the OS level can provide a still broader view but can end up filling OS memory. Thus, as the amount of information used for the determination increases, the potential for negatively impacting performance of the network increases as well. Further, such information typically is not persisted, such that each time a device is rebooted, a significant amount of time passes between reads, or any other such action occurs, the historical information is lost and the prediction has to start from scratch.