In networked computing systems, one computing system may make a request of another computing system. The time that elapses between the requesting computing system dispatching the request and that computing system receiving the response is referred to as “latency”. The lower the latency, the greater the responsiveness.
Caching is an approach for improving responsiveness for a subset of such requests by servicing some requests using historical information. For example, a web browser may record pages previously visited. If the user browses to one of those web pages while the page is still cached, the page may be drawn from the local cache, rather than having to be re-requested over a network. Such caches are in widespread use in modern networking systems. However, the efficacy of caches is primarily determined by a backward-looking model in which requests may be satisfied by information cached in response to prior requests.
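By way of illustration only, the backward-looking behavior described above may be sketched as follows. The sketch assumes a simple time-based expiry; the names `PageCache`, `fetch`, `ttl_seconds`, `clock`, and `network_requests` are hypothetical and are not drawn from any particular browser.

```python
import time
from typing import Callable, Dict, Tuple


class PageCache:
    """Illustrative backward-looking cache (hypothetical, not a browser API).

    A page can be served locally only if it was already requested earlier
    and its entry has not yet expired.
    """

    def __init__(
        self,
        fetch: Callable[[str], str],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch          # performs the actual network request
        self._ttl = ttl_seconds      # assumed expiry policy: fixed lifetime
        self._clock = clock          # injectable so expiry can be exercised
        self._entries: Dict[str, Tuple[float, str]] = {}
        self.network_requests = 0    # counts requests that left the cache

    def get(self, url: str) -> str:
        now = self._clock()
        entry = self._entries.get(url)
        if entry is not None and now - entry[0] < self._ttl:
            # Cache hit: no network round trip, so latency is avoided.
            return entry[1]
        # Cache miss (never requested, or expired): go to the network.
        self.network_requests += 1
        page = self._fetch(url)
        self._entries[url] = (now, page)
        return page
```

In this sketch, the first request for any given page always incurs the full network latency, since the cache holds nothing that has not been requested before.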
Web browsers have recently introduced a non-standard “link prefetching” mechanism for speculatively retrieving resources in anticipation of future user requests. However, this form of pre-caching relies on the client to initiate the request and can only be performed for a constrained set of requests. For example, the client can only choose to pre-cache a fixed set of resources indicated in the response to the original request, and must decide without further assistance whether speculatively performing these requests is safe (requesting a resource may, for example, undesirably change server state).
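By way of illustration only, a client-side decision of the kind described above may be sketched as follows. The sketch assumes that prefetch hints appear in the response as `<link rel="prefetch">` elements, one commonly described form of link prefetching. The safety heuristic shown (same origin, no query string) is purely an assumption made for illustration, and the names `PrefetchHintParser` and `prefetch_candidates` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlsplit


class PrefetchHintParser(HTMLParser):
    """Collects href values of <link rel="prefetch"> elements (hypothetical helper)."""

    def __init__(self) -> None:
        super().__init__()
        self.hrefs: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr = dict(attrs)
        rel_tokens = (attr.get("rel") or "").lower().split()
        href = attr.get("href")
        if "prefetch" in rel_tokens and href:
            self.hrefs.append(href)


def prefetch_candidates(page_url: str, html: str) -> List[str]:
    """Return the hinted URLs that the client guesses are safe to prefetch.

    The client sees only the fixed set of hints in the response and has to
    guess at safety on its own. The guess used here is an assumption, not a
    guarantee: same origin as the page, and no query string.
    """
    parser = PrefetchHintParser()
    parser.feed(html)
    origin = urlsplit(page_url)
    safe: List[str] = []
    for href in parser.hrefs:
        target = urljoin(page_url, href)
        parts = urlsplit(target)
        same_origin = (parts.scheme, parts.netloc) == (origin.scheme, origin.netloc)
        if parts.scheme in ("http", "https") and same_origin and not parts.query:
            safe.append(target)
    return safe
```

A heuristic of this kind can err in both directions: it may skip resources that would have been harmless to retrieve, and it cannot rule out that a URL which passes the check nevertheless changes server state when requested.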