The Hypertext Transfer Protocol is a standard protocol of the World Wide Web and is referred to as HTTP by those skilled in the art of using the global communications system known as the Internet. HTTP is typically used for distributed information systems, where performance can be improved by the use of response caches. HTTP includes a number of elements intended to make caching work as well as possible. Because these elements are inextricable from other aspects of the protocol, and because they interact with each other, it is useful to describe the basic caching design of HTTP separately from the detailed descriptions of methods, headers, response codes, etc.
Caching would be useless if it did not significantly improve performance. The goal of caching in HTTP is to eliminate the need to send requests in many cases, and to eliminate the need to send full responses in many other cases. The former reduces the number of network round-trips required for many operations; an “expiration” mechanism is defined for this purpose. The latter reduces network bandwidth requirements; a “validation” mechanism is defined for this purpose.
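By way of illustration only, the validation mechanism described above may be sketched in Python as follows. The sketch models an origin server that compares a validator supplied by a cache against the current validator of the entity. The function names `make_validator` and `respond`, the use of a content hash as the validator, and the tuple return shape are assumptions introduced for this example and are not taken from the protocol itself.

```python
import hashlib
from typing import Optional, Tuple

def make_validator(body: bytes) -> str:
    """Hypothetical validator: a quoted, truncated hash of the entity body."""
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'

def respond(current_body: bytes,
            cached_validator: Optional[str]) -> Tuple[int, str, bytes]:
    """Return (status, validator, body) for a possibly conditional request.

    When the validator held by the cache matches the current validator,
    the full body is omitted (status 304). This saves bandwidth, but the
    request itself still had to be sent.
    """
    current = make_validator(current_body)
    if cached_validator is not None and cached_validator == current:
        return 304, current, b""
    return 200, current, current_body

# First request: no validator yet, so the full response is sent.
status, validator, body = respond(b"hello", None)
# Later request: the cache presents the validator it stored earlier.
status_again, _, body_again = respond(b"hello", validator)
```

In this sketch the second call returns an empty body because the entity has not changed; only the full-response transfer is avoided, not the request.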
The basic cache mechanisms in HTTP (server-specified expiration times and validators) are implicit directives to caches. In some cases, a server or client might need to provide explicit directives to the HTTP caches; the Cache-Control header is used for this purpose. The Cache-Control header allows a client or server to transmit a variety of directives in either requests or responses. These directives typically override the default caching algorithms.
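As a non-limiting example, a cache might read the directives carried in a Cache-Control header value roughly as sketched below. The function name `parse_cache_control` and the dictionary return shape are assumptions made for illustration, and the sketch deliberately ignores the case of a quoted argument that itself contains a comma.

```python
def parse_cache_control(value: str) -> dict:
    """Parse a Cache-Control header value into {directive: argument or None}.

    Simplification (assumption): directives are split on every comma, so a
    quoted argument containing a comma would be split incorrectly.
    """
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty list elements such as "a,,b"
        name, sep, arg = part.partition("=")
        # Directive names are case-insensitive, so normalise to lower case.
        directives[name.strip().lower()] = arg.strip().strip('"') if sep else None
    return directives

# Example header value combining a directive with an argument and one without.
directives = parse_cache_control("max-age=3600, no-transform")
```

A directive without an argument is mapped to `None`, so its presence can be tested with `"no-transform" in directives`.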
HTTP caching works best when caches can entirely avoid making requests to the origin server. The primary mechanism for avoiding requests is for an origin server to provide an explicit expiration time in the future, indicating that a response MAY be used to satisfy subsequent requests. In other words, a cache can return a fresh response without first contacting the origin server.
Server administrators may assign future explicit expiration times to responses in the belief that the entity is not likely to change, in a semantically significant way, before the expiration time is reached. This normally preserves semantic transparency, as long as the server's expiration times are carefully chosen.
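For illustration, the expiration check performed by a cache may be sketched as follows, assuming the expiration time arrives as an HTTP date string such as the value of an Expires header. The function name `is_fresh` and the choice to treat an unparseable date as already expired are assumptions of this sketch rather than requirements stated above.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def is_fresh(expires_header: str, now: datetime) -> bool:
    """Return True if the server-specified expiration time is still in the future."""
    try:
        expires = parsedate_to_datetime(expires_header)
    except (TypeError, ValueError):
        # Assumption: a date that cannot be parsed is treated as expired.
        return False
    if expires.tzinfo is None:
        # Assumption: a date without zone information is interpreted as UTC.
        expires = expires.replace(tzinfo=timezone.utc)
    return now < expires

# A cache holding a response that expires at 16:00 GMT, consulted at 15:00 GMT.
now = datetime(1994, 12, 1, 15, 0, 0, tzinfo=timezone.utc)
fresh = is_fresh("Thu, 01 Dec 1994 16:00:00 GMT", now)
```

While `is_fresh` returns true, the cache in this sketch may answer from its stored copy without any request to the origin server.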
When a hypertext document such as a web page is requested via the Hypertext Transfer Protocol (HTTP), a server 110 in FIG. 1 locates a file based on the requested Uniform Resource Locator (URL). This file may be a regular file or a program. In the second case, the server may (depending on its configuration) run the program and send its output as the requested page. A query string is the part of the URL that is passed to the program. Its use permits data to be passed from the HTTP client 190 (often a browser) to the program that generates the hypertext document.
A program receiving a query string can ignore part or all of it. If the requested URL corresponds to a file and not to a program, the whole query string is usually ignored.
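As an illustrative sketch, the separation of a requested URL into the path used to locate the file or program and the query string passed to a program may be expressed as follows. The function name `split_request_url` and the example URL are hypothetical and are used only for this example.

```python
from urllib.parse import urlsplit, parse_qs

def split_request_url(url: str):
    """Return (path, params) for a requested URL.

    The path is what the server uses to locate a file or program; params
    is the parsed query string, which a program may use or ignore.
    """
    parts = urlsplit(url)
    return parts.path, parse_qs(parts.query)

# Hypothetical request for a program that receives two query parameters.
path, params = split_request_url("http://example.com/cgi-bin/search?q=cache&page=2")
```

Each parameter name maps to a list of values because a name may appear more than once in a query string.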
The problem being solved herein is two-fold. First, the validation step still requires a round-trip over the network 180, and its latency reduces the performance observed by the user of the client 190 system. Second, many web server 110 administrators choose not to employ the expiration mechanism, for several reasons: primarily, that they lose control of an item once it has been cached and cannot cause it to be un-cached; and additionally, the time necessary to decide whether or not to cache an item and to determine an appropriate caching policy. These questions are hard to answer correctly, easy to answer incorrectly, and embarrassing to explain to senior management when errors or omissions cannot be fixed.