The Hypertext Transfer Protocol (HTTP) is an application protocol for distributed information systems and is the foundation of data communication for the World Wide Web. A client computer submits an HTTP request message to a server computer. The server, which stores content, provides resources such as HTML files, or performs other functions on behalf of the client, returns a response message to the client. A response contains completion status information about the request and may contain any content requested by the client in its message body. HTTP is designed to permit intermediate network elements, such as proxy servers, to improve or enable communications between clients and servers.
An intermediate server between the requesting client computer and the origin server may cache responses from the origin server and answer subsequent requests for the same content directly from its cache. A cache hierarchy is a collection of caching proxy servers organized in a logical parent/child arrangement so that caches closer to the origin server act as parents to caches closer to the client computer. For example, a request from a client computer to an origin server computer may go through a series of proxy servers arranged in a hierarchical manner. The first proxy server receiving the request searches its cache for the requested content. If the content is not found (termed a “cache miss”), the first proxy server requests the content from the next proxy server in the hierarchy, which in turn searches its own cache. If this “parent” locates the content (a “cache hit”), it returns the content to the “child” without passing the request further. The child, in turn, returns the content to the client computer.
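The parent/child lookup described above can be illustrated with a minimal in-memory sketch. The class name `CachingProxy`, the `origin_server` function, and the proxy names are hypothetical and chosen only for illustration; expiry, eviction, and real network transport are deliberately omitted.

```python
from typing import Callable, Dict, List, Optional


class CachingProxy:
    """One node in a hypothetical cache hierarchy (illustrative only)."""

    def __init__(self, name: str,
                 parent: Optional["CachingProxy"] = None,
                 origin: Optional[Callable[[str], str]] = None) -> None:
        self.name = name
        self.parent = parent          # next proxy closer to the origin server
        self.origin = origin          # set only on the top-most proxy
        self.cache: Dict[str, str] = {}
        self.log: List[str] = []      # records "hit" or "miss" per request

    def get(self, url: str) -> str:
        # Cache hit: answer directly without passing the request further.
        if url in self.cache:
            self.log.append("hit")
            return self.cache[url]
        # Cache miss: ask the parent, or the origin if there is no parent.
        self.log.append("miss")
        if self.parent is not None:
            content = self.parent.get(url)
        elif self.origin is not None:
            content = self.origin(url)
        else:
            raise LookupError(f"{self.name} has no parent or origin for {url}")
        self.cache[url] = content     # store for subsequent requests
        return content


origin_calls: List[str] = []


def origin_server(url: str) -> str:
    # Stand-in for the origin server; counts how often it is contacted.
    origin_calls.append(url)
    return f"<html>content of {url}</html>"


parent = CachingProxy("parent", origin=origin_server)
child_a = CachingProxy("child-a", parent=parent)
child_b = CachingProxy("child-b", parent=parent)

child_a.get("/index.html")   # miss at child-a, miss at parent, origin contacted
child_b.get("/index.html")   # miss at child-b, hit at parent, origin not contacted
```

In this sketch the second request never reaches the origin server, because the shared parent already holds the content fetched on behalf of the first child.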
When a cache has a stale entry that it would like to use as a response to a client's request, it first has to check with the origin server (or possibly an intermediate cache with a fresh response) to see if its cached (stale) entry is still usable. This is known as “validating” the cache entry. An entity tag, or ETag, is part of HTTP version 1.1 and later; earlier versions did not support ETags. ETags are one mechanism that HTTP provides for cache validation, and they allow a client to make conditional requests. Conditional requests make caches more efficient and save bandwidth, because a server does not need to send a full response if the content has not changed.
An ETag is an opaque identifier typically assigned by an origin server to a specific version of a resource found at a uniform resource locator (URL). “Opaque” denotes that the value has no meaning a client can interpret: the computer generating the ETag chooses how the value is formed. Consequently, another computer generating an ETag for the same version of the same resource would not necessarily produce the same ETag. If the resource content at the URL ever changes, a new and different ETag is assigned. Used in this manner, ETags can be quickly compared to determine whether two versions of a resource are the same or different. The use of the ETag header field in HTTP is optional.
In typical usage, when a client computer requests a resource, the server assigns an ETag to the resource and returns the resource along with the corresponding ETag value, which is placed in an HTTP “ETag” header field. The client computer may then cache the resource along with the corresponding ETag. Later, if the client computer requests the same resource, it sends the request together with the ETag, which is placed in an “If-None-Match” HTTP header field. On this subsequent request, the server may compare the client's ETag with the ETag for the current version of the resource. If the ETag values match, meaning that the resource has not changed, the server may send back a very short response with an HTTP 304 “Not Modified” status. This status tells the client computer that its cached version is current and should be used, saving the bandwidth that would otherwise be used to send the resource.
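The exchange described above can be sketched without a network as follows. The names `RESOURCES`, `make_etag`, `handle_get`, and `CachingClient` are hypothetical. Deriving the ETag from a content hash is an assumption made for illustration, since a server may form the value in any way it chooses. The comparison is also simplified to an exact string match against a single ETag.

```python
import hashlib
from typing import Dict, Tuple

# Hypothetical in-memory content of an origin server, keyed by URL path.
RESOURCES: Dict[str, bytes] = {"/page.html": b"<html>version 1</html>"}


def make_etag(body: bytes) -> str:
    # Assumption: the ETag is a quoted, truncated content hash.
    # Real servers are free to use any scheme they like.
    return '"' + hashlib.sha256(body).hexdigest()[:16] + '"'


def handle_get(url: str,
               request_headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Server side: return (status, response headers, body)."""
    body = RESOURCES[url]
    etag = make_etag(body)
    # Simplified validation: exact match of a single ETag value.
    if request_headers.get("If-None-Match") == etag:
        return 304, {"ETag": etag}, b""      # not modified, no body sent
    return 200, {"ETag": etag}, body         # full response


class CachingClient:
    """Client side: caches (ETag, body) per URL and revalidates on reuse."""

    def __init__(self) -> None:
        self.cache: Dict[str, Tuple[str, bytes]] = {}

    def fetch(self, url: str) -> Tuple[int, bytes]:
        headers: Dict[str, str] = {}
        cached = self.cache.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]   # send the stored ETag
        status, response_headers, body = handle_get(url, headers)
        if status == 304 and cached is not None:
            return status, cached[1]               # reuse the cached body
        self.cache[url] = (response_headers["ETag"], body)
        return status, body


client = CachingClient()
first = client.fetch("/page.html")    # nothing cached yet: full response
second = client.fetch("/page.html")   # ETag matches: short 304 response
RESOURCES["/page.html"] = b"<html>version 2</html>"
third = client.fetch("/page.html")    # content changed: full response again
```

Only the first and third requests transfer the resource body; the second transfers just the status and headers, which is the bandwidth saving the paragraph describes.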