To access a document (e.g., a webpage) on the Internet, a user must download the document from a document source to a client computer using a software application such as a web browser. A document source is typically a web host (sometimes called a web server) but can be a proxy server that prefetches the document from the web host. Upon receipt of a document request from a client, the proxy server first checks whether the requested document has been prefetched and stored in its own cache. If not, the proxy server fetches the requested document from the web host. Even if the requested document is found in the proxy server's cache, it may not be servable to the client if its content is no longer fresh. The freshness of a document's content is usually determined by an expiration timestamp value set by the content provider. If the current time is later than the expiration timestamp, the document's content is deemed stale; if the current time is earlier than the expiration timestamp, the document's content is deemed current, i.e., not stale.
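The conventional lookup described above can be illustrated with a minimal sketch. It is not taken from any particular proxy implementation: the names CachedDocument, is_fresh, serve, and fetch_from_host are hypothetical, the cache is modeled as a plain dictionary, and a document whose expiration timestamp equals the current time is assumed to be stale, a boundary case the description leaves open.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedDocument:
    content: str
    expires_at: float  # expiration timestamp set by the content provider (epoch seconds)


def is_fresh(doc: CachedDocument, now: float) -> bool:
    # Fresh only while the current time is earlier than the expiration timestamp.
    # Assumption: a time exactly equal to the timestamp counts as stale.
    return now < doc.expires_at


def serve(
    url: str,
    cache: Dict[str, CachedDocument],
    fetch_from_host: Callable[[str], CachedDocument],  # hypothetical download function
    now: float,
) -> str:
    doc = cache.get(url)
    if doc is not None and is_fresh(doc, now):
        # Cache hit with fresh content: serve without contacting the web host.
        return doc.content
    # Cache miss, or cached copy deemed stale: download from the web host again.
    doc = fetch_from_host(url)
    cache[url] = doc
    return doc.content
```

In this sketch, a stale copy always triggers a full download from the web host, whether or not the document's content has actually changed.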
However, a document's expiration timestamp is not always an accurate predictor of the freshness of the document's content. For various reasons, a document's content often remains unchanged long after its associated expiration timestamp has passed. As a result, a proxy server could waste resources downloading documents whose contents are identical to those already in the proxy server's cache. Moreover, refreshing a document whose content has not in fact changed may unnecessarily delay the rendering of the document by a requesting client.
In view of the foregoing, there is a need for new methods of determining the freshness of a cached document more accurately, thereby improving the performance of the proxy server as well as users' web browsing experience.