1. Field of Invention
The invention relates generally to memory systems. More particularly, methods and apparatus for managing the available memory space in a cache memory are disclosed.
2. Description of Relevant Art
The explosive growth in Internet traffic has made it critical to look for ways of accommodating the increasing number of users while preventing excessive delays, congestion and widespread blackouts. For the past several years, a large proportion of Internet traffic has been generated by web browsing, and most web browsing traffic is in turn composed of images. This trend will continue, with more sites becoming media rich as users' access bandwidths increase and media-enabled PCs become more common. This has led to a great deal of effort being devoted to studying web caching techniques.
One such web caching technique is referred to as proxy based caching. With proxy based caching, clients can designate a host to serve as a proxy for all or some of their requests (e.g., HTTP, FTP, etc.). The proxy acts as a cache by temporarily storing, on local disks or in memory, objects that were requested by the clients. It should be noted that in the context of the Internet, an object is a file, text, image, audio, executable program, and the like which is available to be downloaded by a client. When one of the clients sharing the proxy generates a request, the proxy searches its local storage for the requested object. If the object is available locally (a hit), it is sent to the client; otherwise (a miss), the request is passed on to the remote server or, if the proxy cache belongs to a hierarchy of caches, to another proxy server. Unfortunately, even a relatively small ratio of misses to hits can substantially degrade the performance of the browser, since each missed document must be retrieved from other proxies or from the remote servers.
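By way of illustration only, the hit and miss behavior described above may be sketched as follows. The class name ProxyCache, the fetch_upstream callable, and the hit and miss counters are hypothetical names introduced for this example; they are not taken from any particular proxy implementation, and the sketch omits expiry and storage limits.

```python
from typing import Callable, Dict


class ProxyCache:
    """Minimal proxy cache sketch: serve from local storage on a hit,
    otherwise forward the request upstream (hypothetical design)."""

    def __init__(self, fetch_upstream: Callable[[str], bytes]) -> None:
        # fetch_upstream is an assumed callable that contacts either the
        # remote server or the next proxy in a hierarchy of caches.
        self._fetch_upstream = fetch_upstream
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        if url in self._store:
            # Hit: the object is available locally.
            self.hits += 1
            return self._store[url]
        # Miss: pass the request on and keep a local copy of the response.
        self.misses += 1
        body = self._fetch_upstream(url)
        self._store[url] = body
        return body
```

Under these assumptions, a hierarchy of caches can be modeled by passing one cache's get method as the fetch_upstream of another, so that a miss in the child proxy is forwarded to the parent proxy before reaching the remote server.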
Another well known web caching technique used to improve performance is referred to as client side caching. Client side caching is the temporary storage of web objects (such as HTML documents) in local cache memory for later retrieval. Advantages of client side caching include: reduced bandwidth consumption, since fewer requests and responses need to travel over the network; reduced server load, since the server has fewer requests to handle; and reduced latency, since responses to cached requests are available immediately and closer to the client being served. In most cases, client side caching can be performed by the client application and is built into, for example, Netscape's Communicator, Microsoft's Internet Explorer, various Java browsers, as well as most other web browsers.
Since web browsers (and the computers that support them) have only finite disk space, they must eventually discard old copies of stored documents to make room for new requests. Most systems use a policy based on discarding the least recently requested document as being the least likely to be requested again in the future. Given sufficient disk space, documents are discarded around the time when they would, in any case, have been replaced by a more up-to-date version. However, if disk space is insufficient, then the cache may be forced to discard a current document and make an unnecessary connection to the source host when the document is next requested. The amount of disk space required depends on the number of users served and the number of documents requested. Ideally, a cache should have room to store every document which the users of the cache request more than once during the lifetime of the document. Such a cache would never retrieve a second copy of an unchanged document and would thereby generate the minimum possible network traffic. To achieve this in practice would, of course, involve storing every requested document locally, since there is no way to predict which documents will be re-read in the future.
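By way of illustration only, a policy of discarding the least recently requested document may be sketched as follows. The class name LRUDocumentCache, the capacity_bytes parameter, and the choice to measure space in bytes are assumptions made for this example and do not describe any particular browser's cache.

```python
from collections import OrderedDict
from typing import List, Optional


class LRUDocumentCache:
    """Sketch of a size-bounded cache that discards the least recently
    requested document first (hypothetical design)."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Ordered from least recently requested to most recently requested.
        self._docs: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, url: str) -> Optional[bytes]:
        doc = self._docs.get(url)
        if doc is not None:
            # A request makes the document the most recently used.
            self._docs.move_to_end(url)
        return doc

    def put(self, url: str, doc: bytes) -> List[str]:
        """Store a document and return the URLs discarded to make room."""
        if url in self._docs:
            # Replace an older copy of the same document.
            self.used_bytes -= len(self._docs.pop(url))
        if len(doc) > self.capacity_bytes:
            return []  # Too large to cache at all; nothing is discarded.
        evicted: List[str] = []
        while self.used_bytes + len(doc) > self.capacity_bytes:
            old_url, old_doc = self._docs.popitem(last=False)
            self.used_bytes -= len(old_doc)
            evicted.append(old_url)
        self._docs[url] = doc
        self.used_bytes += len(doc)
        return evicted
```

In this sketch, a document that was discarded while still current simply yields None on the next get, which corresponds to the unnecessary connection to the source host described above.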
Since storing every requested document for at least its lifetime is impractical, the number of documents which must be reclaimed (or garbage collected) from the cache memory is directly related to the available cache memory space. As is well known in the art, garbage collection is a process whereby inactive, or otherwise unnecessary, objects are identified, collected, and deactivated. If too many documents are garbage collected, or if frequently requested documents are garbage collected, system performance is degraded, since a document that is not stored in the local cache memory must be retrieved from the network. Conversely, if documents that are not frequently requested are not periodically garbage collected, then the limited memory space available in the local cache memory can be quickly saturated. When this occurs, no new documents can be stored and, again, system performance is degraded, since requested documents not stored in the local cache memory must again be retrieved from the network.
Therefore, what is desired is an efficient method and apparatus for intelligently purging documents from a local cache memory in a browser environment based upon the available cache memory and browser traffic.