Web caching is generally recognized as an important service for alleviating focused overloads that arise when certain web content data stored on a web server becomes popular. A user has a user internet protocol address (IPA) and selects a uniform resource locator (URL) identifying the sought-after web content data and the web server storing that data. The user makes use of the domain name system (DNS) and is provided with a DNS server IPA. The DNS cross-references a web server name, contained in a URL, to the corresponding destination web server IPA. The web server name and user IPA are transmitted to a DNS server at the DNS server IPA. The DNS server then returns to the user, at the user IPA, the destination IPA of the web server storing the sought-after URL web content data. The user then transmits the user IPA, the destination IPA, and the URL as a hypertext transfer protocol (HTTP) message into the internet, where the HTTP message is routed and forwarded through internet routers to the web server at the destination IPA. The web server then returns the requested URL web content data to the user at the user IPA. Web caching introduces a web content data cache store proximal to the user to reduce the retrieval latency of sought-after URL web content data.
A web caching system consists of one or more caches that store copies of web pages, images, and other web content data with the expectation that the stored copies will be repeatedly requested. A purpose of the web caching system is to reduce the number of requests received by the web server where the desired content data is located, while providing a faster web interaction experience for the user. The web caching system reacts and adapts to user browsing behavior. Hot spots develop from time to time when user browsing behavior creates network congestion in the internet topological vicinity of a particular web server and a sustained workload at that web server. The JPL Mars Pathfinder landing, the Starr Report, and downloads of the updated Netscape Communicator (a trademark of Netscape) and Internet Explorer (a trademark of Microsoft) browsers are several examples of activity that previously generated internet-wide hot spot events. A more recent phenomenon is the short-lived hot spot caused by traffic generated by portal sites, typically news and sports services, where the web content data is periodically changed and updated during the course of the day, causing users to periodically refresh their copy of the web content data.
Web caching systems may be designed as stand-alone or cooperative systems. The difference between these two types of caching systems is whether or not a cache interacts with another cache while processing a user's request. Each user request for web content data is identified using the URL. When a proximal stand-alone cache receives a user request, the proximal stand-alone cache checks whether or not the URL web content data is locally stored, either in the cache's memory or on its disk storage devices. If the URL web content data is locally stored by the stand-alone cache, the URL web content data is immediately sent back to the requesting user. Otherwise, the proximal stand-alone cache fetches the URL web content data directly from the designated web server.
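The stand-alone lookup described above can be sketched as follows. This is a minimal illustration only: the class name `StandAloneCache`, the `fetch_from_origin` callback, and the hit and miss counters are hypothetical, and keeping a copy after an origin fetch is an assumption drawn from the general description of caches storing copies of content.

```python
from typing import Callable, Dict


class StandAloneCache:
    """Hypothetical stand-alone cache: serve a local copy or fetch from the web server."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        # Local store keyed by URL (stands in for memory and disk storage).
        self._store: Dict[str, bytes] = {}
        # Callback that retrieves content from the designated web server.
        self._fetch_from_origin = fetch_from_origin
        self.hits = 0
        self.misses = 0

    def get(self, url: str) -> bytes:
        # Locally stored: return the copy immediately.
        if url in self._store:
            self.hits += 1
            return self._store[url]
        # Not stored: fetch directly from the web server named in the URL.
        self.misses += 1
        content = self._fetch_from_origin(url)
        # Assumption: keep the copy so repeated requests are served locally.
        self._store[url] = content
        return content
```

In this sketch, a second request for the same URL is answered from the local store without contacting the web server.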
A cooperative caching system, by contrast, is a system where web caches interact with each other in order to share stored web content data. When a proximal cooperative cache receives a user request, the proximal cache likewise first checks whether or not the URL web content data is locally stored. Again, if the proximal cooperative cache has stored the URL web content data, the URL web content data is sent to the requesting user. If the URL web content data is not locally stored but is available from a distal cooperative cache, the proximal cache sends the user request to the distal cooperative cache. Otherwise, the proximal cache fetches the URL web content data from the designated web server.
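The three-way decision of a cooperative cache (local copy, distal cache, web server) might look like the sketch below. The names `CooperativeCache`, `peers`, and `fetch_from_origin` are hypothetical, and returning a label for the source of the content is added purely for illustration. Whether the proximal cache keeps a copy of content obtained from a distal cache is not stated in the text, so this sketch does not store it.

```python
from typing import Callable, Dict, List, Tuple


class CooperativeCache:
    """Hypothetical cooperative cache that can consult other caches."""

    def __init__(self, name: str, fetch_from_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self.store: Dict[str, bytes] = {}
        self.peers: List["CooperativeCache"] = []  # distal cooperative caches
        self._fetch_from_origin = fetch_from_origin

    def get(self, url: str) -> Tuple[bytes, str]:
        # 1. Serve from the local store when possible.
        if url in self.store:
            return self.store[url], "local"
        # 2. Otherwise hand the request to a distal cache that holds the content.
        for peer in self.peers:
            if url in peer.store:
                return peer.store[url], "peer:" + peer.name
        # 3. Otherwise fetch from the designated web server and keep a copy
        #    (storing the copy is an assumption of this sketch).
        content = self._fetch_from_origin(url)
        self.store[url] = content
        return content, "origin"
```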
The Squid web caching system is an example of a cooperative web caching system. In the Squid system, caches are grouped together in hierarchical peer groups, where the peer groups have a parent and child relationship with each other. A proximal cache in the Squid web caching system first checks whether the user-requested URL web content data is stored locally. If the URL web content data is not locally stored by the proximal cache, the proximal cache sends the request to all caches in its peer group. If the proximal cache does not receive a reply from any cache in the peer group, the proximal cache sends the user request to a distal cache in the parent peer group. The process of checking whether the URL web content data is locally stored, querying the other caches in the peer group, and subsequently sending the user request to the next parent peer group repeats for as long as the URL web content data is not stored by the cache or by any of the caches in its peer group. The process stops when a root peer group is encountered, that is, a peer group that does not have a parent peer group. A cache in the root peer group also checks whether the requested URL web content data is locally stored, and if not, the cache in the root peer group fetches the requested URL web content data directly from the web server named in the URL. The URL web content data is stored by the root peer group cache and sent from the root peer group cache back to the cache that relayed the user request to the root peer group. The URL web content data is subsequently stored and propagated down through the caches in the peer groups through which the user request was relayed until the URL web content data reaches the proximal cache that originally received the request from the user, at which time the proximal cache sends the URL web content data to the user.
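The hierarchical lookup described above can be illustrated with a small simulation. This is a sketch of the process as described in this section, not Squid's actual implementation or configuration. The names `PeerGroup`, `HierCache`, and `fetch_from_origin` are hypothetical, picking the first cache of the parent group as the relay target is an arbitrary assumption, and storing a copy after a peer-group hit is likewise an assumption.

```python
from typing import Callable, Dict, List, Optional


class PeerGroup:
    """Hypothetical peer group; a root group has no parent."""

    def __init__(self, parent: Optional["PeerGroup"] = None) -> None:
        self.parent = parent
        self.caches: List["HierCache"] = []


class HierCache:
    """Hypothetical cache that belongs to exactly one peer group."""

    def __init__(self, name: str, group: PeerGroup,
                 fetch_from_origin: Callable[[str], bytes]) -> None:
        self.name = name
        self.group = group
        self.store: Dict[str, bytes] = {}
        self._fetch_from_origin = fetch_from_origin
        group.caches.append(self)

    def get(self, url: str) -> bytes:
        # Step 1: check the local store.
        if url in self.store:
            return self.store[url]
        # Step 2: query every other cache in the same peer group.
        for peer in self.group.caches:
            if peer is not self and url in peer.store:
                content = peer.store[url]
                break
        else:
            # Step 3: no peer replied.
            if self.group.parent is None:
                # Root peer group: fetch from the web server named in the URL.
                content = self._fetch_from_origin(url)
            else:
                # Relay to a cache in the parent peer group (assumption: the
                # first one; the parent group must contain at least one cache).
                content = self.group.parent.caches[0].get(url)
        # Store the content on the way back down toward the user.
        self.store[url] = content
        return content
```

With a root group, a middle group, and a leaf group, a first request at a leaf cache travels up to the root and is stored by each relaying cache on the way back, so later requests from caches in the same peer groups are answered without contacting the web server.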
A disadvantage of this cooperative caching system is that the caches forward user requests between peer groups only along the peer group hierarchy. These and other disadvantages are solved or reduced using the present invention.