The present invention relates to transmission of data in a network environment. More specifically, the present invention relates to methods and apparatus for improving the efficiency with which data are transmitted over the Internet. Still more specifically, the present invention provides techniques by which the efficacy of an Internet cache may be improved.
Generally speaking, when a client platform communicates with some remote server, whether via the Internet or an intranet, it crafts a data packet which defines a TCP connection between the two hosts, i.e., the client platform and the destination server. More specifically, the data packet has headers which include the destination IP address, the destination port, the source IP address, the source port, and the protocol type. The destination IP address might be the address of a well known World Wide Web (WWW) search engine such as, for example, Yahoo, in which case, the protocol would be TCP and the destination port would be port 80, a well known port for http and the WWW. The source IP address would, of course, be the IP address for the client platform and the source port would be one of the TCP ports selected by the client. These five pieces of information define the TCP connection.
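By way of illustration only, the five header fields described above can be modeled as a single immutable key. The following is a minimal sketch; the names `ConnectionKey` and `is_web_request` and the example addresses are hypothetical and do not appear in the foregoing description.

```python
from typing import NamedTuple


class ConnectionKey(NamedTuple):
    """The five header fields that together identify one TCP connection."""
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int
    protocol: str


def is_web_request(key: ConnectionKey) -> bool:
    """Return True for TCP traffic addressed to port 80, the well known HTTP port."""
    return key.protocol == "TCP" and key.dst_port == 80


# Hypothetical addresses taken from the ranges reserved for documentation.
example = ConnectionKey("192.0.2.10", 49152, "198.51.100.7", 80, "TCP")
```

Two keys that differ in any one field, for example only in the source port selected by the client, compare as unequal and therefore denote different connections.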
Given the increase of traffic on the World Wide Web and the growing bandwidth demands of ever more sophisticated multimedia content, there has been constant pressure to find more efficient ways to service data requests than opening direct TCP connections between a requesting client and the primary repository for the desired data. Interestingly, one technique for increasing the efficiency with which data requests are serviced came about as the result of the development of network firewalls in response to security concerns. In the early development of such security measures, proxy servers were employed as firewalls to protect networks and their client machines from corruption by undesirable content and unauthorized access from the outside world. Proxy servers were originally based on Unix machines because that was the prevalent technology at the time. This model was generalized with the advent of SOCKS which was essentially a daemon on a Unix machine. Software on a client platform on the network protected by the firewall was specially configured to communicate with the resident daemon which then made the connection to a destination platform at the client's request. The daemon then passed information back and forth between the client and destination platforms acting as an intermediary or "proxy."
Not only did this model provide the desired protection for the client's network, it gave the entire network the IP address of the proxy server, therefore simplifying the problem of addressing of data packets to an increasing number of users. Moreover, because of the storage capability of the proxy server, information retrieved from remote servers could be stored rather than simply passed through to the requesting platform. This storage capability was quickly recognized as a means by which access to the World Wide Web could be accelerated. That is, by storing frequently requested data, subsequent requests for the same data could be serviced without having to retrieve the requested data from its original remote source. Currently, most Internet service providers (ISPs) accelerate access to their web sites using proxy servers.
Unfortunately, interaction with such proxy servers is not transparent, requiring each end user to select the appropriate proxy configuration in his or her browser to allow the browser to communicate with the proxy server. For the large ISPs with millions of customers there is significant overhead associated with handling tech support calls from customers who have no idea what a proxy configuration is. Additional overhead is associated with the fact that different proxy configurations must be provided for different customer operating systems. The considerable economic expense represented by this overhead offsets the benefits derived from providing accelerated access to the World Wide Web. Another problem arises as the number of WWW users increases. That is, as the number of customers for each ISP increases, the number of proxy servers required to service the growing customer base also increases. This, in turn, presents the problem of allocating packet traffic among multiple proxy servers.
Another technique for increasing the efficiency with which data requests are serviced is described in commonly assigned, copending U.S. patent application Ser. No. 08/946,867 for METHOD AND APPARATUS FOR FACILITATING NETWORK DATA TRANSMISSIONS filed Oct. 8, 1997, the entirety of which is incorporated herein by reference for all purposes. The invention described in that copending application represents an improvement over the proxy server model which is transparent to end users, high performance, and fault tolerant. By altering the operating system code of an existing router, the router is enabled to redirect data traffic of a particular protocol intended for a specified port, e.g., TCP with port 80, to one or more caching engines connected to the router via an interface having sufficient bandwidth such as, for example, a 100baseT interface. If there are multiple caching engines connected to the cache-enabled router, the router selects from among the available caching engines for a particular request based on a simple algorithm according to which a particular group of addresses is associated with each caching engine.
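The association of address groups with caching engines described above may be sketched as follows. The referenced application's actual grouping algorithm is not reproduced here; as a stated assumption, this sketch groups destination addresses by their integer value modulo the number of engines. The function names and engine labels are hypothetical.

```python
import ipaddress
from typing import Sequence


def should_redirect(protocol: str, dst_port: int) -> bool:
    """Redirect only traffic of the configured protocol and port (here TCP, port 80)."""
    return protocol == "TCP" and dst_port == 80


def select_engine(dst_ip: str, engines: Sequence[str]) -> str:
    """Pick the caching engine associated with the group containing dst_ip.

    Assumption: the groups are formed by taking the destination address as an
    integer modulo the number of engines, so that a given destination is
    always handled by the same engine.
    """
    if not engines:
        raise ValueError("no caching engines available")
    address = int(ipaddress.ip_address(dst_ip))
    return engines[address % len(engines)]
```

Because the mapping depends only on the destination address, repeated requests for the same destination reach the same caching engine, which is what allows that engine's stored copies to be reused.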
The caching engine to which the request is re-routed "spoofs" the requested destination platform and accepts the request on its behalf via a standard TCP connection established by the cache-enabled router. If the requested information is already stored in the caching engine, i.e., a cache "hit" occurs, it is transmitted to the requesting platform with a header indicating its source as the destination platform. If the requested information is not in the caching engine, i.e., a cache "miss" occurs, the caching engine opens a direct TCP connection with the destination platform, downloads the information, stores it for future use, and transmits it to the requesting platform. All of this is transparent to the user at the requesting platform which operates exactly as if it were communicating with the destination platform. Thus, the need for configuring the requesting platform to suit a particular proxy configuration is eliminated along with the associated overhead. Moreover, traffic may be easily allocated among as many caching engines as become necessary. Thus, content caching provides a way to compensate for the bandwidth limitations discussed above.
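The hit and miss handling described above may be sketched as follows. This is a simplified illustration which omits the network layer entirely; the class `CachingEngine` and the `fetch_from_origin` callable, which stands in for the direct TCP connection to the destination platform, are hypothetical names.

```python
from typing import Callable, Dict


class CachingEngine:
    """Serves requests from local storage, falling back to the destination platform."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]) -> None:
        self._fetch = fetch_from_origin      # stand-in for the direct connection
        self._store: Dict[str, bytes] = {}   # cached objects keyed by URL
        self.hits = 0
        self.misses = 0

    def handle_request(self, url: str) -> bytes:
        if url in self._store:
            # Cache hit: answer on behalf of the destination platform.
            self.hits += 1
            return self._store[url]
        # Cache miss: download the object, store it for future use, return it.
        self.misses += 1
        body = self._fetch(url)
        self._store[url] = body
        return body
```

In either case the caller receives the same bytes, so the requesting platform cannot distinguish a hit from a miss except by response time.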
The success of content caching in compensating for bandwidth limitations corresponds directly to the efficiency with which the caching engines operate. The higher the cache hit rate, i.e., cache hits as a percentage of the total number of requests, the greater the bandwidth savings. For a typical caching engine, the cache hit rate is approximately 30 to 40%. This percentage includes cache misses for non-cacheable objects. This means that 60 to 70% of objects stored in caching engines are never used again. That is, 60 to 70% of the caching engine's storage is used to store objects which will never be requested again. In addition, because new objects are constantly replacing old objects, it is likely that some of the 30 to 40% of objects which are likely to be requested more than once are being overwritten by the objects which will never be requested again. It is therefore clear that the typical caching engine is working nowhere near the level of efficiency which is at least theoretically possible.
Techniques for improving caching efficiency are described in commonly assigned copending U.S. patent application Ser. No. 09/259,149 for METHODS AND APPARATUS FOR CACHING NETWORK TRAFFIC filed on Feb. 26, 1999, the entirety of which is incorporated herein by reference for all purposes. The invention described therein achieves improvements in caching efficiency by favoring the caching of objects which are statistically likely to be requested again. According to one embodiment, when a caching engine experiences an initial cache miss for a requested object, the object is retrieved and sent to the requesting host but the object is not cached. Instead, the caching engine makes an entry corresponding to the requested object in a table in which it tracks objects for which at least one cache miss has occurred. If another request for the object is received, the object is retrieved, sent to the requesting host, and, because an entry corresponding to the requested object exists in the table, the object is cached. In other words, an object is only cached if it has been requested at least twice. The idea is that if an object has been requested two or more times it is statistically more likely to be requested again than an object for which only one request has been received. It follows then that, because the cache is populated only by objects which are likely to be requested, cache efficiency is correspondingly improved.
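The cache-on-second-request policy described above may be sketched as follows. The class and attribute names are hypothetical, the `fetch` callable stands in for retrieval from the destination platform, and the sketch assumes that table entries are simply retained once made, a detail the foregoing description does not specify.

```python
from typing import Callable, Dict, Set


class SecondRequestCache:
    """Caches an object only after a miss has already been recorded for it."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._missed_once: Set[str] = set()   # table of objects with a prior miss
        self._store: Dict[str, bytes] = {}

    def is_cached(self, url: str) -> bool:
        return url in self._store

    def handle_request(self, url: str) -> bytes:
        if url in self._store:
            return self._store[url]
        body = self._fetch(url)               # every miss retrieves the object
        if url in self._missed_once:
            self._store[url] = body           # second request: now cache it
        else:
            self._missed_once.add(url)        # first request: only record it
        return body
```

Under this policy an object requested exactly once never occupies cache storage; only its table entry is kept.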
When a cache memory fills up, the cache must discard some stored objects to free up space for incoming objects. Existing approaches for evicting objects from a cache include, for example, simple first-in-first-out (FIFO), least-frequently-used (LFU), and least-recently-used (LRU) schemes. A more sophisticated algorithm for evicting objects from a network cache is described in commonly assigned, copending U.S. patent application Ser. No. 09/583,588 for METHODS AND APPARATUS FOR IMPROVING CONTENT QUALITY IN WEB CACHING SYSTEMS filed on May 31, 2000, the entirety of which is incorporated herein by reference for all purposes. The common thread among most such schemes is the attempt to maximize the number of access hits to objects already stored in the cache memory by tracking the number of access hits for each object in the cache.
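For reference, one of the existing eviction schemes named above, least-recently-used (LRU), may be sketched as follows. This is a generic illustration built on the Python standard library's `OrderedDict`; it is not taken from any of the referenced applications, and the class name `LRUCache` is hypothetical.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Fixed-capacity cache that discards the least recently used object."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)          # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        if key in self._items:
            self._items.move_to_end(key)
        self._items[key] = value
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)   # evict the least recently used
```

Note that the eviction decision here depends only on recency of access; it takes no account of how long the evicted object originally took to retrieve.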
Unfortunately, even if such object eviction schemes achieve the goal of maximizing the number of hits per object in a cache, this does not necessarily translate into an improved quality of service for particular end users. That is, as users navigate the Web, they download content from a wide variety of geographic locations via a wide variety of narrow and wide band connections. While it is clearly beneficial from the user's perspective to cache content from a geographically distant location which transmits data via a narrow band connection, there will likely be no perceptible improvement if content is cached from a nearby platform with which the user has a wide band connection. While the use of the cache in the former case is clearly indicated, its use in the latter would clearly be a waste of cache memory resources.
Therefore, despite the improvements represented by all of the techniques described above, and given the value of any improvement in the usage of network transmission resources, it is desirable to improve the efficacy with which network caching systems cache and discard data objects.
According to the present invention, techniques are provided which employ at least one metric relating to the time for accessing and/or downloading a data object either to determine when to discard the object from a network cache memory or to determine whether to cache the object in the first place. As mentioned above, other techniques track the number of times an object has been accessed in, for example, some given time period. According to various embodiments, the present invention uses at least one or various combinations of a plurality of metrics including, for example, the access or download time for the object, to develop a cost function for each object according to which the handling of the object may be determined. According to various embodiments, these metrics may also include the number of accesses for the object, the size of the object, or the bandwidth required to download the object.
According to a specific embodiment, the cost function is calculated for each object in a network cache memory, and the objects are sorted in order of their cost function value. As each new object arrives at the network cache it is stored according to its cost function value. When the cache memory becomes full, the objects with the least favorable cost function values are discarded to accommodate new objects. According to another specific embodiment, a cost function value is calculated for each potentially cacheable object to determine whether the object should be cached.
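The sorting and eviction described above may be sketched as follows. No particular cost function is prescribed by the foregoing description; as a loudly stated assumption, this sketch uses download time multiplied by request count and divided by object size, with higher values treated as more favorable. All names are hypothetical, and the sketch always admits a new object that fits, without comparing its value against the objects it displaces.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CachedObject:
    url: str
    size_bytes: int
    download_seconds: float
    requests: int = 1


def cost_value(obj: CachedObject) -> float:
    """Hypothetical cost function: download time saved per byte, weighted by requests."""
    return obj.download_seconds * obj.requests / obj.size_bytes


class CostAwareCache:
    def __init__(self, capacity_bytes: int) -> None:
        self._capacity = capacity_bytes
        self._objects: Dict[str, CachedObject] = {}

    def __contains__(self, url: str) -> bool:
        return url in self._objects

    def used_bytes(self) -> int:
        return sum(o.size_bytes for o in self._objects.values())

    def sorted_objects(self) -> List[CachedObject]:
        """Stored objects ordered from least to most favorable cost value."""
        return sorted(self._objects.values(), key=cost_value)

    def add(self, obj: CachedObject) -> List[str]:
        """Store obj, evicting least favorable objects as needed; return evicted URLs."""
        if obj.size_bytes <= 0:
            raise ValueError("size_bytes must be positive")
        evicted: List[str] = []
        if obj.size_bytes > self._capacity:
            return evicted                    # can never fit; not stored
        self._objects.pop(obj.url, None)      # replace any older copy
        for victim in self.sorted_objects():
            if self.used_bytes() + obj.size_bytes <= self._capacity:
                break
            del self._objects[victim.url]
            evicted.append(victim.url)
        self._objects[obj.url] = obj
        return evicted
```

With this assumed cost function, an object of a given size that took five seconds to download is retained in preference to an equally sized object that took a tenth of a second, which reflects the slow-source versus fast-source distinction drawn earlier.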
Thus, the present invention provides methods and apparatus for handling objects in a network cache. A cost function value is calculated for each of a plurality of data objects. The cost function value is determined with reference to at least one metric which relates to a total time required to download a corresponding one of the plurality of data objects and provides a relative measure of a cost of caching the corresponding object. Each of the plurality of data objects is handled according to its cost function value.
According to a specific embodiment, the at least one metric for each object includes at least one of an access time, a download time, an object size, and a number of object requests. According to another specific embodiment, handling each of the plurality of data objects according to its cost function value comprises determining whether each of the data objects is cacheable with reference to its cost function value. According to yet another specific embodiment, handling each of the plurality of data objects according to its cost function value comprises sorting a subset of the plurality of data objects stored in the network cache with reference to the corresponding cost function values thereby generating a sorted list. According to a more specific embodiment, selected ones of the subset of data objects are evicted from the network cache to accommodate storage of new data objects based on positions of the selected data objects in the sorted list.
A further understanding of the nature and advantages of the present invention may be realized by reference to the remaining portions of the specification and the drawings.