The Internet, and networks based on the Internet Protocol (IP) in general, have evolved from a simple information-sharing platform into a critical infrastructure for commerce, entertainment, education, government and many other aspects of personal and institutional life. The amount of traffic flowing through this infrastructure increases daily, and this trend is expected to continue. Consequently, network administrators and providers that must support the Internet infrastructure face continuing challenges occasioned by such growth.
Caching is a popular, and perhaps the only, technique that has been used to try to meet the need for improved user experiences in the face of increased network traffic over the Internet. In practice, two caching techniques are used: symmetric caching and asymmetric caching. In symmetric caching, traffic flowing through multiple network nodes is cached on caching devices deployed at those nodes. Network traffic between the nodes can then be reduced by transferring tokens representative of the cached contents between the caching devices, instead of the contents themselves. The technique is symmetric because it requires two nodes to cooperate with each other.
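The token exchange described above can be illustrated with a minimal sketch. It is not drawn from any particular product: the fixed chunk size, the use of SHA-256 digests as tokens, and the names `SymmetricCacheNode`, `encode`, `decode` and `wire_size` are all assumptions made for illustration.

```python
import hashlib

CHUNK_SIZE = 64  # assumed fixed chunk size, chosen only for illustration


class SymmetricCacheNode:
    """Hypothetical caching device; one instance sits at each cooperating node."""

    def __init__(self):
        self.store = {}  # token (digest) -> chunk bytes

    @staticmethod
    def _token(chunk: bytes) -> bytes:
        # Assumption: a SHA-256 digest serves as the token for a chunk.
        return hashlib.sha256(chunk).digest()

    def encode(self, data: bytes):
        """Sending side: replace chunks already seen with their tokens."""
        message = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            token = self._token(chunk)
            if token in self.store:
                message.append(("token", token))
            else:
                self.store[token] = chunk
                message.append(("raw", chunk))
        return message

    def decode(self, message) -> bytes:
        """Receiving side: expand tokens from the local cache, learn raw chunks."""
        parts = []
        for kind, payload in message:
            if kind == "raw":
                self.store[self._token(payload)] = payload
                parts.append(payload)
            else:
                parts.append(self.store[payload])
        return b"".join(parts)


def wire_size(message) -> int:
    """Bytes of payload that would cross the link for this message."""
    return sum(len(payload) for _, payload in message)
```

When the same data is sent a second time, every chunk is replaced by a 32-byte token, so fewer bytes cross the link. The sketch also shows why the scheme is symmetric: a token is meaningless unless the receiving node runs a compatible device holding the same chunks.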
In asymmetric caching, traffic flowing through a network node is analyzed, and that portion of the traffic that can be identified by well-known naming schemes, such as a Uniform Resource Identifier (URI) in Hypertext Transfer Protocol (HTTP) traffic or a filename in Common Internet File System (CIFS) traffic, is cached on a caching device installed on the node. The cached contents are then used to reduce traffic between the caching node and the content source node for the named contents (e.g., the origin server for a web page or the like). The traffic reduction is achieved by responding to requests for the named contents out of the cache, rather than passing on requests for copies of the original contents to the source node. Asymmetric caching is popular in application proxy gateways and is so named because it does not require participation from other nodes (such as the origin server, etc.). ProxySG™ from Blue Coat Systems, Inc., Squid from the open source community and ISA™ from Microsoft Corporation are representative examples of asymmetric caching implementations.
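A name-keyed cache of this kind can be sketched in a few lines. The sketch below does not reflect how any of the named products is implemented; the class `NameKeyedCache` and the `fetch_from_origin` callback are hypothetical, and real proxies also handle expiry, validation and eviction, which are omitted here.

```python
from typing import Callable, Dict


class NameKeyedCache:
    """Hypothetical single-node cache keyed by content name (e.g., a URI)."""

    def __init__(self, fetch_from_origin: Callable[[str], bytes]):
        # fetch_from_origin stands in for a request to the source node.
        self._fetch = fetch_from_origin
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, name: str) -> bytes:
        # Hit: answer out of the cache; the source node sees no traffic.
        if name in self._store:
            self.hits += 1
            return self._store[name]
        # Miss: pass the request on to the source node and keep a copy.
        self.misses += 1
        body = self._fetch(name)
        self._store[name] = body
        return body
```

Only the caching node participates: the origin is reached through an ordinary request and needs no knowledge of the cache.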
Existing symmetric and asymmetric caching techniques have technical and practical limitations. For example, it is not practical, and is often cost prohibitive, to deploy caching devices to cover all possible network paths. Furthermore, different techniques use different protocols between cooperating caching devices. Symmetric caching techniques are therefore only suitable within network environments where the presence of compatible caching devices can be assured, such as enterprise networks with widely distributed branch offices.
While asymmetric caching techniques can be employed more generally within heterogeneous network environments than can symmetric caching techniques, asymmetric caching is becoming increasingly less effective in reducing network traffic. In part, this is due to the evolving, dynamic nature of content naming schemes and of the content itself. Most asymmetric caching techniques use application-specific naming schemes as identifiers for the cached contents. These techniques showed good cache results (e.g., good cache hit rates) in the early days of the Web, when most content was static and fixed naming schemes were used to refer to that content. In today's Web, however, an ever increasing number of content naming schemes are used to refer to the same content, so using content names as content identifiers now often results in very poor cache hit rates. Further, as more and more content items are delivered based on combinations of fixed naming schemes and other, dynamic parameters, such as cookies, arguments and time codes, asymmetric caching installations that rely solely on fixed naming schemes will often deliver the wrong cached contents.
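Both failure modes can be reproduced with a toy example. Everything in it is an assumption made for illustration: the `origin` function stands in for a source node that ignores query arguments for static files and personalizes one page by cookie, and `UrlOnlyCache` stands in for a cache that uses the request name alone as the content identifier.

```python
def origin(url: str, cookie: str) -> str:
    """Hypothetical source node: '/home' depends on the cookie, other paths do not."""
    path = url.split("?", 1)[0]  # query arguments do not change the content
    if path == "/home":
        return f"home page for {cookie}"
    return f"static bytes of {path}"


class UrlOnlyCache:
    """Hypothetical cache that keys contents by the request URL alone."""

    def __init__(self):
        self.store = {}
        self.hits = 0
        self.misses = 0

    def get(self, url: str, cookie: str) -> str:
        if url in self.store:
            self.hits += 1
            return self.store[url]  # the cookie is never consulted
        self.misses += 1
        body = origin(url, cookie)
        self.store[url] = body
        return body


if __name__ == "__main__":
    cache = UrlOnlyCache()

    # Failure 1: two names for the same content produce two misses.
    cache.get("/logo.png?t=1", "alice")
    cache.get("/logo.png?t=2", "alice")
    print("hits:", cache.hits, "misses:", cache.misses)

    # Failure 2: one name for different content serves the wrong copy.
    cache.get("/home", "alice")
    print("served to bob:", cache.get("/home", "bob"))
```

In the first case the cache stores the same bytes twice and never scores a hit; in the second, the second user receives the page that was cached for the first.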