Internet Service Provider (ISP) networks currently carry a large amount of audio-visual media traffic. This traffic has a significant degree of duplication: a few video clips can account for a large proportion of all media traffic. Because of the high cost of network bandwidth, particularly at ISP peering points, the possibility of caching this traffic is very attractive. However, in spite of great industry interest, caching has so far not been widely deployed in this context. A major obstacle to deployment has been that caching web proxies today are typically not fully transparent. Even though a web proxy can often appear transparent to the client, in the sense that no explicit configuration is required, it is normally not transparent to the content provider (CP), because the proxy forwards client requests to the CP with its own IP address rather than the client's. Such a proxy is said to be transparent at the application level (the application being HTTP) but not at the network level (in this case TCP/IP). The lack of network transparency means that important information about the requesting client, such as its location, is hidden from the CP. In addition, requests that come from many clients can appear to the CP to be coming from a single source, namely the proxy, which makes it difficult for CPs to accurately determine the number of unique hits and in turn reduces the CP's web advertisement revenue.
For these and other related reasons, CPs are generally opposed to caching web proxies and either block requests coming from suspected proxies or make caching very difficult through techniques such as URL obfuscation. Therefore, ISPs considering deploying caching web proxies desire network (rather than only application) level transparency, which means, among other things, that requests passing through a proxy must preserve the client's source IP address. Preserving the client's address while intercepting the connection is called spoofing; more generally, IP spoofing means sending an IP packet with a source IP address other than the sender's own.
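The difference between application level and network level transparency can be illustrated with a minimal sketch. The model below is not taken from any particular proxy; the names Request, source_seen_by_cp and unique_sources, as well as the documentation-range IP addresses, are hypothetical and chosen only for illustration. It shows which source address the CP would observe with and without spoofing, and how that affects the number of distinct sources the CP can count.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A client request as it arrives at the proxy (hypothetical model)."""
    client_ip: str
    url: str


def source_seen_by_cp(request: Request, proxy_ip: str, spoof: bool) -> str:
    """Return the source IP address the CP observes for a proxied request.

    With spoofing, the proxy sends the request using the client's address;
    without spoofing, the CP sees the proxy's own address.
    """
    return request.client_ip if spoof else proxy_ip


def unique_sources(requests, proxy_ip: str, spoof: bool) -> int:
    """Count the distinct source addresses visible to the CP."""
    return len({source_seen_by_cp(r, proxy_ip, spoof) for r in requests})


# Illustrative addresses from the documentation ranges (assumed values).
PROXY_IP = "203.0.113.1"
requests = [
    Request("198.51.100.10", "/clip.mp4"),
    Request("198.51.100.11", "/clip.mp4"),
    Request("198.51.100.12", "/clip.mp4"),
]

print("without spoofing:", unique_sources(requests, PROXY_IP, spoof=False))
print("with spoofing:   ", unique_sources(requests, PROXY_IP, spoof=True))
```

Under these assumptions, three clients collapse into a single apparent source when the proxy uses its own address, whereas spoofing keeps each client individually visible to the CP.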
If a proxy is deployed in-line, or using a suitable redirection technology such as WCCP (Web Cache Communication Protocol), it is possible for that proxy to spoof the source IP address and maintain network level transparency to the CP. Indeed, many popular proxies support this type of deployment. However, a problem common to all current spoofed web caching solutions is that all web traffic, both from the client to the CP and returning from the CP back to the client, must pass through the web proxy. The reason is that spoofed traffic, even though it originates from the proxy, carries the client's source IP address. Therefore, returning response traffic from the CP is destined for the client rather than the proxy, and no network entity other than the proxy is aware of the spoofing. To support spoofing, all web traffic to and from the CP must be passed through the web proxy, including flows that the proxy decides not to cache. Because the fraction of web traffic that the proxy decides to cache can be relatively small, a significant portion of the proxy's processing resources can be wasted forwarding return packets that should have been routed directly to the user rather than redirected to the proxy.
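The cost described above can be made concrete with a small arithmetic sketch. The flow sizes below are invented for illustration and are not measurements; the names Flow and return_traffic_load are likewise hypothetical. The sketch simply splits the return traffic that a spoofing proxy must handle into the part belonging to flows it chose to cache and the part it merely forwards.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Flow:
    """One CP-to-client response flow (hypothetical model)."""
    response_bytes: int
    cached_by_proxy: bool  # True if the proxy decided to cache this flow


def return_traffic_load(flows):
    """Return (total, useful, wasted) return-path bytes seen by the proxy.

    In a spoofed deployment every return flow passes through the proxy,
    so 'total' is the sum over all flows. 'wasted' is the share belonging
    to flows the proxy only forwards without caching.
    """
    total = sum(f.response_bytes for f in flows)
    useful = sum(f.response_bytes for f in flows if f.cached_by_proxy)
    return total, useful, total - useful


# Assumed example: one cached flow and three uncached flows.
flows = [
    Flow(2_000_000, True),
    Flow(4_000_000, False),
    Flow(3_000_000, False),
    Flow(1_000_000, False),
]

total, useful, wasted = return_traffic_load(flows)
print(f"total={total} useful={useful} wasted={wasted}")
print(f"wasted share: {wasted / total:.0%}")
```

With these assumed numbers, most of the return bytes the proxy processes belong to flows it never caches, which is the inefficiency the paragraph above describes.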