The rapid increase in popularity of the World Wide Web (WWW or web) has led to a corresponding increase in Internet traffic. As a result, the web has become a primary bottleneck in network performance. When a user connected to a server over a slow network link requests documents or information, the latency at the user end can be noticeable. To avoid the long wait involved in "pulling" the requested documents, an alternative is to have the content provider "push" documents to users, based on pre-specified user preferences or profiles, as soon as relevant documents become available.
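The profile-based push just described can be illustrated with a minimal sketch. All names here (keyword-set profiles, `push_document`) are illustrative assumptions for exposition, not part of the original text: a profile is modeled simply as a set of keywords, and a new document is pushed to every user whose profile overlaps the document's keywords.

```python
# Hypothetical sketch of profile-based push. A user profile is
# modeled as a set of keywords; a document is pushed to each user
# whose profile shares at least one keyword with it.

def matches(profile, doc_keywords):
    """A profile matches if it shares at least one keyword with the document."""
    return bool(profile & doc_keywords)

def push_document(doc_keywords, profiles):
    """Return the users a newly available document would be pushed to."""
    return [user for user, profile in profiles.items()
            if matches(profile, doc_keywords)]

profiles = {
    "alice": {"networking", "caching"},
    "bob": {"databases"},
}
# A new document about caching and proxies is pushed only to alice.
print(push_document({"caching", "proxies"}, profiles))
```

Note how a loosely specified profile (many broad keywords) would match almost every document, which is exactly the over-pushing problem described next.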
The push alternative, however, tends to flood the network. This is often because users specify their preferences too broadly or imprecisely, so that too many documents are pushed to them.
Under the conventional "pull" approach, one way to reduce access latency is to cache copies of popular documents or information closer to the user, where access latencies are more acceptable. Caching can be implemented at various points in the network. For example, a large university or corporation may have its own local cache, from which all users subscribing to that network may fetch documents. In some cases, specialized servers called caching proxies, which act as agents on behalf of the client, are deployed in the network to locate a cached copy of a document. Typically, caching proxies serve as secondary or higher-level caches, because they are concerned only with cache misses from (primary) client caches. Client caches are typically part of the web browser and may store either documents accessed during the current invocation (a nonpersistent cache, such as the one implemented by Mosaic) or documents accessed across invocations.
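The client-cache/caching-proxy relationship can be sketched as two cache levels chained in front of an origin server. This is a minimal illustration under assumed names (`Cache`, `OriginServer`): a request first checks the client's primary cache, and only on a miss does it reach the proxy, which in turn consults its own store before fetching from the origin.

```python
# Minimal sketch of a client cache backed by a caching proxy.
# Each Cache forwards misses to the next level and keeps a copy
# of whatever it fetches.

class Cache:
    def __init__(self, origin):
        self.store = {}
        self.origin = origin          # next level to ask on a miss

    def get(self, url):
        if url in self.store:         # cache hit: serve locally
            return self.store[url]
        doc = self.origin.get(url)    # cache miss: ask the next level
        self.store[url] = doc         # keep a copy for later requests
        return doc

class OriginServer:
    def get(self, url):
        return f"contents of {url}"   # stand-in for a real fetch

origin = OriginServer()
proxy = Cache(origin)                 # shared (secondary) caching proxy
client = Cache(proxy)                 # per-browser (primary) client cache
print(client.get("http://example.com/a"))
```

After the first request, both the client cache and the proxy hold a copy, so later requests from this client are served locally and requests from other clients behind the same proxy avoid the origin entirely.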
Generally speaking, a hierarchy of proxies can be formed between the client and server(s). For example, a corporate network may have one or more of a project proxy, a departmental proxy, a divisional proxy, a site proxy, and so on. An Internet service provider can implement proxies for each neighborhood, each sub-region, each region, and so on. The client and/or proxies form a caching hierarchy. In a strict hierarchy, when a cache miss occurs, the client or proxy requests the missed object from the immediately higher level of the hierarchy through a caching proxy interface, such as the HTTP interface used in the CERN HTTP cache. More recently, in Harvest, "sibling" or "neighborhood" caches may be interrogated upon a cache miss (see C. M. Bowman et al., "Harvest: A Scalable, Customizable Discovery and Access System," Technical Report CU-CS-732-94, Department of Computer Science, University of Colorado, 1994). In either case, the caching decision is made at each local proxy independently of the objects cached in other proxies; that is, caching decisions are made solely as a function of the local cache contents and/or object characteristics.
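The Harvest-style miss handling described above can be sketched as follows. The class and method names are assumptions for illustration, not drawn from the Harvest system itself: on a local miss, a proxy first interrogates its sibling caches, and only when no sibling holds the object does the request go up to the parent level (or, at the top of the hierarchy, to the origin). Note that each proxy's decision to store a copy is made purely locally, matching the independence of caching decisions noted in the text.

```python
# Illustrative sketch of a Harvest-style caching hierarchy:
# check local store, then siblings, then the parent proxy.

class Proxy:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.siblings = []
        self.store = {}

    def lookup_local(self, url):
        """Answer only from this proxy's own cache (no forwarding)."""
        return self.store.get(url)

    def get(self, url):
        doc = self.lookup_local(url)
        if doc is not None:                    # local hit
            return doc
        for sib in self.siblings:              # interrogate siblings on a miss
            doc = sib.lookup_local(url)
            if doc is not None:
                break
        if doc is None and self.parent:        # escalate to the parent proxy
            doc = self.parent.get(url)
        if doc is None:                        # top of hierarchy: origin fetch
            doc = f"contents of {url}"
        self.store[url] = doc                  # purely local caching decision
        return doc
```

For example, with two departmental proxies sharing a site-level parent, a document already cached by one department is found by the sibling query and never travels up to the site proxy.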
Thus, there is a need for a push-based filtering method and system which exploits the proxy server hierarchy and which is based on the actual usage behavior of the viewers. Furthermore, there is a need for a system and method whereby staging decisions can be made based on the push-filtering decisions and the outcome of the push activities. There is also a need for a way to make the proxy hierarchy work more effectively by communicating or exchanging information among the proxy servers, the content servers and the clients. The present invention addresses the aforementioned needs.