1. Field of the Invention
The present invention pertains to data access network systems (e.g., Internet or intranet systems). More particularly, this invention relates to improving content consistency between a proxy server and a content server in a data access network system in a cost-effective manner and with minimal network data traffic.
2. Description of the Related Art
An example of a data access network system is the Internet or an intranet network system. An Internet/intranet network system typically includes a number of data service systems and Internet Service Provider (ISP) systems connected together via interconnect networks. The data service systems typically include web content servers that host content for various customers or applications. The customers own the content hosted in the data service systems, and subscribers or users access that content from their computer terminals through the ISP systems. The content owners are typically referred to as Content Providers. The data service systems may also be referred to as content servers. The content servers typically utilize Internet applications, such as electronic mail, bulletin boards, news groups, and World Wide Web access. The hosted content is arranged in the form of content sites within the content servers. Each site may include a number of pages (e.g., World Wide Web pages).
Users typically access the web pages from their terminals using HTTP (Hypertext Transfer Protocol), which is a request-and-response protocol. When a user at a terminal (e.g., a personal computer) designates a particular web page, at least one request is generated. The actual number of requests depends on the characteristics of the designated web page. A web page may include one or more “objects” or files. A multi-object page can be more aesthetically pleasing than a plain page, but each object requires a separate request by the browser and a separate response by a server.
The total time to download a Web page or other Internet document (e.g., an FTP file) depends on a number of factors, including the transmission speeds of communication links between a user terminal and a server on which the requested file is stored (i.e., content server), delays that are incurred at the server in accessing the document, and delays incurred at any intermediate device located between the user terminal and the content server, including the data access network. In addition, whenever a Web page or file is again requested by the same user terminal at a later time, the same download process may be repeated, which creates unnecessary and redundant network traffic in the data access network system.
To reduce delay and network traffic, proxy servers are provided in the intermediate devices between the user terminals and the content servers to temporarily cache Web page files. This prior art arrangement is shown in FIG. 1. An important benefit of employing the proxy server is the ability to cache objects received from the remote content servers. This allows the cached objects to be quickly retrieved and sent to the client device if objects are again requested. Some of the cached objects may be requested by the same or different client device at later times.
As can be seen from FIG. 1, when a user terminal 12 generates a request for a particular object (e.g., the object 10 stored in the remote server 18), the cache of the proxy server 16 in the local server 14 is searched to determine whether the object 10 is stored at the proxy server 16. If the object is not found in the cache of the proxy server 16, a “cache miss” results and the local server 14 directs the request to the remote server 18 via the Internet 20.
As can be seen from FIG. 1, the remote server 18 stores the requested object 10. Once the remote server 18 receives the request, it directs a response with the requested object 10 to the client device 12 via the local server 14. During this process, the requested object 10 is also cached in the proxy server 16 of the local server 14. This eliminates the need for the local server 14 to send another request to the remote server 18 for the same object 10 at a later time when either the same client device 12 or a different client device (not shown) requests the same object 10. When the object 10 is again requested, the proxy server 16 is accessed and a “cache hit” results. In this case, the cached object 10 is quickly forwarded to the client device directly from the proxy server 16. This eliminates delays encountered in communicating between the proxy server 16 and the remote server 18. By storing copies of objects received from remote sites, the proxy server 16 reduces the number of requests that are directed to the remote server 18. It also reduces the traffic on the Internet 20 that results from transmitting the responses as a number of packets that must be reassembled at the client device 12. Caching can delay the need to provide additional network resources, reduce peak demand on the network link from an ISP to the external Internet, and improve client response time. These factors lead to lower ongoing operating costs and increased user satisfaction.
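The cache-miss and cache-hit paths described above can be illustrated with a minimal Python sketch. It is an in-memory model only, not an implementation of the arrangement of FIG. 1: the class names `ContentServer` and `CachingProxy`, the key `"object10"`, and the request counter are assumptions introduced purely for illustration.

```python
class ContentServer:
    """Hypothetical stand-in for the remote server 18; counts requests it receives."""

    def __init__(self, objects):
        self.objects = dict(objects)
        self.requests = 0

    def fetch(self, name):
        self.requests += 1
        return self.objects[name]


class CachingProxy:
    """Hypothetical stand-in for the proxy server 16 with a simple dictionary cache."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}

    def get(self, name):
        # Cache hit: serve the stored copy without contacting the content server.
        if name in self.cache:
            return self.cache[name], "hit"
        # Cache miss: forward the request, then keep a copy for later requests.
        body = self.origin.fetch(name)
        self.cache[name] = body
        return body, "miss"


origin = ContentServer({"object10": "version 1"})
proxy = CachingProxy(origin)

first = proxy.get("object10")   # miss: forwarded to the content server
second = proxy.get("object10")  # hit: served from the proxy cache
print(first, second, origin.requests)
```

In this sketch the second request never reaches the content server, which is the source of the delay and traffic savings described above.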
However, disadvantages are associated with this prior art caching arrangement. One disadvantage is that the prior art caching arrangement does not ensure consistency between the content stored in the proxy server and the content stored in the content server. This means that if the content of an object or file stored in the content server is updated or otherwise changed, that change is not propagated to the proxy server that caches the same object. The proxy server has no way of knowing whether the content stored in the proxy server is consistent without querying the original content server. As a result, when the object is requested, the user receives the cached, un-updated object from the proxy server rather than the updated object from the remote content server.
One prior art solution to this problem is to have the proxy server check the remote content server every time the proxy server is accessed. By doing so, the proxy server can ensure that it serves consistent data to the users. This, however, comes at the cost of an additional round trip to the original content server for each request, which adds considerable delay to the servicing of the user requests. It also increases network traffic and the workload of the original content servers. This solution essentially defeats many of the benefits of providing the proxy servers.
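A rough Python sketch of this check-on-every-access approach shows where the cost arises. The version counter used as the consistency check, and the names `VersionedOrigin`, `ValidatingProxy`, and `fetch_if_changed`, are assumptions made for illustration; the text above does not specify how the proxy server performs the check.

```python
class VersionedOrigin:
    """Hypothetical content server that tags each object with a version number."""

    def __init__(self):
        self.store = {}        # name -> (version, body)
        self.round_trips = 0   # every contact from the proxy, with or without a body
        self.bodies_sent = 0   # contacts that actually transferred the object

    def put(self, name, body):
        version = self.store.get(name, (0, None))[0] + 1
        self.store[name] = (version, body)

    def fetch_if_changed(self, name, known_version=None):
        self.round_trips += 1
        version, body = self.store[name]
        if version == known_version:
            return version, None   # cached copy is still current
        self.bodies_sent += 1
        return version, body


class ValidatingProxy:
    """Hypothetical proxy that contacts the content server on every access."""

    def __init__(self, origin):
        self.origin = origin
        self.cache = {}  # name -> (version, body)

    def get(self, name):
        known = self.cache.get(name)
        known_version = known[0] if known else None
        version, body = self.origin.fetch_if_changed(name, known_version)
        if body is None:
            return known[1]        # validated: serve the cached copy
        self.cache[name] = (version, body)
        return body


origin = VersionedOrigin()
origin.put("page", "v1")
proxy = ValidatingProxy(origin)

results = [proxy.get("page") for _ in range(3)]
origin.put("page", "v2")           # content changes at the content server
results.append(proxy.get("page"))
print(results, origin.round_trips, origin.bodies_sent)
```

Under these assumptions the proxy always returns current content, but four user requests produce four round trips to the content server even though the object itself is transferred only twice.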
Another prior art solution to this problem is to cache an object in the proxy server only for a predetermined time period. Within that time period, the proxy server serves every request for that object locally from its cache without contacting the remote content server. After the time period has elapsed, the proxy server evicts the object from its cache. One disadvantage of this approach is that content consistency is not assured while the object is cached in the proxy server, because the object may be updated or changed at the content server during that time period. Another disadvantage is that the object may still be unchanged at the content server when the time period ends, yet it is evicted from the proxy server anyway. When the same object is again requested, it must be retrieved from the remote content server, which needlessly increases network traffic.
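Both disadvantages of the fixed-time-period approach can be seen in a small Python sketch. The class name `TtlProxy`, the 60-second period, and the injected clock are hypothetical choices for illustration. For simplicity, this sketch discards an expired object lazily on the next access rather than at the moment the period ends.

```python
class TtlProxy:
    """Hypothetical proxy that keeps each object for a fixed time period."""

    def __init__(self, fetch, ttl_seconds, clock):
        self.fetch = fetch     # callable that retrieves an object from the content server
        self.ttl = ttl_seconds
        self.clock = clock     # callable returning the current time in seconds
        self.cache = {}        # name -> (expires_at, body)

    def get(self, name):
        now = self.clock()
        entry = self.cache.get(name)
        if entry is not None and now < entry[0]:
            return entry[1]            # still within the period: serve locally
        self.cache.pop(name, None)     # period over: evict, changed or not
        body = self.fetch(name)
        self.cache[name] = (now + self.ttl, body)
        return body


content = {"page": "v1"}   # stands in for the content server's store
fetches = []               # records each retrieval from the content server


def fetch(name):
    fetches.append(name)
    return content[name]


now = [0.0]                # simulated clock, advanced by hand
proxy = TtlProxy(fetch, ttl_seconds=60, clock=lambda: now[0])

a = proxy.get("page")      # first request: retrieved from the content server
content["page"] = "v2"     # object is updated at the content server
now[0] = 30.0
b = proxy.get("page")      # within the period: the outdated copy is served
now[0] = 61.0
c = proxy.get("page")      # period elapsed: retrieved again, now current
now[0] = 122.0
d = proxy.get("page")      # period elapsed again: retrieved although unchanged
print(a, b, c, d, len(fetches))
```

In this sketch, request `b` returns outdated content during the time period, and request `d` triggers a full retrieval of an object that had not changed since the previous retrieval.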