The Internet, and in particular the World-Wide Web (“WWW”), is a large collection of computers operated under a client-server computer network model. In a client-server computer network, a client computer requests information from a server computer. In response to the request, the server computer provides the requested information to the client computer. Client computers are typically operated by individuals. Server computers are typically operated by large information providers, such as commercial organizations, government entities and universities.
To ensure the interoperability of the potentially different computers in a client-server computer network, various protocols are observed. For example, the Hypertext Transfer Protocol (“HTTP”) is used for transporting hypertext files over the Internet. In addition, the WWW observes a number of standards for organizing and presenting information, such as the Hypertext Markup Language (“HTML”) and the Extensible Markup Language (“XML”).
Various protocols, and in particular HTTP, support a feature known as “dynamically-generated customized pages.” A dynamically-generated customized page comprises a set of content (information) in a particular format. The same set of information can be presented in various ways, depending upon whether a particular format is supported by the requesting client computer. For example, a first client computer may support the ability to present information in columns, while a second client computer may instead support the ability to present information in the form of a table. A server computer receiving an information request from the first client computer will dynamically generate the requested content in a column format, while responding to a request from the second client computer with a dynamically-generated page that presents the requested content in the form of a table. Thus, two different customized pages can be created to represent the same content. Any protocol that exchanges client capability information can provide this functionality. Examples include 3G (third generation) wireless protocols, WAP (wireless application protocol) and FTP (file transfer protocol).
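By way of illustration only, the generation of two customized pages from the same content may be sketched as follows. The capability names "columns" and "table", the two rendering functions, and the sample content are assumptions introduced for this sketch and are not drawn from any particular protocol.

```python
# Illustrative only: the capability names and render functions are hypothetical.

def render_columns(rows):
    """Lay each record out as space-padded text columns."""
    widths = [max(len(str(cell)) for cell in col) for col in zip(*rows)]
    return "\n".join(
        "  ".join(str(cell).ljust(w) for cell, w in zip(row, widths)).rstrip()
        for row in rows
    )

def render_table(rows):
    """Lay the same records out as an HTML table."""
    body = "".join(
        "<tr>" + "".join(f"<td>{cell}</td>" for cell in row) + "</tr>"
        for row in rows
    )
    return f"<table>{body}</table>"

RENDERERS = {"columns": render_columns, "table": render_table}

def generate_page(rows, client_capability):
    """Pick the renderer that matches the capability reported by the client."""
    return RENDERERS[client_capability](rows)

content = [("Item", "Price"), ("Tea", "3"), ("Coffee", "4")]
page_for_first_client = generate_page(content, "columns")
page_for_second_client = generate_page(content, "table")
```

Both pages carry the same three records; only the format differs.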
Computer executable instructions are used to dynamically generate customized pages (content). U.S. Pat. No. 5,740,430, entitled “Method and Apparatus for Server Independent Caching of Dynamically-generated Customized Pages,” issued on Apr. 14, 1998, to Rosenberg, et al. (the “Caching Application”), discloses a method and apparatus to efficiently respond to a large number of requests for customized pages. In particular, the Caching Application discloses a method and apparatus for operating a client-server computer network such that a server computer dynamically generates and then stores customized pages requested from a client computer. Subsequent requests for previously generated customized pages are served from a cache in the server computer. Since previously generated customized pages need not be regenerated, computational overhead is reduced. The Caching Application is hereby incorporated by reference in its entirety.
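The general idea of generating a page once and serving later requests from a cache may be sketched as follows. This is a generic generate-on-miss cache written for illustration; it is not the method of the Caching Application, and the class name, the request key, and the generation counter are assumptions.

```python
class PageCache:
    """Generic generate-on-miss cache (hypothetical, for illustration)."""

    def __init__(self, generate):
        self._generate = generate   # function that builds a page for a key
        self._pages = {}            # key -> previously generated page
        self.generation_count = 0   # how often a page had to be generated

    def get(self, key):
        """Return the cached page, generating and storing it on a miss."""
        if key not in self._pages:
            self._pages[key] = self._generate(key)
            self.generation_count += 1
        return self._pages[key]

cache = PageCache(lambda key: f"<html>{key}</html>")
first = cache.get("home")    # miss: the page is generated and stored
second = cache.get("home")   # hit: the stored page is returned
```

In this sketch the second request does not invoke the generator, which is the source of the reduced computational overhead.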
The invention disclosed in the Caching Application is highly advantageous because it does not require the dynamic generation of a customized page every time a customized page is requested. However, it suffers from several deficiencies. One such deficiency relates to the deletion of previously cached customized pages in response to, for example, a change in the template from which a dynamically-generated customized page is generated.
Dynamically-generated pages (“DGPs”) comprise dynamically-generated content (“DGC”) that can be in the form of components. Content components on a page can be changed (updated) independently of one another, in groups, or as a whole. DGPs (and DGC) are generated from templates in a computer system (such as a server computer). The server computer includes stored computer executable instructions (“computer programs”) to dynamically generate the DGC (for purposes of this document, DGC will be used to include DGPs, and vice versa, unless otherwise stated). The templates from which the DGPs are generated can be, and frequently are, changed in response to the requirements of a given application. When a template is changed, the DGPs previously generated from the template and cached for rapid retrieval must be changed to accurately reflect the current template.
In addition to template changes, cache management such as this can be made necessary in response to other system events. For example, the content provided on a particular page may have changed or expired. Content can expire, for example, based on an arbitrarily set timer (e.g., each cached file is assigned a lifetime of a preset number of minutes). The cache management system might further incorporate a general update timer based on, for example, time of day or time since a last general update of all cached files. Any such event (including an explicit one-time call to update a cache) requiring an update of cached files can result in the need for cache management to keep content current. A page deletion or flushing, as defined below, will also require management of cached files.
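Timer-based expiration of cached content may be sketched as follows. The class name, the lifetime of 300 seconds, and the injectable clock are assumptions made so that the sketch is self-contained and testable; they do not describe any particular cache management system.

```python
import time

class ExpiringPageCache:
    """Cache in which every entry shares one preset lifetime (hypothetical)."""

    def __init__(self, lifetime_seconds, clock=time.monotonic):
        self._lifetime = lifetime_seconds
        self._clock = clock      # injectable so the timer can be simulated
        self._entries = {}       # key -> (page, time at which it was stored)

    def put(self, key, page):
        self._entries[key] = (page, self._clock())

    def get(self, key):
        """Return the page, or None if it is absent or its lifetime has elapsed."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        page, stored_at = entry
        if self._clock() - stored_at >= self._lifetime:
            del self._entries[key]   # expired content leaves the cache
            return None
        return page

# Simulated clock so that the example does not have to wait in real time.
now = [0.0]
timed_cache = ExpiringPageCache(lifetime_seconds=300, clock=lambda: now[0])
timed_cache.put("home", "<html>v1</html>")
```

Once the simulated clock reaches the lifetime, a lookup reports the entry as missing and the content must be obtained again.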
For example, when a template is changed in a client-server network implementing a method and system such as that disclosed in the Caching Application, the currently cached pages generated from the template are “flushed” from memory. When a page (or any file) is flushed, it is made inaccessible to the file system, but may remain available for backup purposes. This can be accomplished by simply renaming the file. Alternatively, the file can be deleted. A flushing operation thus can refer to an operation that reduces the content population in a cache.
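The two flushing alternatives described above, renaming and deleting, may be sketched for a file-based cache as follows. The ".flushed" suffix and the function names are assumptions chosen for illustration.

```python
from pathlib import Path
import tempfile

def flush_by_rename(path):
    """Hide a cached file under a new name so that it survives as a backup.

    The ".flushed" suffix is a hypothetical naming convention.
    """
    backup = path.with_name(path.name + ".flushed")
    path.rename(backup)
    return backup

def flush_by_delete(path):
    """Remove a cached file outright; no backup remains."""
    path.unlink()

with tempfile.TemporaryDirectory() as cache_dir:
    page = Path(cache_dir) / "home.html"
    page.write_text("<html>cached</html>")
    backup = flush_by_rename(page)
    renamed_page_visible = page.exists()    # False: lookups by name now miss
    backup_contents = backup.read_text()    # the content is still recoverable

    other = Path(cache_dir) / "news.html"
    other.write_text("<html>news</html>")
    flush_by_delete(other)
    deleted_page_visible = other.exists()   # False: the file is gone
```

In either case a subsequent lookup under the original name fails, which is what triggers regeneration.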
When a request for a flushed customized page is received from a client computer, the requested file (page) will not be available and the page must be regenerated from the new template, for example, according to the teachings of the Caching Application.
However, if the flushed page is a heavily requested page, a large number of simultaneous (concurrent) requests for the page may be received in the time period between the flushing operation and the regeneration of the DGP following the first client request for the page.
This situation can result in a load spike and overburdening of the server computer because each concurrent request for the DGP will result in a demand for the page to be regenerated. Server and network performance can be adversely affected by the increase in page regeneration requests. The ability of the server to respond to requests for other content, or to requests for the same page, will be disrupted. A ripple effect through the server and network can also occur because the sudden increase in content regeneration requests will reach back to the back-end processes and databases. A cached page, or series of pages, can also be explicitly flushed, resulting in the same decrease in performance.
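The size of such a load spike may be estimated with a simple model, sketched below. The model assumes that every request arriving before the first regeneration completes triggers its own regeneration, and that each regeneration takes the same fixed time; the function name and the sample arrival times are hypothetical.

```python
def count_regenerations(arrival_times, regeneration_seconds):
    """Count regenerations when every cache miss regenerates independently.

    Assumes a fixed regeneration time, so the regeneration started by the
    earliest request is the first to finish and repopulate the cache.
    """
    regenerations = 0
    page_ready_at = None
    for arrival in sorted(arrival_times):
        if page_ready_at is not None and arrival >= page_ready_at:
            continue                  # served from the repopulated cache
        regenerations += 1            # miss: this request regenerates the page
        if page_ready_at is None:
            page_ready_at = arrival + regeneration_seconds
    return regenerations

# Four requests arrive within the one-second regeneration window; a fifth
# arrives after the page has been cached again.
arrivals = [0.0, 0.1, 0.2, 0.3, 2.0]
spike = count_regenerations(arrivals, regeneration_seconds=1.0)   # 4
```

Under these assumptions the work done grows with the number of requests that fall inside the regeneration window, although a single regeneration would have sufficed.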
Another problem exists in that typically, a family of customized pages is generated from a single template. For example, different language specific versions of a customized page may have been cached, all generated from the same template. Additionally, different versions of a page can be generated based on, for example, browser capabilities. When a change is made to such a template, it has a ripple effect through the many customized page versions generated from the template.
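The relationship between one template and its family of cached versions may be sketched as follows. The index structure, the template name "product.tmpl", and the language and browser labels are assumptions introduced for illustration.

```python
from collections import defaultdict

class VariantIndex:
    """Records which cached page versions came from which template (hypothetical)."""

    def __init__(self):
        self._by_template = defaultdict(set)

    def record(self, template, language, browser):
        """Note that a version for this language and browser was cached."""
        self._by_template[template].add((language, browser))

    def affected_by(self, template):
        """List every cached version that a change to the template touches."""
        return sorted(self._by_template.get(template, ()))

index = VariantIndex()
index.record("product.tmpl", "en", "desktop")
index.record("product.tmpl", "en", "mobile")
index.record("product.tmpl", "de", "desktop")
index.record("about.tmpl", "en", "desktop")

stale_versions = index.affected_by("product.tmpl")
```

A single change to the hypothetical "product.tmpl" template makes three cached versions stale, while the version generated from "about.tmpl" is unaffected.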
However, it may be that not all affected pages need to be regenerated. For example, some pages may be requested so infrequently that it is computationally inefficient to regenerate these low-frequency pages at the same time as other, more frequently requested, versions of the customized page. Prior art methods for managing cached dynamically-generated customized pages do not provide for selective regeneration of pages. Lastly, simultaneous regeneration of all the selected page versions can have the same adverse effect on server and network performance as an equal number of client-initiated requests.
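The distinction between frequently and infrequently requested versions may be illustrated by the following sketch, which divides the versions of a page by request count. The threshold, the request counts, and the function name are hypothetical, and the sketch is not a description of any particular regeneration method.

```python
def split_by_demand(request_counts, min_requests):
    """Separate page versions worth regenerating now from those that can wait.

    request_counts maps a version label to its observed number of requests;
    min_requests is an assumed, arbitrary threshold.
    """
    frequent = sorted(v for v, n in request_counts.items() if n >= min_requests)
    infrequent = sorted(v for v, n in request_counts.items() if n < min_requests)
    return frequent, infrequent

observed = {"en/desktop": 5200, "en/mobile": 3100, "de/desktop": 40, "fi/mobile": 2}
frequent, infrequent = split_by_demand(observed, min_requests=100)
```

With these sample counts, two versions account for nearly all requests, so regenerating the other two at the same time would spend computation on pages that are rarely served.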