1. Field of the Invention
The present invention relates, generally, to content delivery systems, and, in preferred embodiments, to the automated construction of URL, cookie, and database query mappings in systems and methods for intelligent caching and refreshing of dynamically generated and static web content.
2. Description of the Related Art
The need to account for users' quality perceptions in designing Web servers for e-commerce systems has been well recognized, for the brand name of an e-commerce site is often associated with the type of experience users receive. Response time is a key point of differentiation among e-commerce Web sites. Snafus and slow-downs at major Web sites during special events or peak times demonstrate the difficulty of scaling up e-commerce sites. Such slow response times and down times can be devastating for e-commerce sites, as indicated in a recent study on the relationship between Web page download time and user abandonment rate. The study shows that only 2% of users will leave a Web site (i.e., the abandonment rate) if the download time is less than 7 seconds. However, the abandonment rate jumps to 30% if the download time is around 8 seconds. The abandonment rate goes up to 70% when the download time is around 12 seconds. This study clearly establishes the importance of fast response times to an e-commerce Web site to retain its customers.
In technical terms, ensuring the fast delivery of fresh dynamic content and engineering highly scalable e-commerce Web sites for special events or peak times put heavy pressure on IT staffs due to the complexity of current e-commerce applications. For many e-commerce applications, Web pages are created dynamically based on the current state of a business, such as product prices, inventory, and other information stored in database systems. This characteristic requires e-commerce Web sites to deploy cache servers, Web servers, application servers, and database systems at the backend. The roles played by these servers are illustrated in FIG. 1 and summarized as follows:
1. A database management system (DBMS) 10 or other external data sources 26 to store, maintain, and retrieve all necessary data and information to model a business.
2. An application server (AS) 12 that incorporates all the necessary rules and business logic to interpret the data and information stored in the database. AS 12 receives user requests 14 for HTML pages and cookies 40, and depending upon the nature of a request, may need to access the DBMS 10 or external data source 26 via queries 28 or external data requests 30 and retrieve database results 32 or file/network access results 34 to generate the dynamic components of the HTML page 22.
3. A Web server (WS) 16 which receives user requests 18 and cookies 36 from end users 20 and delivers the dynamically generated Web pages 24 back to the end users 20.
4. Cache servers (edge caches or frontend caches) (not shown in FIG. 1) to accelerate content delivery.
One possible solution to scale up database-driven e-commerce sites is to deploy network-wide caches so that a large fraction of requests can be served remotely rather than being served from the origin Web site. This solution has several advantages, including improved content delivery times and reduced traffic at the Web sites. Many content delivery network (CDN) vendors provide Web acceleration services, and studies have shown that CDNs can have a significant performance impact. However, for many e-commerce applications, HTML pages are created dynamically based on the current state of a business, such as product prices and inventory, rather than static information. As a result, the time to live (TTL) for these dynamic pages cannot be estimated in advance, and content delivery by most CDNs is typically limited to the handling of fairly static pages and streaming media rather than the full spectrum of dynamic content.
Because the application servers, databases, Web servers, and caches are independent components, there is no efficient mechanism to have database content changes reflected in the cached Web pages. To ensure the freshness of dynamic content in the caches, integration of the caches, Web servers, application servers, and back-end database systems is required. Ideally, when updates in the database are observed, the pages which are impacted by such changes should be identified and such pages in the cache should be invalidated or refreshed accordingly. However, the information required for such integration includes the knowledge of what database queries and/or other external/internal data source access were the result of a dynamic Web page request. In other words, a mapping between URL requests and queries is required, and this knowledge is missing in conventional dynamic content caching solutions. Thus, there is a need for the automated construction of URL, cookie, and database query mappings to enable the efficient invalidation or refreshing of cached web content.
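By way of illustration only, the role such a mapping plays in invalidation can be sketched as follows. The class name, method names, and the table-level matching rule are hypothetical assumptions introduced for this sketch and are not taken from any described embodiment; a practical system may match updates against queries at a finer granularity.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: all names and the table-level matching rule are assumptions.
class InvalidationSketch {
    // Page key (URL, optionally combined with cookie data) -> tables read by
    // the queries that were issued while generating that page.
    private final Map<String, Set<String>> pageToTables = new HashMap<>();

    // Record that generating the page identified by pageKey read from the given table.
    void recordMapping(String pageKey, String table) {
        pageToTables.computeIfAbsent(pageKey, k -> new HashSet<>()).add(table);
    }

    // Return, in sorted order, the cached pages that depend on the updated table.
    Set<String> pagesToInvalidate(String updatedTable) {
        Set<String> stale = new TreeSet<>();
        for (Map.Entry<String, Set<String>> entry : pageToTables.entrySet()) {
            if (entry.getValue().contains(updatedTable)) {
                stale.add(entry.getKey());
            }
        }
        return stale;
    }

    public static void main(String[] args) {
        InvalidationSketch sketch = new InvalidationSketch();
        sketch.recordMapping("/catalog?item=7", "PRICES");
        sketch.recordMapping("/catalog?item=7", "INVENTORY");
        sketch.recordMapping("/about", "STAFF");
        // Only the catalog page depends on PRICES, so only it is reported as stale.
        System.out.println(sketch.pagesToInvalidate("PRICES"));
    }
}
```

Without the recorded mapping, an update to the PRICES table in this sketch could not be traced to any particular cached page, which is the gap described above.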
Note that knowledge about dynamic content is distributed across three or more different servers, including the Web server, the application server, and the database management server. Consequently, it is not straightforward to create a mapping between the data and the corresponding Web pages automatically. Some approaches, for example, assume that such mappings are provided by system designers. In other systems, programmers must re-engineer the application server programs to use a set of specific APIs (Application Program Interfaces) to generate such mappings.
More recently, other systems and methods for the construction of URL, cookie, and database query mappings have been proposed. For example, in U.S. Utility Patent U.S. Pat. No. 6,591,266 entitled "System and Method for Intelligent Caching and Refresh of Dynamically Generated and Static Web Contents" ("the '208 application"), the embodiment of FIG. 3 in the '208 application does not require modification of application server program, but it does over-invalidate cached pages because it is not 100% accurate. The embodiments of FIGS. 5, 6, and 7 in the '208 application require modification of application server programs or database application programs to pass additional parameters, but they are 100% accurate. The values of the URL string and cookie must be explicitly passed to the JDBC (Java Database Connectivity) (i.e. the database connection API).
Therefore, it is an advantage of embodiments of the present invention to provide a system and method for the construction of URL, cookie, and database query mappings in which the users do not need to manually specify such mappings, as in other systems.
It is a further advantage of embodiments of the present invention to provide a system and method for the construction of URL, cookie, and database query mappings in which the existing application server programs do not need to be changed. Company Web sites may be reluctant to have their mission-critical application server programs modified by content delivery service providers. In addition, the source code may not be provided by the application server vendor.
It is a further advantage of embodiments of the present invention to provide a system and method for the construction of URL, cookie, and database query mappings which do not result in the over- or under-invalidation or refreshing of cached Web content.
It is a further advantage of embodiments of the present invention to provide a system and method for the construction of URL, cookie, and database query mappings which are automated, and wherein the mapping is "plug and play" compatible in the software of the Web site architecture, without requiring re-booting, recompiling, etc.
It is a further advantage of embodiments of the present invention to provide a system and method for the construction of URL, cookie, and database query mappings which can be selectively applied to one or more servlets in an application server, or enabled/disabled on the fly, without otherwise disrupting the application server.
It is a further advantage of embodiments of the present invention to provide a system and method for the construction of URL, cookie, and database query mappings in systems in which the application server employs a multiple-threaded or multi-tasking operating system.
These and other advantages are accomplished according to a method employed within a content delivery system comprising a Web server and application server, wherein the Web server is coupled for receiving a user information request destined for an original servlet in the Web server. The user information request may comprise a URL request. The servlets are capable of calling one or more database applications, which include databases and the databases of suitable DBMSs, or other programs or servlets. The method comprises redirecting the user information request to a wrapper servlet, which includes statements for extracting user information request identification information from the user information request, and assigning a job identification system variable containing the user information request identification information to the redirected user information request. The user information request identification information may include URL string and cookie information. The redirected user information request is then forwarded to the original servlet in the form of an HttpServletRequest.
The HttpServletRequest is then communicated to the application server, which issues at least one query destined for accessing a database application such as a database, an original database connectivity API in the DBMS, or other programs or servers. The at least one query includes the job identification system variable. The query is then redirected to a wrapper database connectivity API, which includes statements for recovering the user information request identification information from the at least one query and constructing a user information request identification information and query mapping. At any of the above-described steps, timestamp information may be captured for Web site profiling purposes.
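By way of illustration only, the wrapper servlet and wrapper database connectivity steps described above may be sketched as follows. This is a simplified model under stated assumptions rather than a definitive implementation: plain Java methods stand in for the servlet and database connectivity APIs, a per-thread variable (ThreadLocal) is assumed as the carrier of the job identification system variable instead of embedding it in the query, and all class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: plain methods model the wrapper servlet and the wrapper
// database connectivity API; no real servlet or JDBC classes are used.
class MappingSketch {
    // Assumption: a per-thread slot stands in for the job identification system variable.
    private static final ThreadLocal<String> JOB_ID = new ThreadLocal<>();

    // Collected mappings; each entry is {request identification, query}.
    static final List<String[]> MAPPINGS =
            Collections.synchronizedList(new ArrayList<>());

    // Wrapper servlet role: extract the URL string and cookie, record them as the
    // job identification, then forward to the unmodified original servlet.
    static void wrapperServlet(String url, String cookie, Runnable originalServlet) {
        JOB_ID.set(url + "|" + cookie);
        try {
            originalServlet.run();
        } finally {
            JOB_ID.remove(); // avoid leaking the identification to the next request
        }
    }

    // Wrapper database connectivity role: recover the request identification,
    // record the mapping, then delegate to the original query call.
    static String wrapperExecuteQuery(String sql) {
        String requestInfo = JOB_ID.get();
        if (requestInfo != null) {
            MAPPINGS.add(new String[] {requestInfo, sql});
        }
        return originalExecuteQuery(sql);
    }

    // Stand-in for the original database connectivity call.
    static String originalExecuteQuery(String sql) {
        return "rows for: " + sql;
    }

    public static void main(String[] args) {
        wrapperServlet("/catalog?item=7", "user=42",
                () -> wrapperExecuteQuery("SELECT price FROM prices WHERE item = 7"));
        for (String[] mapping : MAPPINGS) {
            System.out.println(mapping[0] + " -> " + mapping[1]);
        }
    }
}
```

In this sketch the original servlet and the original query call are left untouched; only the two wrappers know about the mapping. Because the assumed per-thread variable isolates each request, concurrent requests handled on different threads would not mix their identification information.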
These and other objects, features, and advantages of embodiments of the invention will be apparent to those skilled in the art from the following detailed description of embodiments of the invention, when read with the drawings and appended claims.