1. Technical Field
This invention relates to systems and methods for managing load levels of web server systems that generate and personalize web pages dynamically.
2. Description of the Related Art
The term “load” is commonly used to describe how much of a computing device's or system's resources is being used. These resources can include, for example, processing capacity, random access memory, incoming and outgoing communication bandwidth, and/or disk input/output (I/O) capacity. Operating systems commonly generate a number of different parameters indicative of the current load on a system.
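By way of illustration only, the following minimal sketch shows how one such operating-system parameter might be read and normalized. It assumes a Unix-like system on which Python's `os.getloadavg()` is available. The function names `normalized_load` and `sample_load` are hypothetical and are not drawn from any system described herein.

```python
import os
from typing import Optional


def normalized_load(load_average: float, cpu_count: int) -> float:
    """Return the load average divided by the number of CPUs.

    A value near or above 1.0 suggests the processors are fully utilized;
    this reading is a common rule of thumb, not a fixed definition.
    """
    if cpu_count <= 0:
        raise ValueError("cpu_count must be positive")
    return load_average / cpu_count


def sample_load() -> Optional[float]:
    """Sample the 1-minute load average, normalized per CPU.

    Returns None where the operating system does not expose a load average
    (for example, os.getloadavg is not provided on Windows).
    """
    getloadavg = getattr(os, "getloadavg", None)
    if getloadavg is None:
        return None
    try:
        one_minute, _five_minute, _fifteen_minute = getloadavg()
    except OSError:
        return None
    # os.cpu_count() may return None; fall back to 1 in that case.
    return normalized_load(one_minute, os.cpu_count() or 1)
```

The load average reflects processor demand only; memory, bandwidth, and disk I/O utilization would be measured by other parameters.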
A high load on a computing device typically means that some or all of the resources are being fully or almost fully utilized. A low load typically means that there are sufficient resources available to handle additional tasks. As the load on a computing device increases, performance in handling tasks generally suffers. When load exceeds certain critical levels, response times can degrade precipitously.
A web site is typically hosted on a server system which can include one or more computing devices. A low-traffic web site, for example, can typically be hosted on a single server computer. A very high-traffic web site, by contrast, will typically include multiple computing devices such as load balancing computers, web server computers, application server computers, and database server computers. The load on such a system can be specified in terms of the loads on the individual physical computing devices that make up the system.
The load on a web server system is affected by a number of factors, such as the number of web page requests being handled simultaneously, the rate at which new requests are being received, and the amount of processing and memory required to handle each request. In order to maintain acceptable user response times, well-maintained web sites have historically been hosted on systems that have sufficient excess capacity to handle peak loads. When the host system is lightly loaded, the excess capacity is unused.
In certain instances, the popularity of a web site increases unexpectedly, and the entity hosting the site does not have the ability (e.g., funds or time) to add server capacity to respond to the increasing loads. In these situations, the site's servers can become overloaded. As a result, wait times for requests can become unacceptable, and some requests may be dropped altogether without a response. When requests are dropped or when wait times become longer than several seconds, users' perceptions of a web site can be adversely affected.
In a paper titled “Reading Course Paper Overview of Internet QoS and Web Server QoS,” (Department of Computer Science, The University of Western Ontario, London, Ontario, Canada, Apr. 6, 2000), Nikolaos Vasiliou surveys several application level systems designed to handle peak server loads when serving page requests. The systems surveyed generally propose varying the priority with which requests are handled in order to guarantee reasonable response times for high priority requests. Most of the systems described in the paper prioritize requests based on factors such as how much a web hosting customer is paying for the hosting of a requested web page. One of the systems prioritizes requests based upon the identity of the user requesting the page. These systems, however, end up favoring the high priority requests at the expense of lower priority requests. As a result, when loads increase, lower priority requests are more likely to be delayed or dropped.
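The effect described above can be illustrated with a minimal sketch of a bounded priority queue. This is a generic illustration and does not reproduce any particular system surveyed in the paper; the class name `PriorityRequestQueue`, its methods, the numeric priority convention, and the drop policy are all assumptions made for the example.

```python
import heapq
from itertools import count


class PriorityRequestQueue:
    """Bounded queue that serves high-priority requests first.

    Lower priority numbers mean higher priority (an assumed convention).
    When the queue is over capacity, the lowest-priority pending request
    is dropped.
    """

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._heap = []        # entries: (priority, arrival_seq, request_id)
        self._seq = count()    # tie-breaker preserving arrival order
        self.dropped = []      # request ids dropped under load

    def submit(self, request_id: str, priority: int) -> None:
        heapq.heappush(self._heap, (priority, next(self._seq), request_id))
        if len(self._heap) > self.capacity:
            # Over capacity: evict the lowest-priority, latest-arrived entry.
            worst = max(self._heap)
            self._heap.remove(worst)
            heapq.heapify(self._heap)
            self.dropped.append(worst[2])

    def next_request(self):
        """Return the id of the next request to serve, or None if empty."""
        if not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


# Usage: with room for two pending requests, the low-priority request is
# the one dropped once a third request arrives.
queue = PriorityRequestQueue(capacity=2)
queue.submit("low-1", priority=5)
queue.submit("high-1", priority=1)
queue.submit("high-2", priority=1)
```

In this sketch the high-priority requests are served in arrival order, while the low-priority request never receives a response, which is the shortcoming noted above.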
Systems that address load problems solely by prioritizing some requests over others are unacceptable in certain contexts. For example, in many environments, long server response times and dropped page requests can result in a loss of customers. The present invention seeks to address this problem, among others.