The present invention relates to computers and, more particularly, to “host” computer networks that respond to requests from external “client” computers. A major objective of the present invention is to enhance the quality of service associated with a host site on the World Wide Web.
Much of modern progress is associated with the development of computers and the Internet, which permits computers to communicate worldwide. Several protocols exist by which a host site, which may comprise one or more “server” computers, receives and processes messages from a number of other computers (“clients”). For example, HTTP (HyperText Transfer Protocol) is a communications protocol used on the World Wide Web to enable users to navigate within and among host sites.
Messages can usually be grouped into sessions, with each session having one or more related messages. For example, a session can consist of a message requesting information over the World Wide Web, and an associated response. Alternatively, a multiple-message session can consist of a commercial transaction, with related messages respectively used to locate within a web site a precise product, submit an order or billing and shipping information, and convey a confirmation of sale to a particular client.  Whether a host is to process just a single message or a series of related messages, it is usually important to quickly, accurately, and completely service each message and each session.
The term “quality of service” refers to a host's ability to respond quickly to individual messages and to complete sessions. As a particular host becomes more popular, and therefore receives more messages, the host's processing resources can become stretched. For example, due to heavy traffic, a host may not be able to respond to a message at all, or the host may not provide a timely response (which can cause a client to “time-out” with an error or the impatient user to resend the message). Poor quality of service can have significant impact, as users may become frustrated and simply give up trying to reach a particular host, and the sponsor of the host may lose sales or fail to communicate needed information to some clients.
Quality of service can be improved by adding processing capacity and by implementing admissions control. Different approaches to increasing capacity are discussed further below. In many cases, it is cost effective to provide sufficient capacity to handle all messages most of the time, while relying on admissions control for peak demand situations. Even where additional hardware resources can be added on demand, peak usage can occur too suddenly for the additional capacity to be operational; the time from the spike being identified to the time additional capacity is available can be hours to days for manual operations and many minutes to hours for automatic or semi-automatic operations. During this time, admission control is often the best alternative to handling an immediate spike in usage, especially when the spike is short lived and far higher than the site's demand growth trend. 
Admissions control involves selectively admitting and rejecting messages so that the messages that are processed are handled promptly and reliably. Intelligent admissions control can prioritize messages to advance the purposes of the host site. For example, messages associated with ongoing sessions can be given priority. U.S. Pat. No. 6,006,269 to Phaal discloses a system in which admission determinations are made on a session basis and in which non-admitted messages are deferred with a higher priority level rather than rejected outright. Thus, opportunities associated with non-admitted messages are not necessarily lost.
Preferably, the admissions decisions are based on utilization data gathered by the host site. All messages can be admitted when resource utilization is low. As utilization increases to potentially problematic levels, admissions can become increasingly selective. However, since there is a potential detriment to the site whenever admissions control rejects or defers a message, admissions control is not desirable on a steady-state basis. It is preferable that host-site capacity be able to process all messages most of the time.
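By way of illustration only, utilization-based admissions control that favors ongoing sessions might be sketched as follows. The function name `admit`, the two thresholds, and the three-band policy are assumptions made for this sketch; they are not taken from the Phaal patent or from any particular host site.

```python
def admit(utilization, in_session, low=0.6, high=0.9):
    """Decide whether to admit a message (hypothetical policy).

    utilization: measured resource utilization, from 0.0 to 1.0.
    in_session:  True if the message belongs to an ongoing session.
    low, high:   assumed thresholds chosen only for illustration.
    """
    if utilization < low:
        return True        # light load: admit all messages
    if utilization < high:
        return in_session  # heavier load: only ongoing sessions
    return False           # severe load: reject or defer everything


print(admit(0.3, in_session=False))
print(admit(0.7, in_session=False))
print(admit(0.7, in_session=True))
```

In this sketch, selectivity increases in steps as utilization rises; a practical policy could instead admit a shrinking fraction of new sessions.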
In general, quality of service can be improved by adding processing power to the host site. For example, in a single-server site, a less powerful server can be replaced with a more powerful server. An advantage of the single-server replacement approach is simplicity. A single computer handles all site functions. Programs running on that computer can handle messages, monitor resources, and administer admissions control.
On the other hand, there are limits to the single-server approach. At the high end, where the limits of available technology are pushed, fractional increases in power are quite costly. Furthermore, when the single server is down, the entire site is down.  Also, a replacement strategy can be wasteful if the replaced server is no longer used. In practice, most large sites use multiple servers.
Similarly configured servers can be arranged in parallel. Servers can be added in parallel as needed to increase capacity. A load-balancing mechanism can be added at the front end to distribute messages among the parallel servers. However, the parallel servers need to be coordinated to ensure session integrity. Also, while admissions control can be done independently by each server, this results in messages being rejected by one server while there is ample capacity on another. It is not practical for the parallel servers to communicate with each other regarding each client request. While the coordination can, in principle, be performed at the load balancer, imposing additional processing requirements on the common node for all messages can result in an unacceptable performance bottleneck.
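The drawback of independent per-server admissions control can be illustrated with a small sketch. The function names, the load counts, and the single shared capacity limit are hypothetical and serve only to show the effect described above.

```python
def independent_admission(loads, capacity, target):
    """Each server decides alone: admit only if the targeted server has room."""
    return loads[target] < capacity


def coordinated_admission(loads, capacity):
    """With a global view: return the index of the least-loaded server
    that has room, or None if every server is at capacity."""
    best = min(range(len(loads)), key=loads.__getitem__)
    return best if loads[best] < capacity else None


loads = [10, 2]  # assumed active-message counts on two parallel servers
print(independent_admission(loads, capacity=10, target=0))
print(coordinated_admission(loads, capacity=10))
```

Here the first server rejects a message even though the second has ample capacity; the coordinated variant avoids this, but only by assuming the global view that the text notes is costly to maintain.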
A tiered host site overcomes many of the problems facing a multi-server host site. A typical tiered host site has a client-response tier (e.g., a web tier using HTTP) and an application tier. (Conventionally, the client computers constitute a tier; thus, a two-tier host site would imply a three-tier system.) The application tier performs most of the resource-intensive work regarding the purpose of the site, while the client-response tier typically provides the client interface. For example, the client-response tier can provide a web interface for a client, while the application tier handles e-commerce (electronic commerce) applications and maintains a database accessed by the applications. The application tier can manage sessions, tagging responses so that subsequent messages in the session can be identified as such. The client-response tier can use the session tags to generate cookies and/or links associated with session-specific URLs to direct subsequent client requests to the proper application server.
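As a minimal sketch of the session tagging just described, the following assumes that the application tier assigns each new session a tag and a server, and that the client-response tier returns the tag to the client (for example, in a cookie) and uses it to route later requests. The class `ApplicationTier`, the function `route`, the tag format, and the round-robin assignment are all hypothetical.

```python
import itertools


class ApplicationTier:
    """Hypothetical application tier that tags sessions."""

    def __init__(self, servers):
        self._next_server = itertools.cycle(servers)  # round-robin (assumed)
        self._counter = itertools.count(1)
        self.sessions = {}  # session tag -> application server

    def start_session(self):
        tag = f"s{next(self._counter)}"
        self.sessions[tag] = next(self._next_server)
        return tag


def route(app_tier, cookie_tag=None):
    """Client-response tier: return (tag, server) for a request.

    A request carrying a known tag goes to the server that owns the
    session; any other request starts a new session.
    """
    if cookie_tag in app_tier.sessions:
        return cookie_tag, app_tier.sessions[cookie_tag]
    tag = app_tier.start_session()
    return tag, app_tier.sessions[tag]


app = ApplicationTier(["app-a", "app-b"])
tag, server = route(app)           # first request: new session
print(route(app, tag) == (tag, server))
```

Because the tag alone identifies the owning server, the client-response servers need not share any other session state in this sketch.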
Each tier can be expanded independently as required. Since the client-response tier manages the distribution of messages to the application tier, application tier servers can be differentiated by function. Capacity can be increased by activating (adding, turning on, or re-allocating) a server dedicated to the stressed function so that the additional resources are not wasted on underutilized applications.
The client-response tier can utilize parallel, similarly configured servers. Coordination among the parallel servers is facilitated by the session tags added by the application tier. Expansion of the client-response tier's capacity is then readily achieved by adding servers in parallel. Alternatively, the client-response tier can be configured as a load-balancing hub supported by other servers dedicated to specific client-response functions, such as encryption and decryption. While the hub approach does place additional burdens on the common-node load balancer, session tracking is still managed primarily at the application tier. A hubbed client-response tier can be expanded efficiently by adding servers dedicated to a specific function that is overutilized.
In a tiered site, the admissions control function is typically assigned to the client-response tier since it provides the front-end interface to the client computer. The admissions control function can monitor local resource utilization effectively on a per-server basis. The admissions control function on the client-response tier can also monitor response times associated with requests to the application tier as a measure of the application tier's resource utilization. The resource utilization information about the client-response tier and the application tier is then used to determine the admissions control policy at any given time.
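The use of application-tier response times as a proxy for utilization might be sketched as follows. The class name, the sliding-window size, the saturation time, and the rule of taking the larger of the two estimates are assumptions for illustration and are not drawn from any described system.

```python
from collections import deque


class ResponseTimeMonitor:
    """Estimate application-tier utilization from recent response times."""

    def __init__(self, window=5, saturated_seconds=2.0):
        self._samples = deque(maxlen=window)  # keeps only the newest samples
        self._saturated = saturated_seconds   # assumed "fully loaded" time

    def record(self, seconds):
        self._samples.append(seconds)

    def utilization(self):
        if not self._samples:
            return 0.0
        mean = sum(self._samples) / len(self._samples)
        return min(1.0, mean / self._saturated)


def combined_utilization(local_utilization, monitor):
    """Let the more heavily loaded tier drive the admissions policy."""
    return max(local_utilization, monitor.utilization())


monitor = ResponseTimeMonitor(window=2, saturated_seconds=2.0)
monitor.record(1.0)
monitor.record(1.0)
print(combined_utilization(0.3, monitor))
```

The combined figure could then be compared against whatever thresholds the admissions policy uses; note that response time is only an indirect measure and can lag actual load.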
While the tiered host site approach provides for efficient scaling and for effective admissions control, there is an insatiable demand for better performance. In particular, there is a demand for better admissions control, since it is a software component that can, in principle, be upgraded less expensively than the host site hardware. What is needed is more effective admissions control for a tiered host site.