The invention relates to network servers. More specifically, the invention relates to methods by which a network server processes client requests.
The World Wide Web is experiencing phenomenal growth. Many companies are deploying web servers, with some web sites becoming extremely popular. Many web sites now offer web content that is quite diverse, ranging from news to entertainment to advertising to electronic commerce. No end to this growth appears to be in sight.
A web site may be accessed by a “client” such as a personal computer running a web browser program that is capable of connecting to the Internet. Referring to FIG. 1, a user enters a Uniform Resource Locator (“URL”) of the web site into the client (10). As a result, the client attempts to make a connection with the server (12 and 14) and, if successful, sends a client request (e.g. for web content) to the server (16 and 18). Under ideal conditions, the client receives a response back from the server and displays the requested web content to the user (20 and 22).
A popular web site might receive large bursts of client requests at any given time. A high-performance, high-capacity HTTP (a.k.a. “web”) server is typically configured to process a limited number of these requests concurrently. Any additional pending requests (and their associated connection information) are temporarily buffered by the server in its “listen queue”. The capacity of the listen queue is often set to a large value to accommodate bursts of traffic and to accept as many client requests as possible. In servicing client requests, web applications running on the web server might provide both static and dynamic content, performing complex database queries and other data manipulation. These operations can lead to a large variance in request service times. Congested and overloaded Internet routes only add to this variance: the download time for a given document can range from 5% to 500% of its typical latency.
Long delays typically cause users to cancel and possibly resubmit their requests. Such user behavior is also illustrated in FIG. 1. If a user experiences a long response delay after sending a request to a web server, that user might exhibit one of the following behaviors: “patient behavior” whereby the user waits patiently for the response no matter how long it takes; “anxious behavior” whereby the user is anxious to receive the response and clicks the browser “stop” button (24 or 26), followed by the browser “reload” button to resend the request (10 and 12); and “impatient behavior” whereby the user is not tolerant of the delay, clicks the browser “stop” button (28 or 30) and leaves the site (32), losing interest in the content because it took too long to receive.
Since timed-out requests are not removed from the listen queues of current web servers, their processing could lead to a substantial expenditure of server resources. Although patient behavior is the most desirable behavior from the standpoint of preserving web server efficiency, it is not the most typical client behavior. More often, clients exhibit the anxious or impatient behaviors. Consequently, an overloaded web server could end up processing a lot of “dead”, timed-out requests. While the web server is processing these dead requests, it is expending its resources on useless work instead of devoting its resources towards “still-vital” requests.
Those still-vital requests at the end of the listen queue encounter ever-longer delays that exceed the patience threshold of the client making the request. The server would be “busily” processing only dead requests and would not be doing any useful work. This could create a snowball effect in which all requests time out before being serviced. This pathological system state will be referred to as “request-timeout livelock.” Request-timeout livelock is more likely to occur in listen queues having large capacities.
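The snowball effect described above can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not a model of any particular server: the function name `simulate`, the single-worker first-in-first-out discipline, the tick-based clock, and all parameter values are hypothetical and chosen only for illustration. Two requests arrive per tick, the server completes one request per tick, and a client abandons its request after waiting more than ten ticks.

```python
from collections import deque


def simulate(queue_capacity, arrivals_per_tick, service_ticks,
             patience_ticks, total_ticks):
    """Single-worker FIFO server (hypothetical model).

    Returns (useful, wasted): requests serviced while the client was
    still waiting, and requests serviced after the client had given up.
    """
    queue = deque()      # each entry is the arrival tick of a pending request
    useful = wasted = 0
    busy_until = 0       # tick at which the worker becomes free again

    for now in range(total_ticks):
        # New arrivals are buffered only while the listen queue has room.
        for _ in range(arrivals_per_tick):
            if len(queue) < queue_capacity:
                queue.append(now)

        # When the worker is free it takes the oldest pending request.
        if now >= busy_until and queue:
            arrived = queue.popleft()
            busy_until = now + service_ticks
            # The server cannot tell that the request is dead, so it
            # performs the full work either way.
            if now - arrived > patience_ticks:
                wasted += 1
            else:
                useful += 1

    return useful, wasted


# Same overload, two listen-queue capacities (illustrative values).
small = simulate(queue_capacity=5, arrivals_per_tick=2, service_ticks=1,
                 patience_ticks=10, total_ticks=1000)
large = simulate(queue_capacity=1000, arrivals_per_tick=2, service_ticks=1,
                 patience_ticks=10, total_ticks=1000)
print("small queue (useful, wasted):", small)  # (1000, 0)
print("large queue (useful, wasted):", large)  # (21, 979)
```

In this sketch the small queue drops excess requests on arrival, so every request that is serviced is still vital. The large queue accepts every request, the waiting time grows past the patience threshold after about twenty ticks, and from then on the server services only dead requests.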
In practice, request-timeout livelock is not easily recognizable. Typically, a server detects that the client request has timed out by sending a response to the client. If the client has timed out, the client typically receives a first packet of the response and returns an error message to the server, the message explicitly notifying the server that the connection between the server and the client is closed. After processing the error message, the server stops sending the remaining packets of the response. However, the server work involved in preparing the response was not avoided. Server resources were still expended on preparing useless responses and the chance of request-timeout livelock still existed.
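The late detection described above can be sketched as follows. This is an illustrative sketch only: the handler name `handle_request` and the stand-in workload are hypothetical, and a local socket pair is assumed in place of a real client connection over the Internet. On Unix-like systems the write to the closed socket pair typically fails immediately, whereas on a real TCP connection the first write usually succeeds and the error surfaces only afterwards, as described in the text. In either case the response has already been prepared by the time the server learns that the client is gone.

```python
import socket


def handle_request(conn):
    """Hypothetical handler: builds the whole response, then tries to send it."""
    conn.recv(1024)  # read the pending request

    # Stand-in for expensive work (database queries, dynamic content).
    body = b"".join(str(i).encode() for i in range(10000))
    work_done = len(body)

    try:
        conn.sendall(body)
        delivered = True
    except OSError:
        # The peer has closed the connection; the server learns this only
        # now, after the response was already prepared.
        delivered = False
    return work_done, delivered


server_side, client_side = socket.socketpair()
client_side.sendall(b"GET / HTTP/1.0\r\n\r\n")
client_side.close()  # the user pressed "stop" before any response arrived

work_done, delivered = handle_request(server_side)
server_side.close()
print("bytes prepared:", work_done, "delivered:", delivered)
```

The value of `work_done` is nonzero regardless of whether the send succeeds, which is the wasted effort the passage refers to.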
There is a need to improve efficiency of web servers so they do not expend resources by processing timed-out requests. There is also a need to protect against request-timeout livelock.