The present invention relates generally to communication networks, and, more particularly, to protocols for establishing connections in a communication network.
The HyperText Transfer Protocol (HTTP) is an application-level request/response protocol for distributed hypermedia information systems. HTTP is used on the World Wide Web by clients, typically browser applications like Mosaic or Netscape Navigator, to retrieve information from remote servers. HTTP relies on a reliable transport protocol to send data over the Internet. The Transmission Control Protocol (TCP) is the most widely used transport protocol in conjunction with HTTP. To download a Web page, for instance with reference to FIG. 1, the client 110 must use a three-way handshake in order to initiate a TCP connection. The client 110 sends a segment of data 131 requesting a connection to the server; the server 120 responds to the request 132; and the client must acknowledge the server's response 133 before a connection is established, as set forth in FIG. 1. The server 120 responds 151 to the HTTP request 134 by sending the base page and links to the embedded objects. The client 110 then requests the embedded objects separately using the links.
In HTTP/1.0, the version of HTTP used by most Web browsers and servers today, each client request is serviced using a new TCP connection. To speed up the data transfer, current Web browsers establish several (usually four) parallel TCP connections to the server. This practice has been shown to have negative effects on the already congested Internet. The inefficiency of HTTP/1.0 has been attributed to a number of problems: (1) the connection setup time 130 (two network round-trip delays) and server processing time 140 required prior to each data transmission; (2) the use of "slow start" congestion-avoidance algorithms in TCP, which reduces throughput; (3) the overhead imposed on servers by excessive opening and closing of connections, rendering the server artificially busy; and (4) the depletion of available sockets, especially problematic given the time-wait state imposed by TCP, which can cause busy servers to quickly run out of sockets.
The new version of HTTP, HTTP/1.1, is designed to overcome some of the problems of HTTP/1.0. See R. Fielding et al., "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2068, Network Working Group, 1997; R. Fielding et al., "Hypertext Transfer Protocol -- HTTP/1.1", RFC 2616, Network Working Group, 1999. HTTP/1.1 is aimed at improving the throughput and the delay as perceived by a client, while decreasing the server load, by allowing a TCP connection to persist beyond the lifetime of a single request-reply cycle. Multiple requests from a client can be pipelined on the same TCP connection, resulting in higher throughput and shorter response time. The downside is that each persistent TCP connection requires a socket at the server. Since the number of sockets at the server is limited, keeping connections open for long periods of time may prevent potential clients from connecting to the server if no sockets are available. Keeping sockets open also incurs server overhead to keep track of each data structure. There is a tradeoff between the number of clients a server can serve and the throughput and delay perceived by each client.
The standard specification for HTTP/1.1 leaves the decision on when to close a persistent connection to the implementer. The standard suggests the use of a fixed timeout period, after which a server automatically closes a connection. Other suggested solutions include a quasi-dynamic scheme in which two fixed timers are used: a short timeout period (of one second) when the server receives the first request, and a maximum timeout value if the server receives a second request prior to the timeout. Others have suggested closing the connection after servicing a fixed number of requests. Version 1.3.1 of the Apache Web server uses a combination solution: a connection is closed if it has been idle for 15 seconds or 100 requests have been serviced on the connection, whichever occurs first.
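By way of illustration only, the combination policy described above (close on idle timeout or on request count, whichever occurs first) can be sketched as follows. The class name, field names, and method are hypothetical and chosen for this sketch; they are not taken from the Apache source. Only the default values of 15 seconds and 100 requests come from the description above.

```python
from dataclasses import dataclass


@dataclass
class KeepAlivePolicy:
    """Hypothetical fixed keep-alive policy (names are illustrative)."""

    idle_timeout_s: float = 15.0  # idle limit cited in the text above
    max_requests: int = 100       # request limit cited in the text above

    def should_close(self, idle_s: float, requests_served: int) -> bool:
        # Close when either limit is reached, whichever occurs first.
        return (idle_s >= self.idle_timeout_s
                or requests_served >= self.max_requests)


policy = KeepAlivePolicy()
print(policy.should_close(idle_s=3.0, requests_served=12))   # connection stays open
print(policy.should_close(idle_s=15.0, requests_served=12))  # idle limit reached
print(policy.should_close(idle_s=0.2, requests_served=100))  # request limit reached
```

Both limits in such a policy are constants, so the policy behaves identically whether the server is nearly idle or close to exhausting its sockets.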
None of these solutions take into account a factor that can drastically change the efficacy of a given timer value: namely, the variability of the server load during different hours of the day and different days of the week.
The present invention discloses a method of improving the performance of a server enabled to permit connections to clients to persist for a duration equal to a timer value, such as Web servers utilizing HTTP/1.1. In accordance with an embodiment of the present invention, the server estimates the load on the server and uses the estimate to modify the timer value. The timer value can be chosen to balance the need to increase the throughput as seen by the clients against the server's need to service the largest possible number of clients without running out of resources. The timer value can be set to a longer value when the server load is light and a shorter value when the server load is heavy. In a preferred embodiment of the present invention, the server dynamically selects the largest timer that guarantees that the server does not run out of resources under the current measured load.
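One way such a load-dependent selection might look is sketched below. This is a minimal sketch under stated assumptions and is not the estimation method of the invention: it assumes the load is summarized by a measured connection arrival rate, that every connection holds a socket for its service time plus the full timer value, and that the mean number of open sockets then follows Little's law (arrival rate multiplied by holding time). The function name, parameter names, and the 15-second cap are hypothetical.

```python
def select_timeout(arrival_rate: float,
                   mean_service_s: float,
                   max_sockets: int,
                   min_timeout_s: float = 0.0,
                   max_timeout_s: float = 15.0) -> float:
    """Pick the largest timer T such that the estimated mean number of
    open sockets, arrival_rate * (mean_service_s + T), stays within
    max_sockets. The occupancy model is an assumption of this sketch.
    """
    if arrival_rate <= 0:
        # No measured load: allow the longest configured timer.
        return max_timeout_s
    # Solve arrival_rate * (mean_service_s + T) <= max_sockets for T.
    budget = max_sockets / arrival_rate - mean_service_s
    # Clamp to the configured range; a negative budget means the server
    # is overloaded even with no idle time, so use the minimum timer.
    return max(min_timeout_s, min(max_timeout_s, budget))


# Light load yields a long timer, heavy load a short one.
for rate in (1.0, 10.0, 100.0, 1000.0):
    print(rate, select_timeout(rate, mean_service_s=0.5, max_sockets=100))
```

Because the model bounds only the mean occupancy, an implementation seeking a stronger assurance would presumably reserve some headroom below the socket limit; that margin is left out here for brevity.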