In a computer network, client computers establish connections to servers in order to communicate, e.g. a client computer establishes a connection to a server over the Internet in order to download a web page. Each client connection to the server uses a portion of server resources such as processing time and memory even when the connection is idle, that is, when there is no data being transmitted over the connection. A server typically handles large numbers of connections, and because each connection consumes server resources, the overall server response can degrade when there are many connections to handle.
In order to maintain server efficiency and effectively respond to client requests, servers generally have processes for handling connections. A typical process includes a procedure for closing connections after the connections are no longer useful. One conventional process in servers for handling connections includes a policy to drop a connection as soon as a response to a client request is transmitted. For example, where the client makes a request, the client and server establish a connection, the server provides the data in a response to the client and then the server closes the connection.
Newer computer protocols, such as Hypertext Transfer Protocol (HTTP) 1.1 and above, allow persistent connections, that is, connections that do not close immediately after a server response, but instead remain in place in order to enable additional transactions between the client and server without the step of establishing a new connection for each additional transaction. For example, an HTTP GET request can include a keepalive, which is a directive to maintain the client-server connection; in HTTP 1.1, connections are persistent by default unless either side signals otherwise. A keepalive typically includes a time-out period after which the server may drop the persistent connection. Persistent connections have also been implemented in order to accommodate data communications devices in computer networks that maintain connection states between clients and servers. For example, load balancing devices direct client connections to particular servers in a group of servers according to a policy aimed at balancing load among the servers in the group. Persistent connections enable a client to establish a first connection to a load balancing device and then enable the load balancing device to establish a second connection from itself to a server.
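The time-out behavior described above can be illustrated with a minimal sketch. It is not taken from any particular server; the class names, the `idle_timeout` parameter, and the 15-second value are assumptions chosen for illustration, and time is passed in explicitly so that the example is deterministic.

```python
from dataclasses import dataclass, field


@dataclass
class PersistentConnection:
    client: str
    last_active: float  # time of the most recent request, in seconds


@dataclass
class Server:
    idle_timeout: float  # hypothetical time-out period, in seconds
    connections: dict = field(default_factory=dict)

    def handle_request(self, client: str, now: float) -> bool:
        """Serve a request; return True if an existing connection was reused."""
        conn = self.connections.get(client)
        reused = conn is not None
        if conn is None:
            # No persistent connection yet: establish one.
            conn = PersistentConnection(client, now)
            self.connections[client] = conn
        conn.last_active = now
        return reused

    def drop_idle(self, now: float) -> list:
        """Close connections that have been idle longer than the time-out."""
        expired = [name for name, conn in self.connections.items()
                   if now - conn.last_active > self.idle_timeout]
        for name in expired:
            del self.connections[name]
        return expired


server = Server(idle_timeout=15.0)
first = server.handle_request("client-a", now=0.0)   # establishes a connection
second = server.handle_request("client-a", now=5.0)  # reuses the connection
dropped_early = server.drop_idle(now=10.0)  # idle for 5 s: kept
dropped_late = server.drop_idle(now=30.0)   # idle for 25 s: dropped
```

In this sketch the second request reuses the connection opened by the first, and the server closes the connection only once it has been idle for longer than the assumed time-out period.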
In some ways, persistent connections increase server efficiency because server resources are not spent reestablishing connections with clients in order to conduct additional transactions. For example, in downloading a web page that requires many requests, which is typical of a graphical web page, a persistent connection to the server enables the client to make multiple requests over the same connection in order to get a complete web page over the one connection rather than having to establish a new connection for each request. Thus, persistent connections significantly improve download time for graphical web pages.
Idle persistent connections tend to accumulate at the server. One of the reasons is that client web browsers typically allow up to four connections to servers at one time. A client could, for example, establish four persistent connections to a server and then move on to a different server using one of the four connections, leaving behind three idle persistent connections that the first server must maintain until its connection time-out period has elapsed.
Persistent connections can make load balancing difficult. For example, if a client sends a second request on an existing connection to a first server through a load balancing device, but the load balancing device determines that another server should service the second request, the load balancing device needs to have a process to enable the client to communicate with the other server. There are some conventional solutions to this problem.
In a first conventional system, a data communications device, such as a load balancer, maintains connections between a client and a server. In this conventional system, the data communications device drops the connection to the server when the client moves to a different server. That is, the data communications device establishes a connection with a first server in response to a first client request. When the client then makes a second request that involves connecting to another server, the data communications device closes the connection to the first server established for the first request and establishes a new connection to the other server. This system reduces the accumulation of idle connections to servers.
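The drop-on-switch behavior of this first conventional system can be sketched as follows. This is only an illustration under stated assumptions: the class `DropOnSwitchDevice`, its `route` method, and the event log are hypothetical names, and real connections are replaced by log entries.

```python
class DropOnSwitchDevice:
    """Sketch of a device that keeps one back-side connection per client."""

    def __init__(self):
        self.server_for_client = {}  # client -> server currently connected
        self.log = []                # record of open/close events

    def route(self, client, server):
        current = self.server_for_client.get(client)
        if current == server:
            return  # same server: keep using the existing connection
        if current is not None:
            # The client moved to a different server: drop the old connection.
            self.log.append(("close", client, current))
        self.log.append(("open", client, server))
        self.server_for_client[client] = server


device = DropOnSwitchDevice()
device.route("c1", "s1")  # first request: open a connection to s1
device.route("c1", "s1")  # same server: nothing changes
device.route("c1", "s2")  # different server: close s1, open s2
```

Because the old connection is closed as soon as the client moves, at most one server connection per client exists at any time in this sketch, which is why idle connections do not accumulate.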
In a second conventional system, a data communications device maintaining connections with both clients and servers, such as a load balancing device, reuses existing connections. That is, the data communications device establishes a set of persistent connections to multiple servers in response to prior client requests. When a current client requests a different server or a new client requests a connection to one of the servers, the data communications device preferably uses an existing idle connection to enable the client to communicate with the requested server. The data communications device creates a new connection when there are no available connections to the requested server.
This system reduces the number of new connections from the data communications device to the servers (the “back side”) that would otherwise be created to handle client requests. There could still be many client connections between clients and the data communications device. The system therefore enables the data communications device to conserve back side connections and to conserve server resources. One alternative embodiment of this conventional system includes a limit on the number of back side server connections, for example one hundred connections.
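The connection reuse of this second conventional system, including the optional limit on back side connections, can be sketched as follows. The class and method names are hypothetical, connections are represented by integer identifiers, and raising an error when the limit is reached is an assumption made for illustration; the text above does not say what the device does in that case.

```python
from collections import defaultdict


class ReusingDevice:
    """Sketch of a device that reuses idle back-side connections."""

    def __init__(self, max_backside=100):
        self.max_backside = max_backside  # limit on back-side connections
        self.idle = defaultdict(list)     # server -> idle connection ids
        self.in_use = {}                  # connection id -> server
        self.opened = 0                   # new connections created so far

    def total(self):
        """Count all back-side connections, idle or in use."""
        return len(self.in_use) + sum(len(ids) for ids in self.idle.values())

    def acquire(self, server):
        if self.idle[server]:
            # Prefer an existing idle connection to the requested server.
            conn_id = self.idle[server].pop()
        else:
            # Assumption: refuse to exceed the limit rather than queue or evict.
            if self.total() >= self.max_backside:
                raise RuntimeError("back-side connection limit reached")
            self.opened += 1
            conn_id = self.opened
        self.in_use[conn_id] = server
        return conn_id

    def release(self, conn_id):
        """Return a connection to the idle pool instead of closing it."""
        server = self.in_use.pop(conn_id)
        self.idle[server].append(conn_id)


device = ReusingDevice(max_backside=2)
a = device.acquire("s1")  # no idle connection: create one
device.release(a)         # connection becomes idle, stays open
b = device.acquire("s1")  # reuses the idle connection to s1
c = device.acquire("s2")  # no idle connection to s2: create one
try:
    device.acquire("s2")  # both connections are in use and the limit is 2
    limit_hit = False
except RuntimeError:
    limit_hit = True
```

Three requests are served here with only two back side connections, since the second request to `s1` reuses the connection left idle by the first.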