Computer systems establish connections between entities over networks. One type of network connection is a client-server connection in which a client requests information from a server. For example, the client may be a browser application being used by a person on a general personal computer, and the server may be another computer that provides electronic documents to the client. The electronic documents may be Web pages that the user requests from the server over the worldwide packet data communication network now commonly known as the Internet. The World Wide Web may be defined as all of the resources and users on the Internet that communicate using the hypertext transfer protocol (HTTP).
Each such connection established between the server and the client uses resources provided by an operating system at the server. For example, for each connection, the operating system may associate a connection socket file descriptor, a heap segment, and a request handling thread.
Request handling threads, also known as worker threads, are an example of a processing resource that is used by a server to service a connection. Processing resources include other types of resources, such as acceptor threads, that help to establish connections on the server side. Processing resources are often part of a larger processing resource, such as a daemon. Daemons are programs that run continuously on a server and exist to handle requests for service from clients and the users associated with the clients.
As used herein, the term “connection” refers to a socket abstraction that has all the information relevant to the communication path between a server and a client, such as the transmission control protocol/Internet protocol (TCP/IP) addresses of the server and client, port numbers, etc. As used herein, “establishing a connection” means accepting a connection on the server side by assigning a file descriptor that is associated with the TCP layer to a processing resource. The file descriptor is uniquely associated with an operating system data structure with the information relevant to the communication path. Although a connection is associated with a file descriptor, a connection is not a file.
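By way of illustration only, the following minimal sketch shows the information that a server-side socket carries for one accepted connection: a file descriptor and the address and port pairs of both endpoints. The function name describe_connection and the use of the loopback address are assumptions made for this example and do not come from the description above.

```python
import socket


def describe_connection():
    """Accept one loopback TCP connection and report what the server knows about it."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the operating system pick a free port
    listener.listen(1)

    # A client in the same process, standing in for a remote browser.
    client = socket.create_connection(listener.getsockname())

    # accept() hands back a new socket whose file descriptor identifies the connection.
    conn, _peer = listener.accept()
    info = {
        "fd": conn.fileno(),                 # descriptor assigned by the operating system
        "server_addr": conn.getsockname(),   # (IP address, port) on the server side
        "client_addr": conn.getpeername(),   # (IP address, port) on the client side
    }

    conn.close()
    client.close()
    listener.close()
    return info
```

The descriptor returned by fileno() is only a handle; the addresses and port numbers live in the operating system data structure that the descriptor refers to.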
Originally, the World Wide Web was intended to be “connectionless,” meaning that after a processing resource services a request from a client, the processing resource is made available to handle additional requests that may come from the same or different clients. To the extent that a connection is established between the client and the server in such a connectionless system, the connection exists only as long as necessary to process the request.
In contrast, a connection-based system establishes a connection between the client and the server and associates a particular request handling thread with the connection. All requests received over the connection are processed by the particular request handling thread as long as the connection is open, and any requests received over other connections are processed by other request handling threads. When the connection is closed, the particular request handling thread is made available to be associated with another connection. Such connections are referred to as persistent connections because the connection persists to process additional requests on the same connection after the current request is processed. With a connectionless approach, the request processing resource automatically disassociates the connection by closing the connection after servicing the request. Thus, only one request per connection is serviced.
Establishing connections consumes the resources of the operating system at the server. For example, before exchanging data between the client and the server, a three-way handshake protocol is typically followed, which consumes processing time and other resources from the server. The more resources the server devotes to following the handshake protocol for new connections, the fewer resources are available for the server to process requests received over connections.
In addition, when a client interacts with a server, there are often numerous separate requests between the client and the server. For example, the client may request a Web page that contains references, or links, to several graphics files that must be requested from the server using additional requests. As another example, a user visiting a Web site may visit numerous Web pages associated with the site in rapid succession, and each Web page is retrieved by the client from the server in a separate request. Generally, in a connectionless system, the three-way handshake protocol is performed prior to servicing each of these many requests. When there are numerous requests over a short time between a particular client and a particular server, resources are wasted in repeatedly executing the handshake protocol between the same client and the server.
As a result, there has been a trend during the development of HTTP to provide for persistent connections to avoid or minimize connection establishment and tear-down overhead. For example, in HTTP Version 0.9, there is no capability to establish a persistent connection between a server and a client. All interactions under HTTP Version 0.9 are connectionless. In HTTP Version 1.0, the connectionless approach is still the default approach, but a connection may be defined to be persistent, such as by the client sending the HTTP header “Connection: Keep-Alive”. For such persistent connections, the server maintains the connection for a predefined length of time. In HTTP Version 1.1, the default approach to establishing connections is to have persistent connections. The persistent connections are closed when the server decides to close the connections, such as following a predefined period of time, or when the client or the server requests that the connection be closed. Thus, the trend towards connection-based communications between clients and servers avoids the unnecessary expenditure of resources that would be required in a connectionless system when there are multiple requests between a client and a server. However, connection-based communications may result in processing resources being idle while waiting for additional requests over the connection, as explained further below.
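The version-dependent defaults described above may be summarized, purely as an illustrative sketch, by a function that decides whether a connection should be kept open after a request. The function name is_persistent and the representation of headers as a dictionary are assumptions for this example; a real server would also apply its own timeout policy.

```python
def is_persistent(version, headers):
    """Return True if the connection should stay open after the current request.

    version: protocol version string, e.g. "HTTP/1.0"
    headers: mapping of request header names to values
    """
    # Header names and the Connection tokens are compared case-insensitively.
    connection = ""
    for name, value in headers.items():
        if name.lower() == "connection":
            connection = value.lower()
    tokens = {token.strip() for token in connection.split(",")}

    if version == "HTTP/0.9":
        return False                      # no persistent connections at all
    if version == "HTTP/1.0":
        return "keep-alive" in tokens     # persistent only when explicitly requested
    if version == "HTTP/1.1":
        return "close" not in tokens      # persistent unless a close is requested
    raise ValueError(f"unsupported version: {version}")
```

For instance, is_persistent("HTTP/1.0", {}) evaluates to False, whereas is_persistent("HTTP/1.1", {}) evaluates to True, reflecting the change of default between the two versions.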
When a client sends a request for a connection to a server, the request is handled by a daemon. The daemon includes different types of processing resources for servicing connections and generating responses to the requests received over the connections. One type of processing resource is an acceptor thread, which is used for accepting a connection request, establishing the connection at the server, and placing the connection into a work queue to await processing. The work queue has one or more places or slots for holding connections. The daemon may include worker threads that service the connection by picking up a connection from the work queue, receiving requests sent from the client to the server over the connection, and generating responses to the requests.
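The division of labor among an acceptor thread, a work queue, and worker threads may be sketched as follows. This is a simplified model under stated assumptions: connections are represented by plain identifiers rather than sockets, "servicing" merely records which worker picked up which connection, and the names run_daemon, acceptor, and worker are hypothetical.

```python
import queue
import threading


def run_daemon(connection_ids, n_workers=2):
    """Model a daemon: one acceptor enqueues connections, workers dequeue and service them."""
    work_queue = queue.Queue()   # slots holding connections that await servicing
    serviced = []                # (worker name, connection id) pairs
    lock = threading.Lock()

    def acceptor():
        # Stand-in for accepting each connection and placing it into the work queue.
        for conn_id in connection_ids:
            work_queue.put(conn_id)

    def worker(name):
        while True:
            conn_id = work_queue.get()   # blocks until a connection is available
            if conn_id is None:          # sentinel: no more work, shut down
                break
            with lock:
                serviced.append((name, conn_id))

    workers = [
        threading.Thread(target=worker, args=(f"worker-{i}",))
        for i in range(n_workers)
    ]
    for thread in workers:
        thread.start()

    acceptor_thread = threading.Thread(target=acceptor)
    acceptor_thread.start()
    acceptor_thread.join()

    # One sentinel per worker, queued after all connections, so none is dropped.
    for _ in workers:
        work_queue.put(None)
    for thread in workers:
        thread.join()
    return serviced
```

Which worker services which connection depends on thread scheduling, so only the set of serviced connections is deterministic in this sketch.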
If the server implements a synchronous input/output (I/O) model, such as in the Unix operating system, each connection is serviced by one worker thread that is dedicated to the connection. The worker thread receives all of the requests from the client over the connection and generates responses for each request. The worker thread remains associated with the connection until the connection is closed. While the worker thread is associated with the connection, the worker thread does not service other connections.
The synchronous I/O model presents an efficiency problem when connections experience “idle” time. A connection may be idle while waiting for the user at the client device to submit another request, such as by selecting another link on a Web page. Idle connections may cause inefficiency because the worker threads assigned to the idle connections are not performing any work, and the number of worker threads that can be supported by the daemon is limited. It is possible that a server may have all available worker threads dedicated to connections, many of which may be idle, yet there are additional connections waiting to be serviced, thereby impacting the ability of the server to efficiently service the connections.
FIG. 1 is a block diagram that illustrates a daemon that services connections. Although FIG. 1 illustrates a limited number of types of elements and a few examples of each element type, in practice numerous additional element types and elements are included. However, for purposes of simplifying the following explanation, only a few representative examples are illustrated. In addition, the elements shown may be implemented by software, hardware, or a combination thereof, on one or more physical devices.
FIG. 1 shows a daemon 110 that is responsible for servicing connections 120, 122, 124 between clients 170, 172, 174 and a server 112. A network 160, such as the Internet, provides the means for communicatively coupling clients 170, 172, 174 to server 112. Daemon 110 includes acceptor threads 130, 132, 134 that are responsible for accepting connections 120, 122, 124 at server 112 and passing connections 120, 122, 124 to a work queue 138. Work queue 138 includes a number of slots or positions for holding connections that are waiting to be serviced, of which slots 140, 142, 144 are shown. Connections 120, 122, 124 are passed by acceptor threads 130, 132, 134 to slots 140, 142, 144, respectively.
Work queue 138 holds connections 120, 122, 124 until a worker thread is available to service each of connections 120, 122, 124. Daemon 110 includes worker threads 150, 152 for servicing the connections in work queue 138. For example, worker thread 150 picks up connection 120 from slot 140 and processes the requests received from client 170 over connection 120. Similarly, worker thread 152 picks up connection 122 from slot 142 and processes requests received from client 172 over connection 122. Because the example of FIG. 1 only has two worker threads, connection 124 must wait in slot 144 of work queue 138 to be serviced. Thus, when using the synchronous I/O model, once connection 120 is closed, worker thread 150 is available to service another connection, such as connection 124.
In the example illustrated in FIG. 1, there are only two worker threads, yet there are three connections. If worker thread 150 is servicing connection 120 and worker thread 152 is servicing connection 122, then connection 124 must wait in slot 144 of work queue 138 until one of the two worker threads becomes available. With the synchronous I/O model, worker thread 150 remains dedicated to connection 120 until connection 120 is closed, and worker thread 152 remains dedicated to connection 122 until connection 122 is closed. If connections 120, 122 are frequently idle, then server 112 is using all of the available worker threads to do little or no work while a third connection, connection 124, sits waiting with one or more requests to be serviced.
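The cost of dedicating a worker thread to a mostly idle connection can be made concrete with a small deterministic simulation. The sketch below is an assumption-laden model, not an implementation of any particular server: time is measured in abstract units, each request takes one unit to service, all connections are already queued at time 0, and the arrival and close times are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    name: str
    request_times: list  # times at which requests arrive on this connection
    close_time: int      # time at which the client is finished with the connection


def simulate_dedicated(conns, n_workers, service_time=1):
    """Synchronous model: a worker stays bound to a connection until it is closed."""
    free_at = [0] * n_workers  # time at which each worker next becomes available
    report = {}
    for conn in conns:  # connections are picked up in queue order
        w = min(range(n_workers), key=lambda i: free_at[i])
        pickup = free_at[w]
        t = pickup
        for arrival in conn.request_times:
            t = max(t, arrival) + service_time  # wait for the request, then service it
        release = max(t, conn.close_time)       # the worker is held until the close
        free_at[w] = release
        report[conn.name] = {
            "pickup": pickup,                                   # time spent waiting in the queue
            "busy": len(conn.request_times) * service_time,     # time spent doing work
            "held": release - pickup,                           # time the worker was tied up
        }
    return report


# Hypothetical workload: two long-lived, mostly idle connections and one short one.
CONNS = [
    Conn("120", [0, 40], 60),
    Conn("122", [0, 50], 60),
    Conn("124", [0], 5),
]
REPORT = simulate_dedicated(CONNS, n_workers=2)
```

Under these assumed numbers, each of the first two connections holds a worker for 60 time units while requiring only 2 units of work, and connection 124 is not picked up until time 60.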
One approach for addressing the efficiency problem of the synchronous I/O model is to implement an asynchronous I/O model at the server. With an asynchronous I/O model, each connection may be serviced by many threads because processing resources, such as worker threads, do not remain dedicated to connections until the connections are closed. Instead, each request is independent of the other requests, even if the connection is persistent. For example, a first worker thread processes the first request received over a connection and generates a response. Once the first request is serviced, the worker thread is made available to service another connection in the work queue. When a second request is received over the connection, the connection is added to the work queue to await servicing by another worker thread. While it is possible that the same worker thread services the second request, such an occurrence is a product of chance instead of a feature or characteristic of the asynchronous I/O model.
For example, in FIG. 1, if the I/O model is asynchronous, worker threads 150, 152 are associated with connections 120, 122, respectively, only while servicing requests, that is, only as long as connections 120, 122 remain active. If worker thread 150 is servicing connection 120 and connection 120 becomes idle, worker thread 150 is made available to service another connection, such as connection 124. If another request is received over connection 120, connection 120 waits in work queue 138 until another worker thread is made available, say worker thread 152 after connection 122 becomes idle.
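The per-request scheduling of the asynchronous model may be illustrated with a similar deterministic sketch. As before, the model rests on assumptions that are not part of the description above: abstract time units, one unit of service time per request, invented arrival times, and workers indexed 0 and 1 standing in for worker threads 150 and 152. The function name simulate_per_request is hypothetical.

```python
def simulate_per_request(requests, n_workers, service_time=1):
    """Asynchronous model: a worker is bound to a connection for one request only.

    requests: list of (arrival_time, connection_name) pairs
    Returns a list of (connection_name, arrival_time, worker_index, start_time).
    """
    free_at = [0] * n_workers  # time at which each worker next becomes available
    schedule = []
    # sorted() is stable, so requests arriving at the same time keep their queue order.
    for arrival, name in sorted(requests, key=lambda r: r[0]):
        w = min(range(n_workers), key=lambda i: free_at[i])
        start = max(free_at[w], arrival)   # cannot start before the request arrives
        free_at[w] = start + service_time  # worker is released right after the request
        schedule.append((name, arrival, w, start))
    return schedule


# Hypothetical workload: connections 120 and 122 are idle between their two requests.
REQUESTS = [
    (0, "120"), (0, "122"), (0, "124"),
    (40, "120"), (50, "122"),
]
SCHEDULE = simulate_per_request(REQUESTS, n_workers=2)
```

With these assumed numbers, the request on connection 124 starts at time 1, as soon as a worker finishes its first request, and the second request on connection 120 happens to be serviced by a different worker than its first request.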
A problem with using the asynchronous approach to solve the worker thread efficiency problem that arises with the synchronous approach is that not all servers implement the asynchronous I/O model. Some types of servers and operating systems use the synchronous approach for other reasons, and thus incur the inefficiency problem described above.
Based on the foregoing, there exists a need for a mechanism for servicing connections that minimizes the resources required when there are idle periods for the connections.