1. Field of the Invention
The present invention generally relates to distributed systems. More particularly, embodiments provide client-server systems for efficient handling of client requests.
2. Description of the Related Art
Generally, a distributed computer system comprises a collection of loosely coupled machines (mainframes, workstations or personal computers) interconnected by a communication network. Through a distributed computer system, a client may access various servers to store information, print documents, access databases, acquire client/server computing or gain access to the Internet. These services often require software applications running on the client's desktop to interact with other applications that might reside on one or more remote server machines. Thus, in a client/server computing environment, one or more clients and one or more servers, along with the operating system and various interprocess communication (IPC) methods or mechanisms, form a composite that permits distributed computation, analysis and presentation.
In client/server applications, a “server” is typically a software application routine or thread that is started on a computer and that, in turn, operates continuously, waiting to connect to and service the requests from various clients. Thus, servers are broadly defined as computers, and/or application programs executing thereon, that provide various functional operations and data upon request. Clients are broadly defined to include computers and/or processes that issue requests for services from the server. Thus, while clients and servers may be distributed in various computers across a network, they may also reside in a single computer, with individual software applications providing client and/or server functions. Once a client has established a connection with the server, the client and server communicate using a commonly known protocol (e.g., TCP/IP) or a proprietary protocol defined and documented by the server.
In some client-server implementations, sockets are used to advantage. A socket, as created via the socket application programming interface (API), is at each end of a communications connection. The socket allows a first process to communicate with a second process at the other end of the communications connection, usually on a remote machine. Each process communicates with the other process by interacting directly with the socket at its end of the communications connection. Processes open sockets in a manner analogous to opening files, receiving back a file descriptor (specifically, a socket descriptor) by which they identify a socket.
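By way of illustration only, the following minimal sketch shows two connected sockets, each identified by an integer descriptor. It assumes a Python environment and uses socketpair ( ) as a stand-in for a real client and server so that it runs without a network; the variable names are hypothetical.

```python
import socket

# socketpair() returns two already-connected sockets. They stand in for the
# two ends of a client/server connection (an assumption made so the sketch
# runs on a single machine without a network).
end_a, end_b = socket.socketpair()

# Each end is identified by a small integer descriptor, much as an open
# file would be.
descriptor_a = end_a.fileno()
descriptor_b = end_b.fileno()

# Data written at one end of the connection is read at the other end.
end_a.sendall(b"ping")
received = end_b.recv(4)

end_a.close()
end_b.close()
```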
Sockets and other client-server mechanisms are shown in the server environments 100 and 200 of FIG. 1 and FIG. 2, respectively. FIG. 1 illustrates synchronous processing and FIG. 2 illustrates asynchronous processing. In general, FIG. 1 shows server environment 100 comprising a main thread 102 and a plurality of worker threads 104. An initial series of operations 106 includes creating a socket (socket ( )), binding to a known address (bind ( )) and listening for incoming connections on the socket (listen ( )). An accept operation 108 is then issued to accept a new client connection, which is then given to one of the worker threads 104. The operations for accepting a new client connection and giving the client connection to a worker thread define a loop 110 which is repeated until the server is shut down.
Upon taking the client connection from the main thread 102, the worker thread 104 issues a receive operation 112. This operation is repeated (as indicated by loop 114) until the full request is received. The request is then processed and a response is sent using a send operation 116. A loop 118 causes processing to repeat the receive operations 112, thereby handling additional requests from the current client. The worker thread 104 may then take another client connection from the main thread 102 as represented by loop 120.
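The synchronous arrangement of FIG. 1 may be sketched roughly as follows. This is an illustrative approximation rather than the implementation shown in the figure: it assumes newline-terminated requests, an upper-casing "process" step, a queue used to hand connections from the main thread to the workers, and a None sentinel for shutdown, none of which is specified above.

```python
import queue
import socket
import threading

def worker(connections):
    """Worker thread: take a connection, then receive, process and send."""
    while True:
        conn = connections.get()              # take a client connection (loop 120)
        if conn is None:                      # hypothetical shutdown sentinel
            return
        with conn:
            buffered = b""
            while True:
                chunk = conn.recv(1024)       # receive operation (loop 114)
                if not chunk:
                    break                     # client closed the connection
                buffered += chunk
                while b"\n" in buffered:      # a full request has arrived
                    request, buffered = buffered.split(b"\n", 1)
                    conn.sendall(request.upper() + b"\n")   # send operation

def serve(listener, connections, max_clients):
    """Main thread: accept connections and give each one to a worker."""
    for _ in range(max_clients):              # stands in for "until shut down"
        conn, _addr = listener.accept()
        connections.put(conn)

def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)   # socket()
    listener.bind(("127.0.0.1", 0))                                 # bind()
    listener.listen()                                               # listen()
    connections = queue.Queue()
    workers = [threading.Thread(target=worker, args=(connections,))
               for _ in range(2)]
    for thread in workers:
        thread.start()
    main = threading.Thread(target=serve, args=(listener, connections, 1))
    main.start()

    reply = b""
    with socket.create_connection(listener.getsockname()) as client:
        client.sendall(b"hello\n")
        while not reply.endswith(b"\n"):
            data = client.recv(1024)
            if not data:
                break
            reply += data

    main.join()
    for _ in workers:
        connections.put(None)                 # ask each worker to exit
    for thread in workers:
        thread.join()
    listener.close()
    return reply
```

Calling demo ( ) starts the server on a loopback port, sends one request, and returns the reply. Note that each worker is occupied by a single client for as long as that client stays connected.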
Alternatively, some server platforms provide a set of asynchronous I/O functions to allow the server design to scale better to a large number of clients. While these implementations vary across platforms, most support asynchronous read and write operations, and a common wait or post completion mechanism. The server applications provide buffers to be filled or emptied of data asynchronously. The status of these asynchronous I/O operations can be checked at a common wait or can be posted back to the application via some mechanism such as a signal. This I/O model can allow a pool of threads to scale to process a much larger set of clients with a limited number of threads in the server application's thread pool.
As an illustration, consider the server environment 200 which uses asynchronous I/O consisting of one main thread 202 accepting client connections and multiple worker threads 204 processing client requests received by the main thread 202. An initial series of operations 206 is the same as that described above with reference to synchronous processing (FIG. 1). Processing of a client request begins when the main thread 202 requests a connection from a client by issuing an asynchronous accept operation 208 for a new client connection to a pending queue 209. Each asynchronous accept operation 208 results in a separate pending accept data structure being placed on the pending queue 209. Once a client connection is established, the appropriate pending accept data structure is removed from the pending queue and a completed accept data structure is placed on a completion queue 210. The completed accept data structures are dequeued by the main thread 202 which issues an asynchronous wait for which a wakeup operation is returned from the completion queue 210. An asynchronous receive operation 214 is then started on a client connection socket 217 for some number of bytes by configuring the pending queue 209 to queue the pending client requests. The number of bytes may either be determined according to a length field which describes the length of the client request or, in the case of terminating characters, be some arbitrary number of bytes. Each asynchronous receive operation 214 results in a separate pending receive data structure being placed on the pending queue 209. When a receive completes (the complete client record has been received), the appropriate pending receive data structure is removed from the pending queue 209 and a completed receive data structure is placed on the completion queue 216. An asynchronous wait 218 is issued by a worker thread 204A for which a wakeup operation 220 is returned from the queue 216 with the data.
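The receive half of this flow may be modeled, very loosely, as shown below. The async_receive ( ) helper is hypothetical and is not a platform API: it simulates an asynchronous receive with a background thread and uses an ordinary in-process queue as the completion queue. It is intended only to show that the call returns before data arrives, that a buffer is reserved in the meantime, and that a waiter is woken with the data on completion.

```python
import queue
import socket
import threading

def async_receive(sock, nbytes, completions):
    """Hypothetical stand-in for an asynchronous receive operation.

    Returns at once; posts (sock, data) to the completion queue when
    nbytes have arrived or the peer has closed the connection.
    """
    buffer = bytearray(nbytes)        # reserved until the receive completes

    def fill():
        view, got = memoryview(buffer), 0
        while got < nbytes:
            n = sock.recv_into(view[got:])
            if n == 0:                # peer closed early
                break
            got += n
        completions.put((sock, bytes(buffer[:got])))   # post the completion

    threading.Thread(target=fill, daemon=True).start()

# A connected pair stands in for a client and its connection socket.
client, server = socket.socketpair()
completions = queue.Queue()           # plays the role of the completion queue

async_receive(server, 5, completions)         # returns immediately
pending_before_send = completions.empty()     # nothing has completed yet

client.sendall(b"hello")
_sock, data = completions.get(timeout=5)      # asynchronous wait and wakeup

client.close()
server.close()
```

Between the call to async_receive ( ) and the arrival of data, the five-byte buffer is allocated but holds nothing useful.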
In the case where a length field is used, the specified number of bytes from the length field is used by the worker thread 204A to issue another asynchronous receive operation 222 to obtain the rest of the client request which is typically received incrementally in portions, each of which is placed in an application buffer. The second asynchronous receive operation 222 is posted as complete to the queue 216 upon receiving the full request and the same or another thread from the thread pool 204 processes the client request. This process is then repeated for subsequent client requests. Where a terminating character(s) is used, each incoming request is dequeued from the queue 216 and checked for the terminating character(s). If the character(s) is not found, another asynchronous receive operation 222 is issued. Asynchronous receive operations are repeatedly issued until the terminating character(s) is received. This repetition for both length field and terminating character implementations is represented by loop 224 in FIG. 2.
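The two ways of delimiting a request described above may be sketched as follows. For brevity the sketch uses blocking receives rather than asynchronous ones, and it assumes a 4-byte big-endian length field and a newline terminator; neither format is specified above, and the function names are hypothetical.

```python
import socket
import struct

def recv_exact(sock, count):
    """Repeat the receive until exactly `count` bytes have arrived."""
    data = bytearray()
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed before the full request arrived")
        data += chunk
    return bytes(data)

def recv_length_prefixed(sock):
    """First receive obtains the length field; second obtains the rest."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))   # assumed 4-byte field
    return recv_exact(sock, length)

def recv_until(sock, terminator=b"\n", chunk_size=16):
    """Receive arbitrary-sized portions until the terminator is seen.

    Bytes that arrive after the terminator are discarded here for
    simplicity; a real server would keep them for the next request.
    """
    data = bytearray()
    while terminator not in data:
        chunk = sock.recv(chunk_size)
        if not chunk:
            raise ConnectionError("peer closed before the terminator arrived")
        data += chunk
    request, _rest = bytes(data).split(terminator, 1)
    return request

# A connected pair stands in for a client and its connection socket.
client, server = socket.socketpair()

payload = b"GET /index"
client.sendall(struct.pack("!I", len(payload)) + payload)
by_length = recv_length_prefixed(server)

client.sendall(b"GET /index\n")
by_terminator = recv_until(server)

client.close()
server.close()
```

In both cases the receiver cannot know the size of the request until part of it has already arrived, which is why the buffer for the first receive must be chosen without knowledge of the record length.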
Sockets receive data from clients using well-known “receive” semantics such as readv ( ) and recvmsg ( ). The receive semantics illustrated in FIGS. 1 and 2 are receive ( ) and asyncReceive ( ) respectively. Sockets receive semantics are either synchronous (FIG. 1) or asynchronous (FIG. 2). Synchronous APIs such as readv ( ) and recvmsg ( ) receive data in the execution context issuing the API. Asynchronous APIs such as asyncRecv ( ) return indications that the receive will be handled asynchronously if the data is not immediately available.
Synchronous receive I/O will wait until the requested data arrives. This wait is typically performed within the sockets level of the operating system. During this wait, a buffer supplied by the application server is reserved until the receive completes successfully or an error condition is encountered. Unfortunately, many client connections have a “bursty” data nature where there can be significant lag times between each client request. As a result, the buffers reserved for the incoming client requests can typically sit idle while waiting for client requests to be received. This can cause additional storage to be allocated but not used until the data arrives, resulting in inefficient use of limited memory resources. Further, where multiple allocated buffers are underutilized, system paging rates can be adversely affected.
Asynchronous I/O registers a buffer to be filled asynchronously when the data arrives. This buffer cannot be used until the I/O completes or an error condition causes the operation to fail. When data arrives, the buffer is filled asynchronously relative to the server process, and a completed request transitions to a common wait point for processing. While advantageous, this asynchronous behavior suffers from the same shortcomings as synchronous receive I/O in that the buffer supplied is reserved until the operation completes and an indication is returned to the application server. As a result, the storage and paging concerns described above with respect to synchronous receive I/O also apply to asynchronous I/O processing.
In summary, synchronous and asynchronous I/O suffer from at least two problems. First, more buffers are reserved at any given time than are needed to service the number of incoming requests. As a result, the memory footprint for processing is much larger than needed. Second, the memory allocated for each incoming request consumes this valuable resource and can cause memory management page thrashing.
To avoid the foregoing problems, it is desirable to acquire a buffer large enough to hold all of the data when it arrives. Such an approach would keep the buffer highly utilized from a memory management paging perspective. However, one problem with this approach is determining what size buffer an application server should provide when the I/O operation is initiated. This problem arises because the record length is contained within the input data stream and will only be known when the data arrives. One solution would be to code the application server for the worst possible case and always supply a buffer large enough to accommodate the largest record possible. However, this would be a waste of resources and could adversely affect the paging rates not only for the server, but also for the system itself.
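The cost of the worst-case approach can be made concrete with a short calculation. Every figure below is a hypothetical assumption chosen purely for illustration (the number of connections, the largest possible record, the typical record size, and the number of connections with a request in flight); none is taken from a measured system.

```python
# All values are hypothetical and chosen only to illustrate the arithmetic.
connections = 10_000              # client connections with a receive pending
worst_case_record = 64 * 1024     # bytes reserved per connection (largest record)
typical_record = 512              # bytes actually carried by a typical request
active_connections = 200          # connections with a request in flight

# Memory reserved when every pending receive is given a worst-case buffer.
reserved_up_front = connections * worst_case_record

# Memory actually holding request data at that moment.
in_use = active_connections * typical_record

# Fraction of the reserved memory that is doing useful work.
utilization = in_use / reserved_up_front
```

Under these assumptions roughly 655 MB is reserved while about 100 KB holds request data, so well under one tenth of one percent of the reserved memory is in use.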
Therefore, a need exists for efficiently allocating buffers for client requests.