Web 2.0 applications are increasingly prevalent, as are Web 2.0 applications that interface with asynchronous updates. Because the HyperText Transfer Protocol (HTTP) is a synchronous request-and-response protocol, implementations must emulate asynchronous notification over HTTP. However, these implementations have one or more drawbacks that impose increased load on servers and create scalability challenges when serving many users.
One such implementation involves the client intermittently polling the server to discover whether it has any data queued for delivery to the client. This approach suffers from several drawbacks. In particular, it can waste network bandwidth by polling when no data is available; reduce the responsiveness of the application, since data sits queued until the server receives the next poll (HTTP request) from the client; and increase the load on the server many times over, since the server must validate and respond to each poll request, a pattern sometimes referred to as “hammering the server.” Together, these drawbacks force a trade-off between responsiveness and bandwidth: increasing the polling frequency decreases latency but increases bandwidth consumption and server load, and decreasing it does the opposite.
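The polling pattern described above can be sketched as follows. This is a minimal, illustrative example using Python's standard library; the in-memory PENDING queue and the fixed polling interval are assumptions standing in for real per-client server state and a real polling schedule. Note how every poll must be answered even when no data is waiting.

```python
import json
import threading
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import urlopen

# Hypothetical in-memory queue standing in for per-client server state.
PENDING = []

class PollHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # The server must validate and respond to every poll, even when
        # there is nothing to deliver ("hammering the server").
        body = json.dumps({"updates": PENDING[:]}).encode()
        PENDING.clear()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = HTTPServer(("127.0.0.1", 0), PollHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

# Client side: poll on a fixed interval whether or not data is waiting.
PENDING.append("event-1")          # data queued before the first poll
results = []
for _ in range(3):                 # three polls; only the first carries data
    with urlopen(url) as resp:
        results.append(json.loads(resp.read())["updates"])
    time.sleep(0.05)               # the interval sets the latency/bandwidth trade-off

server.shutdown()
print(results)                     # only one of the three polls was useful
```

Two of the three requests here return empty responses, illustrating the wasted bandwidth and server work, and any update queued just after a poll waits a full interval before delivery.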
Another implementation involves techniques such as long-polling and streaming-response (popularized as forever-frame). These techniques enable the server to push notifications to clients instead of the clients polling for updates. In long polling, the client polls for an update by issuing an HTTP request. If no update is available, the server holds the request open but does not respond; as soon as the client receives a response from the server, or the request times out, the client sends another request. This ensures that the server is (almost) always holding a request that it can use to “push” data to the client. With the streaming-response technique, the client likewise opens a long-lived HTTP connection to the server, but the server does not close the response when it has no further updates to send. Instead, the response is formed using the HTTP chunked-encoding mechanism, and the server posts an additional chunk whenever a new update becomes available, keeping the connection open throughout. With either approach, the server can achieve low latency and low bandwidth, because an open connection with the client is always available to the server.
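The long-polling variant can be sketched as below, again with Python's standard library. The `publish` helper and the single shared `threading.Event` are hypothetical simplifications for one client and one update; a real server would track held requests per client. The handler blocks on the event rather than replying immediately, so the held request is the channel through which the update is pushed.

```python
import threading
import time
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen

# Hypothetical single-client state: an event the handler blocks on until
# the application has an update to push.
update_ready = threading.Event()
update_payload = []

class LongPollHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Hold the request open until data arrives (or a timeout elapses);
        # the pending request is what lets the server "push" to the client.
        update_ready.wait(timeout=5.0)
        body = (update_payload[0] if update_payload else "").encode()
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):  # silence per-request logging
        pass

server = ThreadingHTTPServer(("127.0.0.1", 0), LongPollHandler)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

def publish(data):
    # Application code signals the held request to complete with fresh data.
    update_payload.append(data)
    update_ready.set()

# The client issues its long poll; the server holds it until publish() fires
# 0.2 seconds later, so the response arrives when the data does.
threading.Timer(0.2, publish, args=["price=42"]).start()
start = time.monotonic()
with urlopen(url) as resp:
    data = resp.read().decode()
waited = time.monotonic() - start

server.shutdown()
print(data, round(waited, 1))
```

The client's request completes only when the update is published, giving low latency without repeated empty polls; after each response (or timeout) the client would immediately re-issue the request to keep one held at the server. The streaming-response variant differs in that the handler would never complete the response, instead writing each update as a new HTTP chunk on the same connection.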
As the number of clients grows, the need for the server to keep a connection open for each client becomes a challenging requirement. These large numbers of connections terminating on the server consume kernel resources and memory for data structures such as protocol control blocks, socket descriptors, and socket buffers. Most operating systems are not equipped to deal with a large number of open sockets, and keeping many connections open adversely affects both the performance and the scalability of the servers.