Web browsers retrieve data, such as a web page, an image, a video, or another piece of content, from a web server. The web browser first transmits one or more web requests, such as a hypertext transfer protocol (HTTP) GET request, to the web server, wherein each web request corresponds to a request to receive data. Upon receiving the web request(s), the web server can transmit the requested data to the web browser such that the web browser may display the results on a computer or other type of internet-enabled device that supports the web browser. In order to prevent the web server from becoming overloaded with too many concurrent requests, the web browser can limit the number of allowed concurrent requests. For example, some web browsers have a limit of between four and eight concurrent requests that can be sent to a web server. When the limit for concurrent requests is reached, the web browser can block further requests until one of the pending requests is completed.
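The blocking behavior described above can be illustrated with a minimal sketch. It is a simulation only, not the logic of any actual web browser: the class name `ConnectionLimiter`, the limit of six, the URL paths, and the simulated server delay are all assumptions chosen for illustration.

```python
import asyncio


class ConnectionLimiter:
    """Hypothetical model of a browser's cap on concurrent requests to one server."""

    def __init__(self, limit: int) -> None:
        self._slots = asyncio.Semaphore(limit)
        self.active = 0  # requests currently in flight
        self.peak = 0    # highest number of requests ever in flight at once

    async def get(self, url: str, server_delay: float = 0.01) -> str:
        # A request waits here while `limit` requests are already pending.
        async with self._slots:
            self.active += 1
            self.peak = max(self.peak, self.active)
            try:
                # Stand-in for the round trip to the web server.
                await asyncio.sleep(server_delay)
                return f"response for {url}"
            finally:
                self.active -= 1


async def fetch_all(limit: int, count: int):
    limiter = ConnectionLimiter(limit)
    responses = await asyncio.gather(
        *(limiter.get(f"/item/{i}") for i in range(count))
    )
    return limiter.peak, responses


# Twenty requests are issued at once, but at most six are ever in flight.
peak, responses = asyncio.run(fetch_all(limit=6, count=20))
```

In this sketch the fourteen requests beyond the limit are not rejected; they simply wait until a pending request completes and frees a slot.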
Due to the limit for concurrent requests, various techniques may be used to reduce the latency for retrieving information from a web server. One technique is to have the web browser establish multiple connections using GET requests that are queued on the server side until new information is ready to be transmitted. U.S. Patent Application Publication 2012/0324358, published Dec. 20, 2012 and incorporated herein by reference in its entirety, describes in detail an exemplary protocol for remoting a GUI from a VM to a client web browser via HTTP GET requests. Another technique is to use a chunked-encoding mechanism, such as COMET. For example, a long-lived HTTP connection may continuously push new data to the web browser from the web server via COMET. These low-latency mechanisms generally rely on the client sending multiple GET requests for data to the server, up to the allowed limit of concurrent requests. The server queues the GET requests until data becomes available for sending to the client, whereupon the data is matched to one of the queued GET requests and returned to the client as a response to that GET request. Once the client receives the response, it immediately issues a new GET request. This differs from high-latency techniques, in which the server does not proactively seek the data in the expectation that a queued GET request will be available to match with the data once it is ready. Rather, in the more traditional high-latency approach, the client periodically sends a request for data, which the server fetches only after receiving a GET request for the data.
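The queue-and-match behavior of the low-latency approach can be sketched as follows. This is a simplified in-memory model under stated assumptions, not the protocol of the incorporated publication: the names `LongPollServer`, `LongPollClient`, `on_get`, `on_data`, and `pump` are hypothetical, and no actual HTTP traffic is involved.

```python
from collections import deque


class LongPollServer:
    """Hypothetical server that parks GET requests until data is available."""

    def __init__(self):
        self.waiting = deque()  # ids of queued GET requests with no data yet
        self.ready = deque()    # data that arrived while no GET was queued
        self.responses = []     # (request_id, data) pairs returned to the client

    def on_get(self, request_id):
        if self.ready:
            self.responses.append((request_id, self.ready.popleft()))
        else:
            self.waiting.append(request_id)

    def on_data(self, data):
        # Match new data to the oldest queued GET request, if there is one.
        if self.waiting:
            self.responses.append((self.waiting.popleft(), data))
        else:
            self.ready.append(data)


class LongPollClient:
    """Hypothetical client that keeps `limit` GET requests outstanding."""

    def __init__(self, server, limit):
        self.server = server
        self.next_id = 0
        self.seen = 0
        for _ in range(limit):
            self._issue()

    def _issue(self):
        self.server.on_get(self.next_id)
        self.next_id += 1

    def pump(self):
        """Collect new responses and immediately issue one new GET per response."""
        new = self.server.responses[self.seen:]
        self.seen = len(self.server.responses)
        for _ in new:
            self._issue()
        return new


server = LongPollServer()
client = LongPollClient(server, limit=2)  # two GETs are queued up front
server.on_data("frame-1")                 # matched to queued GET 0
first = client.pump()                     # client re-issues a GET
server.on_data("frame-2")                 # matched to queued GET 1
second = client.pump()
```

Because the client replaces every answered GET request at once, the server in this sketch always has a queued request on hand when new data arrives, which is what keeps the latency low.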
However, there are limitations to using low-latency techniques. Each of the aforementioned low-latency techniques consumes, for an indefinite period of time, at least one HTTP connection from the maximum number that is allowed between the web server and a desktop of a user. For example, most web applications can be accessed from multiple instances of a web browser or provide some functionality via web browser pop-up windows. Each web browser instance accessing the remote web application will consume at least one connection towards the limit of available connections, and the limit is enforced as an aggregate number across all open web browser instances running on the same desktop. As such, web applications that use low-latency techniques, such as those described in the patent application publication incorporated above, deadlock when several browser windows or tabs are opened, because all connections become blocked and are never released.
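The exhaustion of the aggregate limit can be illustrated with a toy accounting model. The figures used here (a limit of six connections, four tabs, two long-lived GET requests per tab) and the names `SharedConnectionPool` and `open_tabs` are assumptions for illustration only; the model deliberately has no release operation, mirroring long-lived requests that are held indefinitely.

```python
class SharedConnectionPool:
    """Hypothetical aggregate limit shared by every window or tab on one desktop."""

    def __init__(self, limit):
        self.limit = limit
        self.held = 0  # connections currently occupied by long-lived requests

    def try_acquire(self):
        if self.held >= self.limit:
            return False  # a real browser would block the request here
        self.held += 1
        return True


def open_tabs(pool, tabs, polls_per_tab):
    """Each tab tries to park `polls_per_tab` long-lived GETs; returns grants per tab."""
    granted = []
    for _ in range(tabs):
        granted.append(sum(pool.try_acquire() for _ in range(polls_per_tab)))
    return granted


pool = SharedConnectionPool(limit=6)
granted = open_tabs(pool, tabs=4, polls_per_tab=2)

# With every slot held by a long-lived GET, no ordinary request can be sent.
can_send_ordinary_request = pool.try_acquire()
```

Under these assumptions the first three tabs occupy all six connections, the fourth tab obtains none, and any further request from any tab remains blocked for as long as the long-lived requests stay open.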