The present invention relates to communication in a client-server computer network and, in particular, to techniques for reducing user-perceived latency when a user requests a resource from a server.
Communications between computers in a network environment are handled by a common protocol. For example, with reference to FIG. 1, World Wide Web clients 110 and servers 120 on the Internet communicate utilizing the hypertext transfer protocol or HTTP. Underlying HTTP, communications are established and requests and responses are sent using connections based on the transmission control protocol or TCP. TCP is part of the TCP/IP protocol suite, which allows computers of different sizes running completely different operating systems to communicate with each other. TCP is a connection-oriented protocol: in other words, a connection must be established between two computers before data can be transferred. TCP uses a three-way handshake in order to initiate a connection: the client 110 sends a segment of data 201 requesting a connection to the server; the server 120 responds to the request 202; and the client must acknowledge the server's response 203 before a connection is established, as set forth in FIG. 2. Likewise, a formal procedure is required to terminate an open connection between the client and the server.
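As a purely illustrative sketch of the handshake cost described above, the following Python fragment times how long it takes to open a TCP connection. The function names `timed_connect` and `demo_on_loopback` are hypothetical and are not part of the described system; the demonstration uses a loopback listener, so the measured time is far smaller than it would be across the Internet.

```python
import socket
import time


def timed_connect(host, port, timeout=5.0):
    """Open a TCP connection and return (socket, seconds spent connecting).

    socket.create_connection returns only after the operating system has
    completed the three-way handshake with the server.
    """
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    return sock, time.perf_counter() - start


def demo_on_loopback():
    """Measure connection setup against a throwaway local listener."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    host, port = listener.getsockname()
    client, elapsed = timed_connect(host, port)
    server_side, _ = listener.accept()
    # Closing the sockets triggers TCP's separate termination procedure.
    client.close()
    server_side.close()
    listener.close()
    return elapsed
```

Against a remote server, the value returned by `timed_connect` would be roughly one round-trip time, since the client can proceed once the server's response to its connection request arrives.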
It is important to reduce delays from the time a resource is requested in a computer network to the time it is received. Such delays are referred to as user-perceived latency. For example, when a user issues an HTTP request over the Internet, there is a delay before the user receives the requested resource (e.g. a Web page). This delay has several components: (1) the round-trip time (RTT) required to establish the TCP connection; (2) the TCP “slow start”; and (3) the transmission time. The TCP “slow start” refers to the aspect of TCP whereby data is initially transferred at a low transmission rate and the transmission rate is doubled each round trip until bandwidth limits are reached (i.e. packets are lost or the connection speed is saturated) and the transmission rate stabilizes. The connection-setup RTT, the TCP slow start period, and the transmission time all contribute latency which the user experiences as a delay before receiving the requested resource. When the TCP connection is closed after each requested resource is received, these delays are incurred again on every request, which further reduces service quality.
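The three components above can be made concrete with a rough model. The sketch below is an assumption-laden illustration and not a description of any actual TCP implementation: it assumes the sending window starts at `initial_window` segments and doubles every round trip, and it ignores packet loss, delayed acknowledgments, and the round trip needed to send the request itself. The function and parameter names are hypothetical.

```python
def slow_start_rounds(segments, initial_window=1):
    """Round trips needed to send `segments` when the window doubles each round."""
    rounds = 0
    window = initial_window
    remaining = segments
    while remaining > 0:
        remaining -= window  # one window's worth of segments per round trip
        window *= 2          # idealized slow start: double the window
        rounds += 1
    return rounds


def latency_components(rtt_s, segments, segment_bytes, bandwidth_bytes_per_s):
    """Return a rough per-component breakdown of the delay, in seconds."""
    return {
        # (1) one round trip to establish the TCP connection
        "connection_setup": rtt_s,
        # (2) round trips spent ramping up during slow start
        "slow_start": slow_start_rounds(segments) * rtt_s,
        # (3) time to push the bytes through the link at full rate
        "transmission": segments * segment_bytes / bandwidth_bytes_per_s,
    }
```

Under these assumptions, a resource of 10 segments sent from an initial window of one segment needs four round trips (1, 2, 4, and 8 segments), so with a 100 ms RTT the slow start term alone is about 0.4 seconds.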
Several techniques have been devised to address user-perceived latency. For example, two techniques form an integral part of the new HTTP 1.1 protocol (the techniques can also be deployed with the older HTTP 1.0 protocol): (1) persistent connections between the client and server and (2) pipelining of requests over a connection. A “persistent” connection is one that is kept open for subsequent imminent requests (this is optional in HTTP 1.0 and the default in HTTP 1.1). Pipelining refers to sending several requests over one connection without waiting for each response. Although persistent connections and pipelining can reduce the latencies that accrue from TCP connection establishment and slow start on subsequent consecutive requests to a server, the user will still experience these latencies on the first request issued to the server.
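A persistent connection can be illustrated with Python's standard library. The sketch below is a hypothetical demonstration, not part of the described techniques: it starts a small local HTTP 1.1 server (the `EchoHandler` class and `fetch_over_one_connection` function are invented for this example) and issues several requests over a single client connection, recording the client's local port to show that the same TCP connection is reused.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Replies to GET with the requested path as the body."""

    protocol_version = "HTTP/1.1"  # persistent connections by default

    def do_GET(self):
        body = self.path.encode("ascii")
        self.send_response(200)
        # Content-Length lets the client find the end of the response
        # without the server having to close the connection.
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demonstration quiet


def fetch_over_one_connection(paths):
    """Fetch each path over one connection; return (bodies, local ports used)."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    conn = http.client.HTTPConnection(
        "127.0.0.1", server.server_address[1], timeout=5
    )
    bodies, local_ports = [], set()
    try:
        for path in paths:
            conn.request("GET", path)
            response = conn.getresponse()
            bodies.append(response.read().decode("ascii"))
            # The same local port on every request means the same connection.
            local_ports.add(conn.sock.getsockname()[1])
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return bodies, local_ports
```

Only the first request on `conn` pays for connection establishment; later requests reuse the open connection, which is the saving that persistent connections provide. The first request to a new server still incurs the full cost.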
Another suggested solution has been the prefetching of documents. A prediction is made of what resources the user is likely to request, and the resources are prefetched before the user actually initiates a request for them. The main downside to prefetching resources is the bandwidth requirement. Excessive prefetching may overload the user's link to the Internet, which is often low bandwidth, e.g. a telephone line. It may also overload the network and reduce the overall service quality.
Another technique has been the caching of documents at the user's Web browser or at a cache shared by many users (i.e. proxy caching), a technique limited by the size and content of the cache. Moreover, caching does not address the connection-setup RTT delay, since a validation of the contents of the cache is still required. Another proposed technique has been reducing the TCP slow start by using a higher initial transmission rate. Although such an aggressive implementation of TCP can save on slow-start-induced delay, it can also cause user data to be lost on saturated routes.
The present invention relates to a novel technique for reducing user-perceived latency due to the time required to establish a connection to a server in a network. In accordance with the present invention, an open connection is established to each of a set of servers, there being some probability that the user will contact one of the servers in the near future. This is referred to as preconnecting or prefetching the connection. For example, in the context of a Web client-server network, a list of likely servers can be deduced from links on the Web page the user is currently viewing or from a more sophisticated analysis of the user's browsing habits. When the user requests a resource from one of the identified servers, the network connection has already been established, thereby reducing latency and improving service quality, especially for higher bandwidth clients for whom the delay is most noticeable. In contrast to conventional document prefetching, preconnecting does not consume substantial network bandwidth or any cache space, and hence can be used with much less scrutiny. Moreover, the technique can be implemented in Web browsers without protocol modifications or changes to Web server code.
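One possible way to picture preconnecting is sketched below in Python. This is a minimal illustration under stated assumptions, not the claimed implementation: the names `candidate_servers` and `Preconnector` are hypothetical, the list of likely servers is taken simply from the `href` attributes of anchor tags on the current page, only plain `http` links are considered, and no attempt is made to rank servers or to expire idle connections.

```python
import socket
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class _LinkCollector(HTMLParser):
    """Collects href values from anchor tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.extend(v for k, v in attrs if k == "href" and v)


def candidate_servers(html, base_url):
    """Return the distinct (host, port) pairs linked from a page, in order."""
    collector = _LinkCollector()
    collector.feed(html)
    servers = []
    for href in collector.hrefs:
        parts = urlsplit(urljoin(base_url, href))
        if parts.scheme != "http" or not parts.hostname:
            continue
        server = (parts.hostname, parts.port or 80)
        if server not in servers:
            servers.append(server)
    return servers


class Preconnector:
    """Opens connections ahead of time and hands them out on request."""

    def __init__(self, timeout=5.0):
        self.timeout = timeout
        self._open = {}  # (host, port) -> connected socket

    def preconnect(self, servers):
        for server in servers:
            if server in self._open:
                continue
            try:
                self._open[server] = socket.create_connection(
                    server, timeout=self.timeout
                )
            except OSError:
                pass  # best effort: a failed preconnect costs the user nothing

    def take(self, server):
        """Return (socket, was_preconnected) for the given server."""
        sock = self._open.pop(server, None)
        if sock is not None:
            return sock, True
        return socket.create_connection(server, timeout=self.timeout), False

    def close(self):
        for sock in self._open.values():
            sock.close()
        self._open.clear()
```

In this sketch a browser would call `preconnect(candidate_servers(page, url))` once a page has loaded and call `take` when the user follows a link; if the connection was preconnected, the handshake delay has already been paid. Only connection setup traffic is exchanged in advance, so no document data is transferred and no cache space is used.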