Several important computer technologies rely, to a great extent, upon rapid delivery of information from a central storage location to remote devices. For example, in the client/server model of computing, one or more servers are used to store information. Client computers or processes are separated from the servers and are connected to the servers using a network. The clients request information from one of the servers by providing a network address of the information. The server locates the information and transmits it over the network to the client, completing the transaction.
The World Wide Web is a popular application of the client/server computing model. A client, such as a computer or a software process such as a browser program, is connected to a global information network called the Internet, either directly or through an intermediary such as an Internet Service Provider, or an online information service. A server is likewise connected to the Internet. The client and server communicate using one or more agreed-upon protocols that specify the format of the information that is communicated. The server has a server name in an agreed-upon format that is indexed at a Domain Name Server (DNS). The client looks up the name of the server at the DNS and establishes a connection to the server using a communication protocol called the Hypertext Transfer Protocol (HTTP). A Uniform Resource Locator (URL) uniquely identifies each page of information stored on the server. A URL is a form of network address that identifies the location of information stored in a network. The logical path that connects a client to a server is called a connection. In practice, a connection is a set of data values that identify a hardware port, buffers, and storage areas that are dedicated to a particular path between client and server. A server can have many logical connections open and active at a given time.
In these and other contexts, a key factor that limits the performance of network communications among devices is the efficiency with which a central server can communicate information to a client. In a networked environment, it is common for different clients to connect to a single server using connections that have different data communication rates. For example, in a particular network or application a server can be connected to a first client by an Ethernet link that operates at 10 megabits per second (Mbps), to a second client by a modem link that operates at 28,800 bits per second (28.8 Kbps), and to a third client by an ISDN link having one or two 64 Kbps channels. In such a case, matching the data communication speed of each connection to the input/output processing speed of a computer system is difficult.
When buffered data communication is used, fast memory mechanisms called buffers are interposed between the connections and the computer system. Each connection fills its buffer with data at a rate proportional to the communication speed of that connection. Ideally, the system removes data from the buffers at the same rate at which the buffers are filled, but in past approaches this has been impossible. A typical computer system can draw data out of the buffers far faster than a connection can fill them, so the computer system draws data from the buffers only periodically. The computer system performs other operations while waiting for the buffers to fill to an extent that makes the processing cost of accessing the buffers worthwhile. Generally, an efficient system removes data from the buffers at a rate that ensures that the buffers never become full. It is highly undesirable for data to sit idle in the buffers.
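To make the mismatch concrete, the following minimal sketch computes how long each kind of link would take to fill a buffer. The link speeds are the ones named in the example above; the 4096-byte buffer size and the function name `fill_time_seconds` are assumptions chosen only for illustration.

```python
def fill_time_seconds(buffer_bytes: int, link_bps: int) -> float:
    """Time for a link running at link_bps bits per second to fill the buffer."""
    return buffer_bytes * 8 / link_bps


# Link speeds taken from the example in the text.
LINK_RATES_BPS = {
    "Ethernet (10 Mbps)": 10_000_000,
    "modem (28.8 Kbps)": 28_800,
    "ISDN, one channel (64 Kbps)": 64_000,
    "ISDN, two channels (128 Kbps)": 128_000,
}

if __name__ == "__main__":
    BUFFER_BYTES = 4096  # hypothetical buffer size, not taken from the text
    for name, bps in LINK_RATES_BPS.items():
        millis = fill_time_seconds(BUFFER_BYTES, bps) * 1000
        print(f"{name}: {millis:.1f} ms to fill {BUFFER_BYTES} bytes")
```

Under these assumptions the fill times differ by more than two orders of magnitude between the Ethernet link and the modem link, which is why no single fixed schedule for draining the buffers suits every connection.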
A number of past approaches have addressed this problem. In one prior approach, the server polls each of the connections one after another at a slow rate to minimize overhead. “Polling” means to examine the connection or a buffer associated with it, determine whether a packet of data has arrived or needs to be sent, and communicate the packet of data. Generally polling is carried out 5 to 100 times per second. The time interval that separates polls is fixed in two ways. First, the server always moves from one connection to the next in the same time interval. Second, the time between successive polls to the same connection is the same. The server uses a slow poll rate in order to reduce the total number of poll operations that are carried out. This approach provides low overhead, but adds latency and causes fast connections to suffer poor performance.
A second approach is to poll all connections quickly. This works well for fast connections, but imposes high overhead for slow connections. It also limits the overall number of connections that a single server or machine can manage. In particular, it is relatively expensive in terms of processing steps to check a connection that has no data. The check causes wasted processing steps or overhead.
The first and second approaches share a significant limitation. When the data connections have widely varying data communication rates, fast connections are not polled often enough, and slow connections are polled too often. This limitation is difficult to address because, generally, there is no way for the server to determine the data communication speed of a connection before the connection is established, or before data communications have actually occurred over the connection.
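The trade-off shared by the two fixed-rate approaches can be sketched with a small simulation of one connection polled at a constant interval. This is an illustrative model only: the function name `simulate_fixed_polling`, the 4096-bit (512-byte) packet size, and the assumption that data arrives at a steady rate are not taken from the text.

```python
def simulate_fixed_polling(rate_bps, poll_interval_ms, polls, packet_bits=4096):
    """Poll one connection at a fixed interval and tally the outcome.

    Returns (empty_polls, packets_delivered, max_backlog_bits).
    A poll counts as empty when no complete packet is waiting.
    """
    bits_per_poll = rate_bps * poll_interval_ms // 1000
    buffered = 0
    empty_polls = 0
    delivered = 0
    max_backlog = 0
    for _ in range(polls):
        buffered += bits_per_poll  # data that arrived since the last poll
        max_backlog = max(max_backlog, buffered)
        whole_packets = buffered // packet_bits
        if whole_packets == 0:
            empty_polls += 1  # the check found nothing to communicate
        else:
            delivered += whole_packets
            buffered -= whole_packets * packet_bits
    return empty_polls, delivered, max_backlog


if __name__ == "__main__":
    # Slow polling (5 polls per second) of a 10 Mbps Ethernet link for one second.
    print("Ethernet, 200 ms interval:", simulate_fixed_polling(10_000_000, 200, 5))
    # Fast polling (100 polls per second) of a 28.8 Kbps modem link for one second.
    print("modem, 10 ms interval:", simulate_fixed_polling(28_800, 10, 100))
```

In this model, slow polling leaves roughly 250,000 bytes waiting on the Ethernet link before each poll, while fast polling of the modem link finds no complete packet on 93 of 100 polls. Neither fixed interval suits both connections.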
Another approach is to respond to each data packet as it arrives. An interrupt can be generated when data packets arrive and the server can respond to the interrupt. This approach is highly responsive, but in the World Wide Web context it is impractical, because in a single HTTP connection there are typically many packets. When each packet arrives, an interrupt is generated, and the system must save its current state, call an interrupt handler, process the packet, and return to the original state. This imposes very high overhead.
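A rough count shows why per-packet interrupts are costly. The sketch below compares the number of interrupts with the number of fixed-interval polls needed for the same transfer. The transfer size, packet size, and poll interval are hypothetical values chosen for illustration, and the function names are not from the text.

```python
import math


def interrupt_count(transfer_bytes: int, packet_bytes: int) -> int:
    """One interrupt per arriving packet (the last packet may be partial)."""
    return (transfer_bytes + packet_bytes - 1) // packet_bytes


def poll_count(transfer_bytes: int, link_bps: int, poll_interval_s: float) -> int:
    """Polls needed to cover the time the transfer occupies the link."""
    duration_s = transfer_bytes * 8 / link_bps
    return math.ceil(duration_s / poll_interval_s)


if __name__ == "__main__":
    TRANSFER_BYTES = 1_048_576  # hypothetical 1 MiB response
    PACKET_BYTES = 512          # hypothetical packet size
    print("interrupts:", interrupt_count(TRANSFER_BYTES, PACKET_BYTES))
    print("polls at 100 ms:", poll_count(TRANSFER_BYTES, 10_000_000, 0.1))
```

Each interrupt also carries the cost of saving state, calling the handler, and restoring state, so the gap in total overhead is wider than the raw counts suggest.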
Thus, there is a need for a way to adjust the time interval between successive polls of a connection so that the time interval closely matches an ideal value that is related to the actual bandwidth of the connection.
There is also a need to provide a way for the server to adapt its polling behavior to each data connection among numerous connections that have widely differing data communication rates.