A computer network system has one or more host network servers connected to serve data to one or more client computers over a network. FIG. 1 shows a simple computer network system 20 with a single host network server 22 connected to multiple clients 24(1), 24(2), . . . , 24(N) via a network 26. The clients 24(1)–24(N) send requests for data and/or services to the server 22 over the network 26. For discussion purposes, suppose the server 22 is configured as an Internet service provider, or “ISP”. The ISP server 22 provides an email service 28 that handles electronic mail messages over the Internet 26 and a web service 30 that supports a web site accessible by the clients.
The network 26 is a medium with a predefined bandwidth capacity that is shared among the clients 24(1)–24(N). The network 26 is represented in FIG. 1 as a network pipeline to indicate a finite bandwidth capacity. The network 26 is representative of different network technologies (e.g., Ethernet, satellite, modem-based, etc.) and different configurations, including a LAN (local area network), a WAN (wide area network), and the Internet. The bandwidth capacity depends on the technology and configuration employed. For this example, suppose the network 26 has a total bandwidth capacity of 1,000 kilobits per second (Kb/s). Given this fixed bandwidth, the ISP administrator can allocate portions of the bandwidth for the various services 28 and 30. For instance, the ISP administrator might allocate 400 Kb/s to the email service 28 and 600 Kb/s to the web service 30.
As the clients 24(1)–24(N) access the services 28 and 30, they consume bandwidth on the network 26. The responses from the host server 22 also consume bandwidth. When the allocated bandwidth for a service becomes saturated with client requests and server responses (such as the web service when bandwidth consumption reaches 600 Kb/s), some of the requests are either delayed in transmission or not delivered to the intended destination. Therefore, some form of request throttling mechanism is necessary to minimize network congestion and efficiently utilize the allocated network bandwidth.
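For illustration only, the saturation condition described above can be sketched in a few lines of Python. The figures (1,000 Kb/s total, 400 Kb/s for email, 600 Kb/s for web) come from the example in the text; the names `ALLOCATION_KBPS`, `is_saturated`, and `admit` are hypothetical and do not correspond to any particular server implementation.

```python
# Illustrative sketch only; all names are hypothetical.
# Figures are taken from the example: 1,000 Kb/s total capacity,
# split 400 Kb/s for the email service and 600 Kb/s for the web service.
TOTAL_KBPS = 1000
ALLOCATION_KBPS = {"email": 400, "web": 600}

# The per-service allocations must not exceed the capacity of the network.
assert sum(ALLOCATION_KBPS.values()) <= TOTAL_KBPS


def is_saturated(service, current_kbps):
    """Return True when a service's consumption has reached its allocation."""
    return current_kbps >= ALLOCATION_KBPS[service]


def admit(service, current_kbps, request_kbps):
    """Admit a request only if it fits within the remaining allocation."""
    return current_kbps + request_kbps <= ALLOCATION_KBPS[service]
```

Under these assumptions, the web service is saturated once its consumption reaches 600 Kb/s, and a request that would push consumption past that figure would have to be delayed or rejected.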
In the case of multiple network servers or services executing on a single host computer system and sharing a fixed bandwidth communication link to the network, some network servers can disproportionately allocate this network bandwidth to their tasks, thereby excluding other concurrently executing network servers from performing their requested operations. In this case, the bandwidth throttling must be effected among the plurality of network servers which are concurrently executing on the host computer system.
It is therefore a problem to allocate bandwidth to the network server processes in a manner which enables the maximum number of requests to be served without network congestion and to also avoid impacting other network servers which may be executing on the same host computer system.
There have been many implementations of bandwidth allocation and congestion control schemes to address this problem. U.S. Pat. No. 4,914,650 discloses an integrated voice and data network which includes a multiplexer which functions to connect the host computer system with the network. The multiplexer is equipped with a voice queue for storing voice packets and a data queue for storing data packets. Both the voice packets and the data packets are transmitted uninterrupted for a respective predetermined interval, whose respective durations may be different. Signaling messages which are exchanged among the computer systems via the network preempt the voice and data transmissions to ensure that signaling messages are serviced with very low delay and zero packet loss. In addition, the bandwidth allocated for each type of transmission, if unused, can be momentarily allocated to the other type of transmission to maintain a high level of service.
U.S. Pat. No. 5,313,454 discloses a feedback control system for congestion prevention in a packet switching network. Congestion control is achieved by controlling the transmission rate of bursty traffic when delay sensitive data is present for transmission. The bursty data is relatively insensitive to delay and can be queued for a reasonable period of time. Data indicative of the queue length is broadcast via the network to the destination node where it is processed and a control signal returned to the originating node to regulate the rate of transmission of the bursty data.
U.S. Pat. No. 5,359,320 discloses a scheduling mechanism for a network arbitration circuit in a broadcast network environment. The scheduling mechanism delays the arbitration circuit from seeking access to the network if the network traffic exceeds a first predetermined threshold and the local traffic in the node exceeds a second predetermined threshold. This scheduling mechanism therefore responds to both local and global congestion to throttle the production of new requests.
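The two-threshold condition described for this scheduling mechanism can be sketched as follows. This is a minimal illustration of the stated rule only, not the circuit disclosed in the cited patent; the function name, parameter names, and traffic units are assumptions.

```python
# Illustrative sketch only; names and units are hypothetical.
def should_defer(network_traffic, local_traffic,
                 network_threshold, local_threshold):
    """Defer seeking network access only when BOTH thresholds are exceeded.

    network_traffic / network_threshold: global (network-wide) load.
    local_traffic / local_threshold: load within the local node.
    """
    return (network_traffic > network_threshold
            and local_traffic > local_threshold)
```

Because both conditions must hold, a node with light local traffic is not delayed by a busy network, and a busy node is not delayed while the network has spare capacity.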
U.S. Pat. No. 5,432,787 discloses a packet switching system which appends a parity packet to each predetermined number of data packets. The number of data packets which are transmitted before the parity packet is appended thereto is a function of the network traffic and the measured network error rate.
U.S. Pat. No. 5,477,542 discloses a packet switching network which interconnects a plurality of terminal stations for transmitting video and voice data packets. The terminal stations which are operating in the receive mode transmit control signals to the associated transmitting terminal stations to indicate the amount of delay that the received packets have experienced in traversing the network. If the delay exceeds a predetermined threshold, the video packets are delayed and the voice packets are preferentially transmitted, since the voice packets are more sensitive to transmission delays.
Thus, there are numerous existing network congestion control mechanisms available to regulate the transmission rate of data through a network. However, the common thread in all of these systems is that a single control mechanism is provided to effect the desired congestion control. These control schemes are typically binary in nature, being either active or disabled. There is presently no known hierarchical network congestion control system which differentially responds to various levels of congestion. Furthermore, these congestion control schemes operate without regard for the nature of the processes that are extant on the network servers.
Additionally, FIG. 2 shows an example in which the ISP 22 supports multiple domains 32(1)–32(M) on the same web service 30. For instance, it is not uncommon for an ISP to support thousands of domains on the same web service. To the client, however, each domain functions as its own service as if running on its own HTTP (Hypertext Transfer Protocol) server on its own machine. Hence, the ISP 22 is effectively running multiple “virtual services” on multiple “virtual” HTTP servers, all from the same web service on the same machine. In such cases, network bandwidth control cannot be limited to applying globally to all the virtual servers. The all-or-nothing approach is unacceptable because the administrator often desires to designate some virtual services as more or less critical than others.