The Internet is the world's largest network, and it has become essential to organizations such as governments, academic institutions, and small, medium, and large businesses, as well as to individual consumers. Many businesses have started out-sourcing their electronic business (“e-business”) and electronic commerce (“e-commerce”) Web sites to service providers instead of running the Web sites on their own server(s) and managing the sites themselves.
Such a service provider needs to install a collection of servers (termed a Web Server Farm (WSF), Universal Server Farm (USF), or Web Server Cluster) that can be used by many different businesses to support their e-commerce and e-business. These business customers (i.e., the service provider's “customers”) have different “capacity” requirements for their Web sites. The users of e-commerce (consumers, business partners, etc.) access the Web Server Farm by first connecting to the Internet. The Web Server Farm (WSF) is connected to the Internet via high speed communications links such as T3 and OCx links. These links are shared by all the Web sites and by all the users who are accessing the services hosted by the Web Server Farm.
When businesses out-source their e-commerce and/or e-business to a service provider, they need some guarantee of the service they will receive from the service provider for their sites. Once the service provider makes a commitment to a customer to provide a certain “level” of service (i.e., a Service Level Agreement (SLA)), the provider must guarantee that level of service to that customer.
A general service level agreement (SLA) on communications link bandwidth usage for a customer can be denoted by a pair of bandwidth constraints: the minimum guaranteed bandwidth B(i, min) and the maximum bandwidth bound B(i, max) for each i-th customer.
The minimum (or min) bandwidth B(i, min) is a guaranteed bandwidth that the i-th customer will receive regardless of the bandwidth usage by other customers. The maximum (or max) bandwidth B(i, max) is an upper bound on the bandwidth that the i-th customer may receive provided that some unused bandwidth is available (e.g., bandwidth not being currently used by other customers).
Therefore, the range between B(i, min) and B(i, max) represents bandwidth provided on an “available” or “best-effort” basis to a customer, and it is not necessarily guaranteed that the customer will obtain this bandwidth.
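For illustration only, the relationship between the guaranteed and best-effort portions can be sketched as a simple allocation routine. The sketch below is not taken from any particular system; the names `Sla` and `allocate`, the equal sharing of unused bandwidth, and the assumption that the sum of all B(i, min) values does not exceed the link capacity are hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Sla:
    b_min: float  # guaranteed bandwidth B(i, min), e.g., in Mbps
    b_max: float  # upper bound B(i, max), e.g., in Mbps


def allocate(capacity, slas, demands):
    """Return the bandwidth granted to each customer.

    Assumes sum of all b_min values <= capacity (hypothetical precondition).
    """
    # Step 1: every customer receives its demand up to the guaranteed minimum.
    grants = [min(d, s.b_min) for s, d in zip(slas, demands)]
    spare = capacity - sum(grants)

    # Step 2: unused bandwidth is shared equally, on a best-effort basis,
    # among customers that still want more, never exceeding B(i, max).
    while spare > 1e-9:
        want = [min(d, s.b_max) - g for s, d, g in zip(slas, demands, grants)]
        active = [i for i, w in enumerate(want) if w > 1e-9]
        if not active:
            break
        share = spare / len(active)
        for i in active:
            extra = min(share, want[i])
            grants[i] += extra
            spare -= extra
    return grants
```

In this sketch, a customer whose demand exceeds B(i, max) is capped at B(i, max) even when spare bandwidth remains, and a customer whose demand falls between the two bounds receives the excess only if other customers leave it unused.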
Generally, the unit cost to use the bandwidth up to B(i, min) is less than or equal to the unit cost to use the bandwidth between B(i, min) and B(i, max). Such a unit cost assigned to one customer may differ from those assigned to other customers.
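As a hedged illustration of such two-tier pricing, the following sketch computes a charge from a measured bandwidth usage. The function name `usage_cost`, the parameter names, and the decision to ignore usage above B(i, max) are assumptions made for the example, not details taken from any actual agreement.

```python
def usage_cost(usage, b_min, b_max, unit_cost_guaranteed, unit_cost_best_effort):
    """Charge for `usage` under a (B(i, min), B(i, max)) agreement.

    Usage up to b_min is billed at the guaranteed rate; usage between
    b_min and b_max is billed at the best-effort rate. Usage above
    b_max is not billed here because the SLA does not permit it.
    """
    billable = min(usage, b_max)
    guaranteed_part = min(billable, b_min)
    best_effort_part = max(0.0, billable - b_min)
    return (guaranteed_part * unit_cost_guaranteed
            + best_effort_part * unit_cost_best_effort)
```

Because the unit costs are passed per call, different customers may be given different rates, consistent with the description above.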
In the environment of Web site hosting, where the communications link(s) between the Internet and a server farm are shared by a number of customers (i.e., traffic to and from the customer Web sites shares the communications link(s)), bandwidth management on the outbound link (i.e., the link from the server farm to the Internet) is more important than bandwidth management on the inbound link, since the amount of traffic on the outbound link is several orders of magnitude greater than that on the inbound link.
Furthermore, in most cases, the inbound traffic to the server farm is directly responsible for the outbound traffic generated by the server farm. Therefore, the service level agreements (B(i, min), B(i, max)) are generally applied on the outbound link bandwidth usage.
A conventional method for leaky bucket traffic shaping uses fair queuing collision arbitration. This method uses a set of queues and a virtual finishing time for scheduling Asynchronous Transfer Mode (ATM) cell traffic. The method could be implemented to regulate packet traffic so as to enforce the minimum bandwidth SLA, B(i, min), for each customer.
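The general idea of scheduling by virtual finishing time can be sketched as follows. This is a simplified, generic weighted fair queuing example and not the conventional method itself: it assumes that all packets are already queued at time zero (so the virtual clock never advances), and the name `fair_queue_order` and the use of a per-customer weight (which could, for example, be set from B(i, min)) are hypothetical.

```python
import heapq


def fair_queue_order(packets, weights):
    """Return packets in the order a finish-time scheduler would send them.

    packets: list of (customer, size) tuples in arrival order.
    weights: dict mapping customer -> relative share of the link.
    Simplification: every packet is backlogged at time 0.
    """
    last_finish = {customer: 0.0 for customer in weights}
    heap = []
    for seq, (customer, size) in enumerate(packets):
        # A packet finishes size/weight after the previous packet
        # of the same customer finishes.
        finish = last_finish[customer] + size / weights[customer]
        last_finish[customer] = finish
        # seq breaks ties in arrival order.
        heapq.heappush(heap, (finish, seq, customer, size))

    order = []
    while heap:
        _, _, customer, size = heapq.heappop(heap)
        order.append((customer, size))
    return order
```

With equal weights, a customer that has queued a single packet behind another customer's burst is served early rather than after the whole burst, which is the property that allows a minimum share to be enforced.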
Another conventional method for shaping traffic in a packet-switched network uses a set of queues and a conformance time to shape packet traffic. The method can be used to regulate packet traffic so as to enforce the minimum bandwidth SLA, B(i, min), for each customer.
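A minimal sketch of shaping by conformance time, for a single queue, is given below. It is a generic illustration under stated assumptions rather than the conventional method itself: the function name `conformance_times` is hypothetical, the rate is fixed, and no burst allowance is modeled.

```python
def conformance_times(arrivals, sizes_bits, rate_bps):
    """Return the earliest time each packet may be sent at `rate_bps`.

    arrivals: packet arrival times in seconds, in non-decreasing order.
    sizes_bits: packet sizes in bits, same length as `arrivals`.
    """
    times = []
    next_free = 0.0  # time at which the shaper may release the next packet
    for arrival, size in zip(arrivals, sizes_bits):
        # A packet conforms once it has arrived and the previous
        # packet's share of the rate has elapsed.
        start = max(arrival, next_free)
        times.append(start)
        next_free = start + size / rate_bps
    return times
```

A packet that arrives before its conformance time is held in the queue rather than sent, so the long-run output rate of the queue does not exceed the configured rate.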
Some commercial products (e.g., Xedia Access Point (www.Xedia.com) and Phasecom's SpeedClass (www.speed-demon.com)) can be used to regulate the outbound traffic directly so as to meet the (minimum, maximum) bandwidth SLA for each customer.
However, although conventional systems and methods such as those mentioned above could reasonably be applied to enforce service level agreements (SLAs) on the outbound link usage of each customer (and of each customer traffic class or type), they differ in coverage: some support only the minimum bandwidth SLA, whereas others support the (minimum, maximum) bandwidth SLA.
A major problem of these systems and methods is that they enforce the outbound bandwidth SLA by throttling (i.e., dropping some of) the traffic already generated by specific source (IP) addresses.
Packets must be dropped when the outbound link has been congested for a sustained period of time so that some queues run out of space, and/or when a particular customer's outbound link usage has exceeded the maximum bandwidth SLA and its queue(s) have filled up. That is, when a queue has filled up, packets arriving at that queue are randomly dropped. Because each response consists of many packets, the packet dropping affects many responses from the server farm. The dropping of packets triggers many Transmission Control Protocol (TCP) retransmissions, which lead to still further retransmission of packets for error recovery, causing thrashing and eventually slowing down or disconnecting many connections.
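The reason a modest packet drop rate can disturb a large share of responses can be seen from a short calculation. The sketch below assumes, purely for illustration, that each packet is dropped independently with the same probability; the function name `fraction_of_responses_hit` is hypothetical.

```python
def fraction_of_responses_hit(drop_prob, packets_per_response):
    """Probability that a response loses at least one packet.

    Assumes independent drops with probability `drop_prob` per packet.
    """
    # A response is untouched only if every one of its packets survives.
    return 1.0 - (1.0 - drop_prob) ** packets_per_response
```

Under this assumption, dropping only 1% of packets would be expected to affect roughly 63% of responses that are 100 packets long, and each affected response then requires at least one retransmission.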
Another problem with dropping outbound packets is that server resources are wasted in generating responses that cannot be delivered to end users (i.e., servers are kept busy generating undeliverable responses when they could have been used to generate responses for customers whose outbound bandwidth usage is below the minimum guarantee).
Yet another problem is that classification of outbound traffic is limited, since an outbound packet does not indicate the type of request for which it was generated. This limits the degree to which differentiated services can be applied in controlling bandwidth usage.