1. Field of the Invention
The present invention relates generally to world-wide networks, and more particularly to the global Internet and Internet World Wide Web (WWW) sites of various owners that are hosted by a service provider using a cluster of servers that are intended to meet established service levels.
2. Description of the Prior Art
The Internet is the world's largest network, and it has become essential to academia and many small, medium and large businesses, as well as to individual consumers. Many businesses have started outsourcing their application (business and commerce) processing to service providers instead of running and maintaining application software on their own server(s). These service providers are known as Application Service Providers (ASPs). Each ASP installs a collection of servers (termed a server farm) which can be used to run many different business applications for various customers. These customers (i.e., the service provider's “customers,” who are often called “hosted” customers) have different “workload” requirements for their applications. The ASPs must ensure that their server farms can handle the various workload requirements of their hosted customers' applications.
When businesses outsource their business applications to a service provider, they typically obtain a guarantee on the services they will receive from the service provider for their applications. Once the service provider makes a commitment to a customer to provide a certain “level” of service (e.g., a Service Level Agreement (SLA)), the provider must guarantee that level of service to that customer. The incoming traffic (e.g., Internet Protocol (IP) packets) from the service provider's customers to a server farm can be classified into various classes/types by examining the packet destination address and the Transmission Control Protocol (TCP) port number. A general SLA on an application workload to a server farm can be denoted by a pair of TCP connection rates: the minimum TCP connection rate Nmin(i,j) and the maximum TCP connection rate Nmax(i,j) for the ith customer's jth application. The minimum (or min) TCP connection rate Nmin(i,j) is the TCP connection rate at which the ith customer's jth application is guaranteed to be supported by the server farm, regardless of the server farm's usage by other customers' applications. In other words, the service provider guarantees that TCP connection requests associated with a given customer for a given application will be admitted to the server farm as long as Nmin(i,j) is not exceeded. The maximum (or max) TCP connection rate Nmax(i,j) is an upper bound on the TCP connection rate at which the ith customer's jth application may be supported by the server farm, provided that some additional “sharable capacity” allocated for handling the jth application is available. Such sharable capacity may be available because some “excess” capacity has been allocated for the jth application by the server farm operator and/or because some “unused capacity” is available when some customers' jth applications are not using their allocated minimum TCP connection capacities.
Therefore, the range between Nmin(i,j) and Nmax(i,j) represents the TCP connections that are supported on a “best-effort” basis, and it is not necessarily guaranteed that a customer's TCP connection request will be admitted beyond the guaranteed minimum Nmin(i,j). Generally, the unit cost charged per TCP connection beyond the minimum Nmin(i,j) is more than the unit cost charged per TCP connection below Nmin(i,j). The unit cost assigned to one customer may differ from the unit costs assigned to other customers.
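The classification and admission rules described above can be illustrated with a minimal sketch. It is not taken from any cited product or application; the names Sla, classify and admit, the per-interval connection count, and the sharable_capacity argument are all hypothetical and are introduced only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class Sla:
    """Hypothetical SLA record for the ith customer's jth application."""
    n_min: int  # Nmin(i,j): guaranteed TCP connections per interval
    n_max: int  # Nmax(i,j): best-effort upper bound per interval


def classify(dest_addr: str, dest_port: int,
             table: Dict[Tuple[str, int], Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Map a packet's destination address and TCP port to (customer i, application j).

    The lookup table is an assumption; returns None for unknown traffic.
    """
    return table.get((dest_addr, dest_port))


def admit(admitted_so_far: int, sla: Sla, sharable_capacity: int) -> bool:
    """Decide whether one more TCP connection request is admitted in this interval.

    admitted_so_far   -- connections already admitted for (i, j) in the interval
    sharable_capacity -- excess or unused connections currently available for
                         application j (assumed to be tracked elsewhere)
    """
    next_count = admitted_so_far + 1
    if next_count <= sla.n_min:
        return True   # within the guaranteed minimum Nmin(i,j)
    if next_count <= sla.n_max and sharable_capacity >= 1:
        return True   # best-effort range, sharable capacity available
    return False      # beyond Nmax(i,j), or no sharable capacity left
```

For example, with Sla(n_min=100, n_max=150), the 100th request in an interval is always admitted, the 101st is admitted only if sharable capacity is available, and the 151st is never admitted.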
Some commercial products (e.g., the Access Point (AP) products from Lucent/Xedia (www.xedia.com), and the Speed-Class products from PhaseCom (www.speed-demon.com)) can be used to “shape” the inbound traffic (admitted bits per second into a server farm) to meet the (minimum, maximum) bandwidth usage-based SLA for each customer and for each customer's application. Unfortunately, however, the number of bits coming into the server farm does not necessarily represent the workload requirements as represented by the number of TCP connection requests. U.S. patent application Ser. Nos. 09/506,603 and 09/543,207, commonly assigned with the present invention, teach systems and methods for meeting outbound bandwidth usage-based SLA's by regulating inbound traffic to a server farm. However, those systems do not address the problem of how to support (Nmin(i,j), Nmax(i,j)) TCP connection request-based SLA's.
Accordingly, what is needed is a system and method for meeting SLA's for application workloads to a server farm based on TCP connection requests, as opposed to meeting SLA's based on the number of bits coming into the server farm.