Web services and cloud computing are being deployed at an unprecedented pace. New servers are unloaded and installed at data centers every day. Demand for web services and corporate computing comes from all directions. Consumer-oriented services include smartphone apps, mobile applications such as location-based services, turn-by-turn navigation services, e-book services such as Kindle™, video applications such as YouTube™ or Hulu™, music applications such as Pandora™ or iTunes™, Internet television services such as Netflix™, and many other fast-growing consumer Web services. On the corporate front, demand comes from cloud-computing-based services such as Google™ docs, Microsoft™ Office Live and Sharepoint™ software, Salesforce.com™'s on-line software services, tele-presence and web conferencing services, and many other corporate cloud computing services.
As a result, more and more servers are deployed to accommodate the increasing computing needs. Traditionally, these servers are managed by a service gateway such as an Application Delivery Controller or Server Load Balancer (ADC/SLB). ADC/SLBs are typically network appliances in a fixed module or in a chassis, or software modules running in a commoditized server. An ADC/SLB manages the application traffic to servers based on incoming service requests. Common methods of distributing traffic among servers are to distribute the service requests based on the applications (HTTP, FTP, HTTPS, etc.), service addresses such as URLs, or priorities based on network interfaces or host IP addresses. An ADC/SLB may distribute the service requests to a server on the assumption that the server is fully available to handle them. Typically, however, a fully loaded server does not handle service requests well. In fact, most if not all service requests suffer delays or receive no service at all when a server is busy. It is often better not to distribute further service requests to a busy server.
Current ADC/SLBs allow a network administrator to set a maximum service session capacity so that the ADC/SLB does not send more service requests to the server than that maximum. However, a statically configured limit on a server cannot fully utilize the server's capacity, as not all service requests require the same processing from the server. It is beneficial for an ADC/SLB to determine whether a server is busy based on the service response time from the server, so that the ADC/SLB can reduce the number of further service requests it sends to that server.
Therefore, there is a need for a system and method for an ADC/SLB to protect a server from overloading based on dynamic service response time.
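The contrast between a statically configured session cap and a busy determination based on service response time can be illustrated with a minimal sketch. It is not taken from any particular ADC/SLB. The names `ServerState`, `record_response`, `is_busy`, and `eligible_servers`, the moving-average smoothing, and the fixed threshold are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ServerState:
    """Hypothetical per-server bookkeeping kept by a service gateway."""
    name: str
    max_sessions: int          # statically configured session cap
    busy_threshold_s: float    # assumed response-time threshold, in seconds
    active_sessions: int = 0
    avg_response_s: Optional[float] = None  # None until a response is observed

    def record_response(self, elapsed_s: float, alpha: float = 0.5) -> None:
        """Fold one observed response time into a moving average.

        The exponential smoothing used here is an assumption; any other
        response-time measure could be substituted.
        """
        if self.avg_response_s is None:
            self.avg_response_s = elapsed_s
        else:
            self.avg_response_s = (
                alpha * elapsed_s + (1 - alpha) * self.avg_response_s
            )

    def under_static_cap(self) -> bool:
        """Static check: the session count is below the configured maximum."""
        return self.active_sessions < self.max_sessions

    def is_busy(self) -> bool:
        """Dynamic check: the smoothed response time exceeds the threshold."""
        return (
            self.avg_response_s is not None
            and self.avg_response_s > self.busy_threshold_s
        )


def eligible_servers(servers: List[ServerState]) -> List[ServerState]:
    """Return servers that are below their static cap and not judged busy."""
    return [s for s in servers if s.under_static_cap() and not s.is_busy()]


# Two servers with identical static caps and identical session counts.
fast = ServerState("fast", max_sessions=100, busy_threshold_s=0.2, active_sessions=10)
slow = ServerState("slow", max_sessions=100, busy_threshold_s=0.2, active_sessions=10)
fast.record_response(0.05)
slow.record_response(0.9)  # slow responses despite few sessions

chosen = [s.name for s in eligible_servers([fast, slow])]
```

In this sketch both servers pass the static check, yet only `fast` remains eligible, because the observed response time of `slow` exceeds the assumed threshold even though it is far below its session cap.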