When an application is in an overload situation, the service levels of all users of that application are typically not met. Examples of applications that accept requests and may enter an overload situation due to a high rate of requests include database applications, web server applications, financial transaction applications, enterprise applications, business intelligence applications, etc. These types of applications often have service levels associated therewith, such as response time. The service levels may be specified in service level agreements (SLAs), and a service provider hosting the applications, for example, may be subject to penalties if the service levels are not met, as may happen whenever these applications are in an overload situation.
Request throttling is a common mechanism used to mitigate application overload situations by limiting the rate at which new requests are accepted. Request throttling can be effective; however, there are several cases where throttling only newly received requests may lead to a long recovery from an overload situation. For example, long-running requests that are already executing take a long time to finish, and for as long as they are executing, the overload situation persists. This reduces the efficiency of throttling-based mechanisms, because their effects are not observed until a sufficient number of long-running requests complete.
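The limitation described above can be illustrated with a small simulation. The following is a minimal sketch, not an implementation taken from any particular system. It assumes a discrete-time model in which each in-flight request needs a fixed number of ticks of work, the application is considered overloaded while the number of in-flight requests exceeds a capacity, and the throttle admits a fixed number of new requests per tick. The function name, parameters, and workload values are hypothetical and chosen only for illustration.

```python
def ticks_until_recovered(remaining, capacity, admit_per_tick,
                          new_duration, max_ticks=10_000):
    """Count ticks until in-flight requests drop to `capacity` or below.

    remaining      -- remaining work (in ticks) of each in-flight request
    capacity       -- in-flight count above which the app is overloaded
    admit_per_tick -- new requests the throttle admits per tick (assumed fixed)
    new_duration   -- work (in ticks) needed by each newly admitted request
    Returns None if recovery does not happen within max_ticks.
    """
    remaining = list(remaining)
    for tick in range(max_ticks + 1):
        if len(remaining) <= capacity:
            return tick
        # Every in-flight request completes one tick of work;
        # requests with no work left are dropped.
        remaining = [r - 1 for r in remaining if r > 1]
        # The throttle only controls how many NEW requests are accepted.
        remaining.extend([new_duration] * admit_per_tick)
    return None


# Hypothetical workload: 8 long-running requests (100 ticks each)
# plus 4 short ones (2 ticks each), with a capacity of 6.
long_mix = [100] * 8 + [2] * 4

# Admitting 2 short requests per tick: recovery takes 100 ticks.
print(ticks_until_recovered(long_mix, 6, admit_per_tick=2, new_duration=1))

# Admitting nothing at all: recovery still takes 100 ticks.
print(ticks_until_recovered(long_mix, 6, admit_per_tick=0, new_duration=1))

# Same in-flight count but only short requests: recovery takes 2 ticks.
print(ticks_until_recovered([2] * 12, 6, admit_per_tick=0, new_duration=1))
```

Under these assumptions, tightening the throttle from two admitted requests per tick to none does not shorten the recovery, because the eight long-running requests alone keep the in-flight count above the capacity until they complete. When the same number of in-flight requests are all short, the overload clears almost immediately.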