In a cloud environment, many resources (e.g., servers and data repositories) are coupled together. A cloud service may be described as providing functionality via an application executing in the cloud environment (e.g., on a server in the cloud environment).
In such a cloud environment, cloud service providers rely on economies of scale, such as shared servers, shared services, many customers, and many users, to drive profits. Customers typically expect providers to deliver a high quality of service and to minimize downtime and loss of functionality. In many cases, downtime is related to resource contention, Denial of Service (DoS) attacks, poorly performing application code, and server crashes.
Cloud service providers may monitor and throttle requests, using API management techniques, so that the cloud service provider controls the flow of requests to the cloud service being provided. Throttling requests may include slowing down requests, redirecting requests, or stopping requests. For example, a cloud service provider may monitor the number of requests received within a time interval and use that count to throttle requests in order to maintain resource availability, avoid crashes, and maintain a consistent service level. The monitored number of requests is compared against a limit set per service (global), per client application, per user (e.g., by Internet Protocol (IP) address or security principal), and/or per organization.
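The comparison of a monitored request count against limits at several scopes can be sketched as follows. This is a minimal illustration only: the scope names, the limit values, and the function `exceeded_scopes` are hypothetical and are not taken from any particular API management product. The counts are assumed to be reset by the caller at the start of each time interval.

```python
from collections import Counter

# Hypothetical per-interval limits for each scope (illustrative values only).
LIMITS = {
    "global": 1000,
    "client_app": 200,
    "ip": 30,
    "principal": 50,
    "organization": 500,
}

def exceeded_scopes(request, counts, limits=LIMITS):
    """Count one request at every scope and return the scopes now over limit.

    `request` maps scope names to identifiers (e.g., {"ip": "10.0.0.1"}).
    A scope missing from the request, such as "global", shares one counter.
    """
    exceeded = []
    for scope, limit in limits.items():
        key = (scope, request.get(scope, "*"))
        counts[key] += 1
        if counts[key] > limit:
            exceeded.append(scope)
    return exceeded

# Usage: one Counter per time interval; the caller decides whether a
# non-empty result means slowing, redirecting, or stopping the request.
counts = Counter()
over = exceeded_scopes({"ip": "10.0.0.1", "organization": "acme"}, counts)
```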
The most common form of throttling involves the cloud service provider monitoring the IP address from which a client application's or user's Hypertext Transfer Protocol (HTTP) requests originate. If the number of requests from one IP address exceeds a threshold, subsequent requests from the same IP address are throttled. The threshold may be expressed as a number of requests allowed per unit time (e.g., 30 requests per second).
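For illustration, per-IP throttling against a fixed threshold might look like the following fixed-window counter. This is a sketch under stated assumptions, not a description of any provider's implementation: the class name `PerIpThrottle`, its parameters, and the choice of a fixed (rather than sliding) window are all hypothetical. The default of 30 requests per one-second window mirrors the example threshold above.

```python
import time

class PerIpThrottle:
    """Fixed-window request counter keyed by client IP address (hypothetical)."""

    def __init__(self, max_requests=30, window_seconds=1.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the logic can be tested
        self._windows = {}          # ip -> (window_start, request_count)

    def allow(self, ip):
        """Return True if this request is within the threshold for `ip`."""
        now = self.clock()
        start, count = self._windows.get(ip, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has elapsed; start a new one.
            start, count = now, 0
        count += 1
        self._windows[ip] = (start, count)
        return count <= self.max_requests

# Usage: a request handler would call allow() before doing any work.
throttle = PerIpThrottle()
if not throttle.allow("203.0.113.7"):
    pass  # slow down, redirect, or reject the request
```

Note that the threshold in such a scheme is a constant chosen in advance, independent of the server's actual load at the time of the request.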
Another technique for throttling requires a separate infrastructure that queries servers for their health and throttles based on the results. However, such queries burden the servers by requiring them to calculate their available resources for each query. If the frequency of such queries is low, route management may be ineffective for real-time scenarios. If the frequency of such queries is high, the queries themselves could bring down a server that is already under pressure.
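The trade-off in polling frequency can be made concrete with a small sketch. The `HealthPoller` class and the `query_health` callback below are hypothetical stand-ins for the separate monitoring infrastructure and the server's health endpoint; the sketch assumes the poller caches the last answer between queries.

```python
import time

class HealthPoller:
    """Caches a server's health status and re-queries it at a fixed interval."""

    def __init__(self, query_health, interval_seconds, clock=time.monotonic):
        self.query_health = query_health      # hypothetical call to the server
        self.interval_seconds = interval_seconds
        self.clock = clock
        self.queries_sent = 0                 # load imposed on the server
        self._last_checked = None
        self._healthy = True

    def is_healthy(self):
        """Return the cached status, refreshing it only when the interval has elapsed."""
        now = self.clock()
        if self._last_checked is None or now - self._last_checked >= self.interval_seconds:
            self._healthy = self.query_health()
            self._last_checked = now
            self.queries_sent += 1
        return self._healthy
```

With a long interval, `queries_sent` stays small but `is_healthy()` may keep returning a stale status after the server has degraded. With a short interval, the status is fresh but each refresh costs the server a resource calculation.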
With the changing landscape of computing and infrastructure, cloud service providers are forced to constantly change and update the threshold to match the changing capabilities of the provided APIs. These changes make it difficult to set a threshold up front, since the cloud service provider's capabilities and parameters (e.g., computing power, efficiency, workloads, and total number of customers) keep changing.