In recent years, web services have become a popular technology for companies to expose service APIs (“application programming interfaces”) to integrators and developers, who build products that use those APIs over the Internet. This opens up opportunities for outside developers to innovate with and within the company's services, which benefits both the customers and the company. For example, a wireless carrier can provide messaging web services for developers. The developers can then build messaging-related products, which encourage more message usage among subscribers. The customers benefit from a new product, and the company benefits from the increased usage and from updated features that it did not have to spend time and resources to develop.
A web services client is typically a software program that makes API calls to the web services servers. Unlike the regular World Wide Web, where users interact with web servers through manual clicks in a web browser, a web services client can submit multiple requests to the web services servers simultaneously and continuously. Because the capacity of the web services servers is limited, too many simultaneous requests from a client may overload the system. An overloaded system responds slowly to requests, or may stop functioning altogether. This poses a challenge for the web services servers: how to ensure quality of service to the majority of clients when a few clients are sending too many requests? One method is throttling.
Throttling is a mechanism that limits the number of requests to a web service within a specified time interval, by either refusing or delaying requests, in order to provide better quality of service. Examples of throttling policies include concurrent, idle, request, and volume. A concurrent throttling policy limits the number of requests in progress at any one time. An idle throttling policy imposes a minimum idle time between requests. A request throttling policy limits the number of requests per period. A volume throttling policy limits the volume of data, for example in kilobytes, sent per period. However, these mechanisms are typically implemented on a server-by-server basis within a single group of servers (or “farm”). Thus, each server in the farm can reach its throttling limit when inundated with many requests from the same client. This affects the farm as a whole and slows the responses from all servers involved.
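The four policies, and the limitation of enforcing them one server at a time, can be illustrated with a minimal sketch. The names used here (ThrottlePolicy, ClientState, ServerThrottle, simulate_farm) and all numeric limits are hypothetical and chosen only for illustration. The sketch refuses requests rather than delaying them, and it assumes a simple fixed counting period and round-robin distribution of requests across the farm.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ThrottlePolicy:
    max_concurrent: int    # concurrent policy: requests in progress at once
    min_idle_s: float      # idle policy: minimum seconds between requests
    max_requests: int      # request policy: requests allowed per period
    max_kilobytes: float   # volume policy: kilobytes allowed per period
    period_s: float        # length of the counting period in seconds


@dataclass
class ClientState:
    window_start: float
    in_flight: int = 0
    last_request_at: Optional[float] = None
    requests_in_window: int = 0
    kilobytes_in_window: float = 0.0


class ServerThrottle:
    """Throttle state kept by ONE server; it sees only its own traffic."""

    def __init__(self, policy: ThrottlePolicy) -> None:
        self.policy = policy
        self.clients: Dict[str, ClientState] = {}

    def try_begin(self, client_id: str, now: float, kilobytes: float) -> Optional[str]:
        """Return None if the request is admitted, else the violated policy name."""
        p = self.policy
        s = self.clients.setdefault(client_id, ClientState(window_start=now))
        # Start a new counting period once the current one has elapsed.
        if now - s.window_start >= p.period_s:
            s.window_start = now
            s.requests_in_window = 0
            s.kilobytes_in_window = 0.0
        if s.in_flight >= p.max_concurrent:
            return "concurrent"
        if s.last_request_at is not None and now - s.last_request_at < p.min_idle_s:
            return "idle"
        if s.requests_in_window + 1 > p.max_requests:
            return "request"
        if s.kilobytes_in_window + kilobytes > p.max_kilobytes:
            return "volume"
        # Admitted: record the request against every counter.
        s.in_flight += 1
        s.last_request_at = now
        s.requests_in_window += 1
        s.kilobytes_in_window += kilobytes
        return None

    def end(self, client_id: str) -> None:
        """Mark one admitted request from this client as finished."""
        self.clients[client_id].in_flight -= 1


def simulate_farm(num_servers: int, num_requests: int, policy: ThrottlePolicy) -> int:
    """Send requests from one client round-robin across the farm; count admissions."""
    farm = [ServerThrottle(policy) for _ in range(num_servers)]
    admitted = 0
    for i in range(num_requests):
        server = farm[i % num_servers]
        if server.try_begin("client-a", now=float(i), kilobytes=1.0) is None:
            admitted += 1
            server.end("client-a")  # each request completes immediately here
    return admitted


if __name__ == "__main__":
    policy = ThrottlePolicy(max_concurrent=100, min_idle_s=0.0,
                            max_requests=5, max_kilobytes=1000.0, period_s=60.0)
    print(simulate_farm(num_servers=3, num_requests=18, policy=policy))
```

With a request limit of 5 per period and three servers that do not share counters, the same client has 15 of its 18 requests admitted in one period, because each server counts only the requests it receives itself. This is one way to read the limitation described above: a per-server limit does not bound the load that a single client places on the farm as a whole.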
Hence a need exists for the web services servers to communicate with one another to learn the total number of requests across all the servers, so that the request limit for each client can be enforced and the servers in a farm are less likely, or not at all, to be adversely impacted by a large number of requests.