1. Field
This field is generally related to rate limiting.
2. Related Art
Servers provide application programming interfaces (APIs) that are available to program code written by external developers, often over one or more networks such as the Internet. These servers may receive a request, such as a web service request, from a developer application and provide a response to the request. Example services available through network APIs include returning search results and translating between languages.
If developer applications send too many API requests, the API servers may be overwhelmed. For this reason, API service providers generally limit the rate at which a developer application can use an API. Verifying that an application's limit has not been exceeded generally must occur before an API request is processed. Thus, if this verification is not conducted rapidly, the API server's response time may be unacceptably slowed.
To enforce rate limits, many API service providers use token bucket algorithms. As an example, a token bucket algorithm may operate as follows to limit the number of requests processed each second. The token bucket algorithm maintains a bucket that includes between 0 and Y tokens. Every second, the algorithm may add X tokens back to the bucket, capping the number of tokens at Y; X is referred to herein as the replenish rate. Every time a request is made, an attempt is made to remove a token from the bucket. If the bucket contains no tokens, the request is denied; otherwise, a token is removed and the request is allowed. The maximum request rate may be altered by changing the X and Y values. In this way, the token bucket algorithm may limit the number of requests from developer applications over a time period.
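The token bucket operation described above may be sketched as follows. This is a minimal illustration only, not any particular provider's implementation. The names TokenBucket, tick, and try_acquire are hypothetical, the bucket is assumed to start full, and the once-per-second replenishment is modeled as an explicit tick() call rather than a real timer.

```python
class TokenBucket:
    """Illustrative token bucket holding between 0 and Y tokens."""

    def __init__(self, replenish_rate: int, capacity: int) -> None:
        if replenish_rate < 0 or capacity < 0:
            raise ValueError("replenish_rate and capacity must be non-negative")
        self.replenish_rate = replenish_rate  # X: tokens added each second
        self.capacity = capacity              # Y: maximum tokens in the bucket
        self.tokens = capacity                # assumption: bucket starts full

    def tick(self) -> None:
        """Model one elapsed second: add X tokens, capping the total at Y."""
        self.tokens = min(self.capacity, self.tokens + self.replenish_rate)

    def try_acquire(self) -> bool:
        """Try to remove one token; return False (deny) if the bucket is empty."""
        if self.tokens == 0:
            return False
        self.tokens -= 1
        return True


# Usage: X = 2 tokens per second, Y = 3 tokens maximum.
bucket = TokenBucket(replenish_rate=2, capacity=3)
results = [bucket.try_acquire() for _ in range(4)]     # fourth request is denied
bucket.tick()                                          # one second passes
after_tick = [bucket.try_acquire() for _ in range(3)]  # only two tokens were added
```

In this sketch, Y bounds the size of a burst that can be served at once, while X bounds the sustained number of requests per second once the bucket has been drained.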