As an ever-increasing number of applications and resources are provided electronically, typically over networks such as the Internet, there is a corresponding increase in the number, types, and sources of requests received by various content providers. In many cases, different types of users will access similar resources provided by a common content provider. When one of these users sends an excessive number of queries, or queries that are very computationally expensive, the performance of the system providing access to the resources can be degraded for other users.
Conventional systems attempt to minimize the impact that one user can have on other users of a resource by throttling the number of requests that a user can submit over a specified period of time. In some cases, a user can get around this limit by running multiple instances. Even if a user cannot get around this limit, the queries submitted might be very computationally expensive, such that the user may be abusing the system even when the user is within the allowed number of requests.
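The conventional throttling described above can be illustrated with a minimal sketch of a fixed-window request counter. This is only one possible form of such a limit, not a description of any particular system. The class name `FixedWindowThrottle`, the client identifiers, and the cost units are all hypothetical and chosen for illustration.

```python
from collections import defaultdict


class FixedWindowThrottle:
    """Allow at most `limit` requests per client in each fixed time window.

    Hypothetical illustration of a static, count-based limit.
    """

    def __init__(self, limit, window_seconds):
        self.limit = limit
        self.window_seconds = window_seconds
        # Maps (client_id, window_index) -> requests admitted in that window.
        self.counts = defaultdict(int)

    def allow(self, client_id, now):
        """Return True if the request is admitted, False if it is throttled."""
        window_index = int(now // self.window_seconds)
        key = (client_id, window_index)
        if self.counts[key] >= self.limit:
            return False
        self.counts[key] += 1
        return True


throttle = FixedWindowThrottle(limit=3, window_seconds=60)

# Four requests from one client in the same window: the fourth is rejected.
decisions = [throttle.allow("client-a", now=t) for t in (0, 10, 20, 30)]

# A second instance presenting a different identifier (assumed here to be
# counted separately) is admitted even though the same user is behind it.
second_instance_allowed = throttle.allow("client-a-2", now=30)

# The counter ignores query cost: all three queries are within the limit,
# including one that is assumed to cost 500 units.
query_costs = [1, 1, 500]
admitted_cost = sum(
    cost
    for t, cost in enumerate(query_costs)
    if throttle.allow("client-b", now=t)
)
```

In this sketch the limit depends only on how many requests arrive, so both shortcomings noted above are visible: a second instance is throttled independently, and a single expensive query is admitted as readily as an inexpensive one.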
Some conventional systems introduce queues of differing priority to provide levels of processing, and attempt to apply rules and policies to the received requests. Such an approach introduces latency to every request, however, and typically still relies upon static limits and determinations.