This disclosure relates to the field of computer systems. More particularly, a system, apparatus, and methods are provided for setting and observing the statuses of quotas on computing resources.
Computer resources such as memory, storage, communication bandwidth, processor time, and others, are finite, even in the largest data centers and most robust computer systems. Demand exceeds capacity in many data centers and other computing environments, thereby requiring consumers of the resources to share them.
Some techniques for sharing computer resources implement quotas to ensure that individual consumers and/or collections of consumers get their share, but no more than their share. Accurate and fair application of such quotas, however, requires accurate and appropriate identification of the consumers.
For example, in a naïve application of computer resource quotas, consumption or use of a resource may be limited according to a non-specific identifier such as an IP (Internet Protocol) address or a user identifier. There may be a one-to-one correspondence between resource consumers and IP addresses or user identifiers in some environments, but in others multiple consumers may share an IP address (or other identifier), which will unfairly require the multiple consumers to share one quota instead of each receiving its own quota.
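The shared-identifier problem described above can be illustrated with a minimal sketch. The `QuotaTable` class, the limit of three requests, and the address and consumer names are all hypothetical, chosen only for illustration; they are not drawn from any particular system.

```python
from collections import defaultdict


class QuotaTable:
    """Hypothetical quota table: counts requests per key, rejects at the limit."""

    def __init__(self, limit):
        self.limit = limit
        self.used = defaultdict(int)

    def allow(self, key):
        # Deny once this key has consumed its full allowance.
        if self.used[key] >= self.limit:
            return False
        self.used[key] += 1
        return True


# Two hypothetical consumers behind one shared IP address,
# each issuing three requests in alternation.
requests = [("10.0.0.7", "alice"), ("10.0.0.7", "bob")] * 3

by_ip = QuotaTable(limit=3)        # keyed on the non-specific identifier
by_consumer = QuotaTable(limit=3)  # keyed on the individual consumer

ip_results = [by_ip.allow(ip) for ip, _ in requests]
consumer_results = [by_consumer.allow((ip, user)) for ip, user in requests]

print(ip_results)        # the shared quota is exhausted after three requests
print(consumer_results)  # each consumer stays within its own quota
```

When the quota is keyed on the IP address, the two consumers exhaust a single allowance between them; keyed per consumer, each receives the full allowance.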
Further, in some environments, multiple entities that control access to a resource apply quotas in a collective manner. For example, each of the controller entities may report to a central entity regarding usage of the resource by clients. The central entity collects the usage statistics, and disseminates the statistics and/or quota statuses to all controller entities so that they all act in unison to deny access to clients that have exceeded their quotas. In these schemes, there is measurable delay in correctly enforcing a given requester's quota (either to deny requests when the quota is exceeded or to once again accept requests when the quota is no longer exceeded), because of the communication overhead involved in collecting and disseminating quota data. Further, based on the collective information, a given controller entity may be forced to continue to accept and process resource requests even when it is overloaded locally (i.e., because the requesters have not violated their globally applied quotas).
Also, some quota systems are configured to only enforce quotas relating to access to a resource. Such a system may set a maximum rate at which a client may submit queries to a data repository, for example. Enforcing such a quota may help prevent one client from overloading the resource with requests, but may not actually prevent the resource from being overloaded with processing. For example, a client may submit a number of queries that do not trigger the quota, but that require a great deal of processing by the resource—such as accessing a large portion of the stored data in order to satisfy the queries. Thus, a system that only restricts a rate at which a resource may be accessed may be unsuccessful in keeping the resource from being overworked.
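The gap between limiting access and limiting work can be sketched as follows. The `RequestRateQuota` class, the limit of five queries, and the row counts are hypothetical values assumed purely for illustration.

```python
class RequestRateQuota:
    """Hypothetical quota that limits only the number of queries per window."""

    def __init__(self, max_queries):
        self.max_queries = max_queries
        self.count = 0

    def allow(self):
        # Only the query count is checked; query cost is never considered.
        if self.count >= self.max_queries:
            return False
        self.count += 1
        return True


# Hypothetical queries as (name, rows the repository must scan).
queries = [("full_scan_a", 1_000_000), ("full_scan_b", 1_000_000), ("lookup", 10)]

quota = RequestRateQuota(max_queries=5)
admitted = 0
rows_scanned = 0
for _name, rows in queries:
    if quota.allow():
        admitted += 1
        rows_scanned += rows

print(admitted, rows_scanned)
```

All three queries fall under the rate limit, so the quota is never triggered, yet the repository is still asked to scan more than two million rows.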
In addition, rate-limiting quotas are generally hard-coded into the interfaces (e.g., APIs or Application Programming Interfaces) through which a client (e.g., an application, a service, or some other logic module) accesses a resource. This strategy makes it difficult to adjust the quota for a selected client. Moreover, there is generally no way to view, in real time, the status of a given quota regarding a given resource for a given client, to determine how close the client is to exceeding the quota, to see whether and how often the quota was enforced, to compare a current status of the quota to a past status, etc.