In many computer networking environments, handling devices receive requests from many different users at remote locations. The number of incoming requests can exceed the capacity of the handling devices. As a result, requests can be handled in a less than optimal manner. For example, requests can take too long to be handled, or some requests may be denied and not handled at all. A common strategy to address this problem is to distribute the incoming requests across a number of servers, each able to serve any request independently. However, it is still desirable to prevent a single user from using more than that user's share of the system's resources. If the requests are distributed evenly across independent devices, it can be difficult to gauge a single user's total impact on the system, because each device sees only a portion of that user's requests. Currently, there are no suitable rate limiting mechanisms for handling requests to such a distributed system without hindering the performance and/or scalability of the system itself.