As an increasing number of applications and services are made available over networks such as the Internet, an increasing number of content, application, and/or service providers are turning to technologies, such as cloud computing, that enable multiple users to share electronic resources. Access to these electronic resources is often provided through services, such as Web services, where the hardware and/or software used to support those services is dynamically scalable to meet the needs of the services at any given time. A user or customer will typically rent, lease, or otherwise pay for access to resources through the cloud, and thus does not have to purchase and maintain the hardware and/or software needed to provide access to these resources.
In at least some cloud computing environments, certain customers are provided with guaranteed levels of service. A cloud provider must then provide enough electronic resources to support these guarantees. Since it is uncommon for all customers to be using their full guaranteed rates at all times, the resources can be underutilized. Further, certain customers might want to utilize more than their guaranteed levels, which can prevent other customers from reaching their own guarantees, or can at least slow down the system. One conventional approach to solving this problem is to throttle customers when those customers attempt to exceed an allocated rate. While such an approach can guarantee resource availability, throttling can still leave resources underutilized: capacity that is not carrying a full workload could otherwise be used by customers to perform additional work.