1. Background and Relevant Art
Computer systems and related technology affect many aspects of society. Indeed, the computer system's ability to process information has transformed the way we live and work. Computer systems now commonly perform a host of tasks (e.g., word processing, scheduling, accounting, etc.) that prior to the advent of the computer system were performed manually. More recently, computer systems have been coupled to one another and to other electronic devices to form both wired and wireless computer networks over which the computer systems and other electronic devices can transfer electronic data. Accordingly, the performance of many computing tasks is distributed across a number of different computer systems and/or a number of different computing environments.
In some environments, computer systems operate in a cloud computing environment. In cloud computing environments, a cloud-service provider uses a common underlying physical network to host multiple customers' applications, sometimes referred to as “tenants”. A tenant can have a set of virtual machines (“VMs”) or application processes that is independently deployable and is solely owned by a single customer (i.e., a subscription). Reachability isolation can be used to mitigate direct interference between tenants. However, reachability isolation alone is not sufficient, since a malicious or careless tenant can still interfere with other tenants in the network data plane by exchanging heavy traffic only among its own members (VMs).
Accordingly, other techniques can be used to attempt to isolate the performance of tenants. Some techniques have relied on Transmission Control Protocol (“TCP”) congestion control. However, a tenant can achieve essentially unbounded utilization of a network by using many TCP flows (connections) and by using variations of TCP. Tenants can also use other protocols, such as User Datagram Protocol (“UDP”), that do not perform congestion control.
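The effect of opening many flows can be illustrated with a minimal sketch. It assumes an idealized model in which every flow crossing a shared bottleneck receives an equal share of the link, which is only an approximation of real TCP behavior. The function name tenant_share, the tenant labels, and the flow counts are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def tenant_share(flows_by_tenant):
    """Return each tenant's fraction of a shared bottleneck link.

    Assumption (idealized): bandwidth is split equally per flow, so a
    tenant's share is proportional to the number of flows it opens.
    """
    total_flows = sum(flows_by_tenant.values())
    return {
        tenant: Fraction(flows, total_flows)
        for tenant, flows in flows_by_tenant.items()
    }


# Hypothetical scenario: tenant A opens 1 flow, tenant B opens 99 flows.
shares = tenant_share({"A": 1, "B": 99})
print(shares["A"], shares["B"])  # 1/100 99/100
```

Under this assumption, nothing bounds the share of the tenant that opens more flows, which is why per-flow congestion control does not by itself isolate tenants from one another.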
In addition, a provider cannot necessarily trust tenant networking stacks.
Further, conventional in-network Quality of Service (“QoS”) mechanisms (e.g., separate queues with Weighted Fair Queuing (“WFQ”)) do not scale. These QoS mechanisms are also complicated and expensive to use for differentiating performance when tenants frequently join and leave. Statically throttling each VM on the sender side is both inefficient and ineffective: it wastes any unused capacity, and, given a sufficient number of VMs, a tenant can still cause performance interference at virtually any static rate applied to each VM.
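The two shortcomings of static per-VM throttling can be seen in a small arithmetic sketch. The function name static_cap_outcome, the link capacity, and the per-VM cap are hypothetical values chosen for illustration, and the model assumes every active VM sends at exactly its cap over a single shared link.

```python
def static_cap_outcome(link_capacity_mbps, per_vm_cap_mbps, active_vms):
    """Compare offered load under a static per-VM cap with link capacity.

    Assumption: each active VM sends at exactly its static cap, and all
    VMs share one link of the given capacity.
    """
    offered = per_vm_cap_mbps * active_vms
    carried = min(offered, link_capacity_mbps)
    return {
        "offered_mbps": offered,
        "idle_mbps": link_capacity_mbps - carried,
        "oversubscribed": offered > link_capacity_mbps,
    }


# Few active VMs: most of the link sits idle even if the VMs want more.
print(static_cap_outcome(1000, 100, 2))
# Many VMs: the same cap no longer keeps the tenant within the link.
print(static_cap_outcome(1000, 100, 20))
```

With two active VMs, 800 Mbps of the hypothetical 1000 Mbps link goes unused. With twenty VMs, the offered load is twice the link capacity, so the cap does not prevent interference.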
Accordingly, in cloud computing environments, due at least in part to one or more of these factors, it can be difficult to regulate network traffic in a way that reliably prevents disproportionate bandwidth consumption.