Large-scale networked systems are commonplace platforms employed in a variety of settings for running applications and maintaining data for business and operational functions. For instance, a data center (e.g., the physical portion of a cloud-computing network) may provide a variety of services (e.g., web applications, email services, search engine services, etc.) for a plurality of customers simultaneously. These large-scale networked systems typically include a large number of resources distributed throughout the data center, in which each resource is a physical machine or a virtual machine (VM) running on a physical node or host. When the data center hosts multiple tenants (e.g., customer programs), these resources are optimally allocated from the same data center to the different tenants.
Often, multiple VMs will concurrently run on the same physical node within a computing network, or the data center. These VMs that share a common physical node may be allocated to the different tenants and may require different amounts of bandwidth at various times. These networks have limited compute, memory, disk, and network capacity and are typically heavily over-subscribed. When a network is under excessive load (e.g., during a denial-of-service (DoS) attack), data packets being transmitted via the network are dropped. Conventional network equipment (e.g., a load balancer) randomly drops data packets destined for all VMs on the impacted network segment when under load, which degrades performance for all network tenants. In a multitenant environment (e.g., a cloud-computing network), this impacts tenants that are not associated with the increased load. The result is an unfair distribution of network capacity within virtualized environments, where unrelated tenants share the physical resources of a single host and the network.
For example, today, the load balancer may begin randomly dropping data packets without any consideration of the destination or origination of those data packets. That is, the load balancer does not monitor or factor in which VMs are seeing a higher load than others when carrying out the randomized reduction of traffic. In order to more fairly allocate the impact of dropped data packets within the cloud-computing network, the present invention introduces technology for sharing available bandwidth between VMs by reprogramming load balancers within a cloud-computing network to monitor bandwidth usage and probabilistically discard data packets, when necessary, based on the network's overall stress level and/or the amount of traffic an individual VM is experiencing.
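By way of illustration only, the contrast between uniform random dropping and load-aware probabilistic discarding may be sketched as follows. The formula, the function names (drop_probability, should_drop), and the parameters (network_load, capacity, vm_rate, fair_share) are assumptions introduced for this sketch and are not taken from the description above, which does not specify how the discard probability is computed.

```python
import random


def drop_probability(network_load, capacity, vm_rate, fair_share):
    """Hypothetical per-VM discard probability in [0, 1].

    network_load: total traffic offered to the network segment
    capacity:     traffic the segment can carry (same units)
    vm_rate:      traffic currently addressed to one VM
    fair_share:   assumed per-VM bandwidth allotment (must be > 0)
    """
    if fair_share <= 0:
        raise ValueError("fair_share must be positive")
    # No discarding is necessary while the network is not over capacity.
    if network_load <= capacity:
        return 0.0
    # Overall stress level: fraction of offered traffic exceeding capacity.
    stress = (network_load - capacity) / network_load
    # Assumed weighting: scale the stress by how heavily this VM is loaded
    # relative to its fair share, so busier VMs absorb more of the drops.
    return min(1.0, stress * (vm_rate / fair_share))


def should_drop(probability, rng=random.random):
    """Decide one packet's fate; rng returns a float in [0.0, 1.0)."""
    return rng() < probability


if __name__ == "__main__":
    # A segment carrying 150 units against a capacity of 100 units.
    heavy = drop_probability(150, 100, vm_rate=60, fair_share=25)
    light = drop_probability(150, 100, vm_rate=10, fair_share=25)
    print(f"heavily loaded VM: {heavy:.2f}, lightly loaded VM: {light:.2f}")
```

Under these assumed numbers, packets addressed to the heavily loaded VM are discarded far more often than packets addressed to the lightly loaded VM, whereas a uniform random drop would penalize both equally.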