Some existing systems deploy web servers as virtual machines (VMs) running on a single physical host. The host may have one or a few front-end virtual load balancers that distribute traffic among the web servers so that no single server becomes a performance bottleneck. Operation of the load balancers is affected by surges in traffic load and by changing conditions of the web servers. For example, in a virtual data center, the VMs implementing the web servers are frequently migrated across physical hosts (e.g., to address failures, congestion, or load rebalancing).
Live migration of VMs is a form of migration that enables high availability, transparent mobility, and consolidation. With live migration, running VMs are moved between hosts without disconnecting services executing inside the VMs. With existing systems, however, the load balancers are unaware of the underlying migration states of the VMs. For example, the load balancers treat the VMs undergoing migration no differently from the VMs that are not being migrated. As a result, the load balancers continue to dispatch new connections to the VMs being migrated, which introduces performance challenges into the migration. Exemplary performance challenges include, but are not limited to, an increase in the amount of memory to copy during a pre-copy phase of migration, memory-related bookkeeping, and additional downtime for saving and restoring non-memory states.
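The migration-unaware behavior described above can be illustrated with a minimal sketch. It assumes a simple round-robin policy, which the text does not specify, and every name in it (`WebServerVM`, `migrating`, `dispatch_round_robin`) is hypothetical rather than taken from any particular load balancer.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass
class WebServerVM:
    name: str
    migrating: bool = False  # hypothetical flag; the balancer below never reads it
    connections: int = 0


def dispatch_round_robin(vms, num_connections):
    """Assign new connections in round-robin order, ignoring migration state."""
    rotation = cycle(vms)
    for _ in range(num_connections):
        vm = next(rotation)
        vm.connections += 1  # a migrating VM receives the same share as any other
    return {vm.name: vm.connections for vm in vms}


# Illustrative data only: vm-b is marked as undergoing migration.
vms = [WebServerVM("vm-a"), WebServerVM("vm-b", migrating=True), WebServerVM("vm-c")]
result = dispatch_round_robin(vms, 9)
```

In this sketch the VM marked as migrating ends up with the same number of new connections as its peers, which is the condition that adds load to a migration already in progress.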
Some existing load balancing algorithms track web server status (e.g., server load, connection count, response time, etc.) by, for example, passively probing the web servers. This technique, however, detects performance degradation only after the degradation has occurred. As such, it is difficult for existing load balancers to respond (e.g., by redirecting traffic elsewhere) to the short spikes in resource usage introduced by migration of VMs.